The CSTools (v4.0) Toolbox: from Climate Forecasts to Climate Forecast Information

Despite the wealth of existing climate forecast data, only a small part is effectively exploited for sectoral applications. A major cause of this is the lack of integrated tools that allow the translation of data into useful and skilful climate information. This barrier is addressed through the development of an R package. CSTools is an easy-to-use toolbox designed and built to assess and improve the quality of climate forecasts for seasonal to multi-annual scales. The package contains process-based, state-of-the-art methods for forecast calibration, bias correction, statistical and stochastic downscaling, optimal forecast combination and multivariate verification, as well as basic and advanced tools to obtain tailored products. Because the toolbox is designed as a set of individual functions, users can develop their own post-processing chain of functions, as shown in the use cases presented in this manuscript: the analysis of an extreme wind speed event, the generation of seasonal forecasts of snow depth based on the SNOWPACK model, and the post-processing of data to be used as input for the SCHEME hydrological model.

climate variability and to guide well-informed decision making. In fact, there is a strong need and interest in a wide range of application sectors for reliable seasonal to decadal forecasts (White et al., 2017). Moreover, the generation of products adjusted to user needs tends to be costly in time and resources. Software tools facilitate this process by sharing common and state-of-the-art methods with actors in the climate community and beyond (National Academies of Sciences, Engineering and Medicine, 2016).
In this context, a Climate Services Toolbox (CSTools) has been developed to address these needs. The toolbox was designed to include functions for each of the main post-processing steps of seasonal forecasts, but the methods are also suitable for subseasonal and decadal predictions. These forecasts are typically generated by running a forecast system several times using perturbations of the initial conditions and model physics (ECMWF, 2017). Each simulation is then considered a member of the ensemble. Such ensembles are generated to account for initial-condition and model uncertainty, to make probabilistic statements about the most likely atmospheric state (ECMWF, 2017) and to inform sensitivity studies. These similarities in setup among forecasts of different time horizons generally lead to common requirements in their post-processing steps (Palmer et al., 2008). However, additional post-processing steps are required to translate the simulations into climate information.
CSTools is targeted primarily at applied climate scientists and climate services developers who require high-quality climate data (e.g. high-resolution data obtained by applying downscaling methods). These users can handle the tool by themselves, understanding each of the methodologies thanks to the provided documentation and with the support of scientific research publications. The tool is fully transparent since it is open-source, allowing the user to control, understand and even adapt every step of the analysis in depth. While simple examples are given in the package documentation, this manuscript aims to showcase the usefulness of CSTools in the context of advanced, state-of-the-art use cases.

From climate data to climate information
There are different forecast post-processing steps necessary to translate climate data into climate information. These steps will vary depending on the application, but usually fall within the following categories (as illustrated in Fig. 1):
• Data retrieval and formatting: Optimal methods for spatial and temporal data manipulation, such as interpolation methods, are needed given the wide range of climate data formats. This can be a labour-intensive step when trying to combine multiple datasets, such as observations and forecasts from multiple systems.
• Correction methods for forecast calibration: Calibration is necessary to correct systematic errors, uncover any predictive signal and adjust forecasts to the observational statistical properties so that they can be integrated into impact models. These biases originate from the approximate representation of unresolved climate processes in the forecast systems (Marcos, 2016; Van Schaeybroeck and Vannitsem, 2019; Manzanas et al., 2019).
• Classification methods for multi-model forecast combination or scenario selection: Combining multiple forecasting systems substantially enlarges the diversity of potential weather situations (Hemri et al., 2020), partially compensates errors and increases consistency and reliability. Scenario selection, on the other hand, may often be useful for communication and information synthesis for specific applications (Ferranti and Corti, 2011).
• Downscaling: Climate forecast systems, due to computational limitations, typically provide global seasonal-to-decadal forecasts at a horizontal resolution of ~100 km. Users, however, require information at a finer scale. As such, statistical and stochastic downscaling techniques are commonly used to perform realistic transformations from large to small scales (Widmann, 2018; Ramon et al., 2021).
https://doi.org/10.5194/gmd-2021-368 Preprint. Discussion started: 6 December 2021. © Author(s) 2021. CC BY 4.0 License.
• Skill assessment: Estimating the quality of the predictions is essential to understand the limitations of the simulations, to improve the current forecast systems and to provide useful forecast products tailored to several sectors (Merryfield et al., 2020). Skill estimates should be provided together with the forecast products to allow a correct interpretation of the forecasts or of the added value of a system with respect to a benchmark.
• Visualization: From the climate services perspective, visualization tools are essential to illustrate different aspects of deterministic or probabilistic climate information.
Methods for each of these post-processing steps are provided within CSTools. Several software packages are already available to analyse different types of climate data. For instance, the Earth System Model Validation Tool (ESMValTool; Eyring et al., 2016b; Eyring et al., 2020; Righi et al., 2020) was designed to facilitate the analysis of climate projections produced in the context of the Coupled Model Intercomparison Project (CMIP; Eyring et al., 2016a). The R packages s2dverification (Manubens et al., 2018), SpecsVerification (Siegert, 2017) and easyVerification (MeteoSwiss, 2017) and the Python package climpred (Brady and Spring, 2021) focus on skill assessment of ensemble forecasts. Climate4R (Iturbide et al., 2019) is an R-based framework for climate data post-processing including different methods. The main purpose of these packages is the facilitation of research. CSTools, on the other hand, targets scientists interested in providing a climate product to final users. CSTools can nonetheless be useful to research scientists, as it is compatible with some of the aforementioned R packages.
For a detailed description of CSTools functions and parameters, the reference manual is attached to the package and available at https://CRAN.R-project.org/package=CSTools in the standardized format of R package documentation. In this manuscript, an overview of the methods and documentation gathered in CSTools is presented in Sect. 2, while the creation of a tailored dataset is shown in Sect. 3. Three case studies based on the analysis of an extreme wind speed event, the snow model SNOWPACK (Lehning et al., 2002a, b; https://models.slf.ch/p/snowpack/) and data preparation for the hydrological model SCHEME (Baguis et al., 2010) show the usefulness of the toolbox. Section 4 concludes this paper and discusses some future developments for the package.

CSTools: overview
CSTools was created as part of a collaborative effort between six European institutions. Given the total number of contributors and collaborators (31 in version 4.0), compiling all methods into a software package using the R statistical programming language (R Core Team, 2017) was considered the most suitable and versatile option. Creating an R package allows the inclusion of multiple tools, ranging from complex statistical and climatological methods to visualization tools, in the same framework. Moreover, CSTools is open-source, thus allowing users and developers alike to benefit from lower costs and from software flexibility, quality and reliability (Information Resources Management Association, 2013). At the same time, CSTools can be integrated into other software in order to take advantage of its functionalities, as does, for instance, the S2S4E Decision Support Tool (https://s2s4e-dst.bsc.es).
CSTools was developed following common guidelines agreed upon by all contributors, including conventions for adding new functionalities, and taking into account software development best practices such as the use of a version control system (i.e. git; Chacon and Straub, 2014) and testing with continuous integration. The use of these development guidelines has resulted in a clean and homogeneous application programming interface (API).
Most functionalities exposed to the users can be invoked and applied to complex user datasets with a single function call. For example, to apply a given functionality named "Func", the user would call the corresponding "CST_Func" function. The "CST_*" family of functions ingests and returns objects of type "s2dv_cube" (see details in Sect. 2.1), thus allowing compatibility between functions and long post-processing chains to be created.

Technical aspects of CSTools
The CSTools development guidelines have been designed to maximise compatibility with other libraries such as s2dverification, s2dv and startR, all of them designed to operate fundamentally on multi-dimensional arrays with named dimensions. Because of this design, the CSTools user is able to perform basic array inquiry on the "data" element of the "s2dv_cube" objects at any point in the workflow, in order to check the dimensions of the data or to find the number of members, start dates or forecast lead times analysed. Internally, each of these high-level "CST_*" functions performs two nested calls to two other different but closely related functions in the package. At the most fundamental level of this nested call structure, there is a call to a basic function (e.g. ".Func") that is designed to work with the least complex data structure possible (be it a single vector, a couple of vectors, an array and a vector, …). At the second level is a call to a wrapper function (e.g. "Func") around the basic function, which leverages the multiApply package (BSC-CNS et al., 2019) to extend the computation of ".Func" to inputs with any number of dimensions. The top-level "CST_*" function is an additional wrapper function which adapts the second-level array-based function to work with "s2dv_cube" objects.
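The three-level structure described above can be sketched in base R. The names ".Func", "Func" and "CST_Func" are illustrative placeholders, not part of the CSTools API, and base::apply stands in for the multiApply-based wrapper used in the real package:

```r
# Level 1: basic function, works on a single vector (here: remove its mean)
.Func <- function(x) x - mean(x)

# Level 2: array wrapper, applies .Func along the "member" dimension of any
# array with named dimensions
Func <- function(data, along = "member") {
  margins <- which(names(dim(data)) != along)
  apply(data, margins, .Func)
}

# Level 3: "CST_*"-style wrapper, operates on an "s2dv_cube"-like list
CST_Func <- function(cube, ...) {
  cube$data <- Func(cube$data, ...)
  cube
}

# Toy "s2dv_cube"-like object: 3 members x 4 lead times
cube <- list(data = array(rnorm(12), dim = c(member = 3, ftime = 4)),
             lon = 1:2, lat = 1:2)
res <- CST_Func(cube)   # each lead time's member values now have zero mean
```

The middle level is what makes the functions reusable outside CSTools: any array with a suitably named dimension can be passed to "Func" directly.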

This nested structure has a number of benefits:
• The CSTools user code, using top-level functions, is modular, concise, readable and easy to maintain.
• The R community can easily employ it via the array-compatible low-level functions.
• Multi-core parallelism is straightforward to exploit via middle-level functions and high-level "CST_*" functions.
• In cases where the data to process are larger than the RAM available on the workstation, or the computation is very expensive, the low-level functions can be used together with the startR package (BSC-CNS and Manubens, 2020) to leverage HPC platforms and distribute the workload in small chunks.
The modular aspect of the "CST_*" functions makes it straightforward for users to create their own post-processing workflows, as shown in Sect. 3. Metadata is propagated and expanded all along the workflows.

Methods in CSTools
Given that the methods included in CSTools are split into functions, users can concatenate them to define their own post-processing workflow. This design provides flexibility, allowing users to assess the impact of the various post-processing steps by modifying the chain of functions. Users can also select a single function and apply it outside of the CSTools workflow. The functions included in the package cover fundamental loading and transformation requirements, downscaling tools, methods for correcting and evaluating forecasts, and advanced visualization tools (see Table 1). All functions are documented in a standard reference manual on the CRAN website (https://CRAN.R-project.org/package=CSTools). The documentation also includes vignettes, which are self-contained pieces of documentation combining code, text and images, describing some of the methodologies included in CSTools, as well as information on how to use the package to conduct specific analyses.
Table 1. Summary of the functions and methods by category. The prefix "CST_" refers to functions working on a specific object class called "s2dv_cube". An asterisk indicates functions that are used in vignettes.

CSTools builds on the experience gained from the development of other R packages for climate data analysis, such as s2dverification. Specifically, CSTools has adopted its "s2dv_cube" object as a central data structure to represent and carry data and metadata across function calls. An "s2dv_cube" object is essentially an R list which includes a multi-dimensional array of climate data, originating from either observations or forecasts, and metadata such as the time period or region covered, the dataset and variable names, and the units. The development guidelines define conventions to ensure "s2dv_cube" objects are used in a coherent way throughout the package.
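For illustration, a minimal "s2dv_cube"-like object can be assembled as a plain R list; the element names used here are indicative only, and the exact structure is documented in the CSTools reference manual:

```r
# Minimal sketch of an "s2dv_cube"-like object: a multi-dimensional array
# with named dimensions plus metadata. Element names are illustrative.
cube <- list(
  data = array(rnorm(2 * 3 * 4 * 5 * 6 * 7),
               dim = c(dataset = 2, member = 3, sdate = 4,
                       ftime = 5, lat = 6, lon = 7)),
  lat = seq(40, 45),
  lon = seq(0, 6),
  Variable = list(varName = "tas", units = "K"),
  Dates = list(start = as.Date("1993-11-01") + 0:4)
)

# Named dimensions allow basic array inquiry at any point in a workflow:
n_members <- dim(cube$data)[["member"]]   # 3
```

This is the kind of inquiry referred to in Sect. 2.1 when checking the number of members, start dates or lead times of a dataset.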
CSTools has a single but powerful function to retrieve data from NetCDF files, called CST_Load. This function is a wrapper of the s2dverification Load function, which allows reading monthly or daily data from a set of specified forecast datasets together with date-corresponding observations (Manubens et al., 2018). It makes use of the Climate Data Operators software (CDO; Schulzweida, 2019) to automatically interpolate all the data onto a common grid. Three samples of "s2dv_cube" objects created using CST_Load are provided along with the package: area_average, with forecast and observational climate data averaged over a region; and lonlat_data and lonlat_prec, containing forecast and observational climate data for temperature and precipitation.

Although datasets can be retrieved from OPeNDAP URLs with NetCDF files, in general the datasets have to be downloaded onto a local repository and formatted to comply with the CST_Load requirements. Observational reference datasets are stored in a folder in separate monthly NetCDF files, while seasonal forecast data must follow the CST_Load path conventions (other formats are also possible; see https://earth.bsc.es/gitlab/es/s2dverification/-/blob/master/vignettes/data_retrieval.md for more information). For users who retrieve data by other means (e.g. using the library ncdf4; Pierce, 2019), the CSTools package contains two functions to convert data to an "s2dv_cube" object. If the data and metadata have been loaded in separate objects, they can be merged into an "s2dv_cube" object with the function s2dv_cube. On the other hand, if the data and metadata have been loaded into a single object, it can be transformed into the "s2dv_cube" class with the as.s2dv_cube function.
One of the capabilities of CSTools is to create a new dataset after, for example, the data have been downscaled and/or calibrated. In that case, the user may need to save the new dataset into files to be shared with other users or communities. Therefore, the package comes with a saving function called CST_SaveExp, which creates NetCDF files in a directory set by the user and which can be loaded again with the CST_Load function. Moreover, the essential climatological step of computing anomalies can be carried out with CST_Anomaly, a wrapper function of s2dverification methods that also allows computing smoothed climatologies.

The functions CST_MergeDims and CST_SplitDims provide additional flexibility to manipulate "s2dv_cube" objects. For instance, it is commonly required to split the time dimension of annual data into two dimensions, one identifying the season and the other the month within that season. Conversely, some advanced classification methods may need to merge the latitudinal and longitudinal coordinates into a single dimension.
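The dimension manipulations behind these two functions can be sketched with base R on plain arrays (the real functions operate on "s2dv_cube" objects and handle the associated metadata as well):

```r
# Split a "time" dimension of 24 monthly values into c(month = 12, year = 2),
# in the spirit of CST_SplitDims
x <- array(1:24, dim = c(time = 24))
split_x <- array(x, dim = c(month = 12, year = 2))

# Merge lat and lon into a single spatial dimension, as some classification
# methods require, in the spirit of CST_MergeDims
y <- array(rnorm(6 * 4), dim = c(lat = 6, lon = 4))
merged_y <- array(y, dim = c(space = 6 * 4))
```

Because R arrays are stored in column-major order, both operations are pure reshapes: no values are moved, only the dimension attribute changes.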

Classification methods
Classification methods are widely used in climatology to summarize the climatological conditions captured by observations or simulations. As early as Sokal (1966), sophisticated univariate and multivariate climatic classification systems were being generated from enormous databases (Balling, 1984). However, the functions included in CSTools for this purpose are modern methods adapted to observations, reanalyses and climate model outputs with multiple ensemble members.

The CST_MultiEOFs function allows conducting Empirical Orthogonal Function (EOF) analysis simultaneously over multiple variables. Based on singular value decomposition, the EOF analysis is applied over the region of interest (for example, the Mediterranean region) in order to define, for each of the N variables chosen, a reduced phase space based on the leading modes of variability. A simultaneous analysis of these fields is then carried out with a (multivariate) EOF analysis in the subspace spanned by the leading EOFs of each field. This produces an N-variable EOF picture of the variability in the region. The associated principal components can represent multi-variable indices that can be used to verify the forecast.
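The core idea of a multivariate EOF analysis can be illustrated with a conceptual base-R sketch: each variable is standardized, the fields are concatenated in space, and a singular value decomposition yields joint modes. This is not the CSTools implementation, only the underlying technique:

```r
set.seed(1)
nt <- 30; np <- 20                      # time steps, grid points per variable
var1 <- matrix(rnorm(nt * np), nt, np)  # e.g. temperature anomalies
var2 <- matrix(rnorm(nt * np), nt, np)  # e.g. precipitation anomalies

standardize <- function(m) scale(m, center = TRUE, scale = TRUE)
joint <- cbind(standardize(var1), standardize(var2))  # time x (2 * space)

s <- svd(joint)
pcs  <- s$u %*% diag(s$d)       # principal components (joint mode time series)
eofs <- s$v                     # joint spatial patterns across both variables
expl_var <- s$d^2 / sum(s$d^2)  # fraction of variance explained by each mode
```

The leading columns of pcs are the multi-variable indices mentioned above; truncating to the first few columns defines the reduced phase space.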
CST_WeatherRegimes and CST_RegimesAssign are complementary functions to derive weather regimes (Torralba, 2021). The first function computes a set of weather regimes using a cluster analysis. The dimensionality of the input can be reduced by using the principal components obtained from an EOF analysis to filter the dataset, while the cluster analysis can be performed with traditional k-means or hierarchical clustering. CST_RegimesAssign, in turn, matches anomalies to a set of reference maps obtained with CST_WeatherRegimes. The anomalies are assigned to the most similar reference map using either the minimum Euclidean distance or the highest spatial correlation, which can be particularly useful to classify the predictions according to the clusters identified in the observational reference.
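A conceptual sketch of the two complementary steps, using plain matrices instead of "s2dv_cube" objects: derive regimes with k-means (as CST_WeatherRegimes does) and assign new anomaly maps to the closest regime by minimum Euclidean distance (as CST_RegimesAssign does):

```r
set.seed(42)
anom <- matrix(rnorm(200 * 10), nrow = 200)   # 200 days x 10 grid points
k <- 4
km <- kmeans(anom, centers = k, nstart = 25)  # cluster analysis
regimes <- km$centers                         # k reference regime maps

# Assign a map to the regime with minimum Euclidean distance
assign_regime <- function(map, centers) {
  d <- apply(centers, 1, function(cen) sqrt(sum((map - cen)^2)))
  which.min(d)
}
cluster_id <- assign_regime(anom[1, ], regimes)
```

In the real functions the anomalies would typically be filtered through leading PCs first, and spatial correlation is available as an alternative similarity measure.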

CST_CategoricalEnsCombination converts a multi-model ensemble forecast into a categorical forecast by giving the probability of each category. Different methods are available to combine the different ensemble forecasting models into probabilistic categorical forecasts. The number of categories can be changed; the category boundaries are taken as the climatological quantiles (e.g. terciles) extracted from the observational data. The available methods are: "pool" for ensemble pooling, where all ensemble members of all forecast systems are weighted equally; "comb" for a model combination, where each model system is weighted equally; and "mmw" for model weighting. The model weighting method is described in Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and Vannitsem (2019). More specifically, this method uses different weights for the occurrence probability predicted by the available models and by a climatological model, and optimizes the weights by minimizing the ignorance score.
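As a minimal sketch of the "pool" approach, category probabilities can be obtained as the fraction of pooled ensemble members falling in each climatological tercile. This is illustrative only; the real CST_CategoricalEnsCombination supports the further methods and weighting described above:

```r
set.seed(7)
obs_clim <- rnorm(100)                            # observational record
terciles <- quantile(obs_clim, probs = c(1/3, 2/3))

members_sysA <- rnorm(25, mean = 0.3)             # system A ensemble
members_sysB <- rnorm(51, mean = 0.1)             # system B ensemble
pooled <- c(members_sysA, members_sysB)           # equal weight per member

probs <- c(below  = mean(pooled <  terciles[1]),
           normal = mean(pooled >= terciles[1] & pooled < terciles[2]),
           above  = mean(pooled >= terciles[2]))
```

Note that pooling gives the larger ensemble more influence; the "comb" method would instead average the per-system probabilities so that each system is weighted equally.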

CST_EnsClustering is a cluster analysis tool for ensemble predictions, based on the k-means algorithm. The aim is to group ensemble members according to similar characteristics and to select the most representative member for each cluster. The user chooses which feature of the data is used to group the ensemble members by clustering (e.g. the temporal mean). The anomaly of each member is computed with respect to the ensemble mean, and an EOF analysis is applied to these anomaly maps. After reducing the dimensionality via the EOF analysis, the k-means analysis is applied using the desired subset of principal components (PCs); the user can choose how many PCs to retain, or the percentage of explained variance to keep, for the EOF analysis.

Downscaling methods
Downscaling is designed to increase the resolution of a dataset. In a climate service chain, downscaling is a fundamental step to transform the climate simulations from their coarse resolution to the finer resolution required by many final users.

CST_RainFARM implements a stochastic downscaling technique and represents a so-called full-field weather generator. More specifically, this function generates synthetic fine-scale precipitation fields whose statistical properties are consistent with the small-scale statistics of observed precipitation, while preserving the properties of the large-scale precipitation field. The Rainfall Filtered Autoregressive Model (RainFARM; Rebora et al., 2006a, b) is based on the nonlinear transformation of a linearly correlated stochastic field generated by small-scale extrapolation of the Fourier spectrum of a large-scale precipitation field. Developed originally for downscaling data at weather timescales, the method has been adapted for downscaling at climate timescales by D'Onofrio et al. (2014) and recently improved for regions with complex orography (Terzago et al., 2018). This methodology relies on two distinct functions to compute the weights from high-resolution climatologies (CST_RFWeights) and the spatial spectral slope used to extrapolate the Fourier spectrum to the unresolved scales (CST_RFSlope).

CST_RFTemp implements a simple lapse-rate correction to a near-surface temperature field, to account for differences in orography between low- and high-resolution gridded datasets.
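A lapse-rate correction of this kind can be sketched in a few lines; the constant lapse rate of -6.5 K per km is a common assumption, and the values here are illustrative rather than taken from CST_RFTemp:

```r
lapse_rate <- -6.5 / 1000          # K per metre (assumed constant)
t_coarse   <- 288.0                # K, temperature on the coarse grid
z_coarse   <- 500                  # m, coarse-grid elevation
z_fine     <- c(300, 800, 1500)    # m, high-resolution elevations

# Adjust the coarse temperature for the orography difference at each
# high-resolution point
t_fine <- t_coarse + lapse_rate * (z_fine - z_coarse)
```

Points below the coarse-grid elevation are warmed and points above it are cooled, which restores the first-order altitude dependence lost at coarse resolution.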
ADAMONT (ADAptation of RCM outputs to MOuNTain regions; Verfaillie et al., 2017) is a downscaling method designed to adjust forecasts of daily variables. The method is based on the quantile mapping approach and originally relied on a regional reanalysis of hourly meteorological conditions. Two functions implementing ADAMONT have been included in CSTools: CST_AdamontQQcor computes a quantile mapping based on weather types for forecast data, while CST_AdamontAnalog uses these weather types to find analogous data in the reference dataset.

The CST_AnalogsPredictors function downscales low-resolution forecasts of precipitation or maximum/minimum temperature, in a domain centred over the Iberian Peninsula, through the association with an observational high-resolution dataset (Peral García et al., 2017) and a collection of predictors and reference synoptic situations similar to the forecast day. As a first step, a companion function, AnalogsPredictors_train, must be run to compare the large-scale atmospheric circulation to each of the atmospheric configurations from a reference period. The most similar days, defined by the Euclidean distance of the winds, are chosen as analogs.

Correction methods
Correction methods can improve the quality of simulations by reducing the systematic errors that are present in the forecasts due to model deficiencies. The periodicity of modes of variability (i.e. space-time patterns that tend to recur in the observed record) can also be exploited to improve the forecast skill.

Best Estimate Index (BEI) is a methodology that can be used to improve the forecast skill when a relationship exists between a climatological index and a given climate variable, as shown in Sánchez-García et al. (2019). BEI_Weights provides the weights to correct a forecast system, and CST_BEI_Weighting computes the ensemble mean or the tercile probabilities considering the weights returned by BEI_Weights.
Calibration can be considered as a way of obtaining predictions with average statistical properties similar to those of a reference dataset. CST_Calibration performs the correction of the forecast systems' simulations using five different member-by-member methodologies: the "bias" method corrects the mean bias only, while the "evmos" method applies a variance inflation technique to ensure the correction of the mean and the correspondence of variance between forecasts and observations (Van Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods "mse_min" and "crps_min" correct the bias, the overall forecast variance and the ensemble spread, as described in Doblas-Reyes et al. (2005) and Van Schaeybroeck and Vannitsem (2015), respectively. While the "mse_min" method minimizes a constrained mean-squared error using three parameters, the "crps_min" method features four parameters and minimizes the Continuous Ranked Probability Score (CRPS). The "rpc-based" method adjusts the forecast variance to ensure that the ratio of predictable components (RPC) is equal to one (Eade et al., 2014). The function allows the five calibration methods to be performed in leave-one-out cross-validation mode, which means that the observed value of the year that is being corrected is not considered in the calibration, as would be the case for real-time forecasts (Torralba et al., 2017).
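The two simpler corrections can be sketched conceptually in base R: mean bias removal (the idea behind the "bias" method) and a variance inflation that also matches the observed variance (the idea behind "evmos"). This is illustrative code, not the member-by-member implementation used in CST_Calibration:

```r
set.seed(3)
hcst <- rnorm(30, mean = 2, sd = 3)   # hindcast values with mean and spread bias
obs  <- rnorm(30, mean = 0, sd = 1)   # matching observations

# "bias"-style correction: adjust the mean only
hcst_bias <- hcst - mean(hcst) + mean(obs)

# variance-inflation-style correction: match mean and variance of the
# observations
hcst_inf <- (hcst - mean(hcst)) * sd(obs) / sd(hcst) + mean(obs)
```

The "mse_min" and "crps_min" methods go further by treating the ensemble mean and the deviations of individual members separately, so that the ensemble spread is corrected as well.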

The use of cross-validation is particularly important in order to avoid overestimating the skill when the hindcasts are calibrated. CST_BiasCorrection performs the same analysis as CST_Calibration with the "evmos" method, but allows either a hindcast or a forecast to be calibrated.
CST_QuantileMapping performs a quantile mapping adjustment by matching the probability distribution of a forecast with the probability distribution of a set of observations. The function calculates the relation between a set of past forecasts (i.e. hindcasts) and observations, and applies the correction to the hindcast itself or to a different forecast. It relies on the R package qmap (Gudmundsson et al., 2012; Gudmundsson, 2016). The user can set several parameters to define the distance between quantiles when adjusting the distribution, or the sample length in cases where the user wants to split the temporal dimension to apply separate adjustments.
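A minimal empirical quantile mapping can be sketched as follows: hindcast values are mapped onto the observed distribution through matching quantiles. The helper name qm_correct is hypothetical; CST_QuantileMapping (via qmap) offers many more options:

```r
set.seed(11)
hcst <- rgamma(300, shape = 2, scale = 3)   # biased hindcast values
obs  <- rgamma(300, shape = 2, scale = 1)   # observed values

qm_correct <- function(x, hcst, obs, probs = seq(0.01, 0.99, by = 0.01)) {
  qh <- quantile(hcst, probs)   # hindcast quantiles
  qo <- quantile(obs, probs)    # observed quantiles
  # map each value from the hindcast quantiles onto the observed quantiles,
  # interpolating linearly and clamping outside the fitted range
  approx(x = qh, y = qo, xout = x, rule = 2)$y
}

fcst_corr <- qm_correct(hcst, hcst, obs)
```

After the correction the hindcast quantiles coincide with the observed ones; in particular the corrected median matches the observed median.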

CST_DynBiasCorrection relies on the dynamical state of the system, rather than on its statistical properties, to correct the systematic errors. This method uses two dynamical system metrics to correct the bias of each ensemble member: the local (in phase space) dimension and the persistence. In simple terms, they describe the recurrences of the system around a state in phase space. The dimension provides information on how the system can reach a state and how it can evolve from it; it is thus a proxy for the system's active number of degrees of freedom. A very persistent state is typically highly predictable, while a very unstable state yields low persistence (Faranda et al., 2017; Faranda et al., 2019). The functions CST_ProxiesAttractor (to compute the local dimension d and the inverse of the persistence, theta) and Predictability (to compute predictability scores based on the dynamical indicators resulting from CST_ProxiesAttractor) are used internally by CST_DynBiasCorrection; they are also exposed for users interested in interpreting the method's intermediate results.

Verification functions
Verification is not the main objective of this package; for that purpose, we refer users to other R packages such as s2dverification, SpecsVerification and easyVerification. However, in order to facilitate the evaluation of the forecasts, some basic metrics have been included. CST_MultiMetric calculates the correlation, the root mean square error (RMSE) and the RMSE skill score against the observations for individual models and, if desired, for the multi-model mean (Mishra et al., 2019), as well as the ranked probability skill score (RPSS) based on terciles.

CST_MultivarRMSE calculates the RMSE using multiple variables, as the mean of each variable's RMSE scaled by its observed standard deviation. Variables can also be weighted based on their relative importance (as defined by the user).
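The idea behind CST_MultivarRMSE can be sketched with plain vectors: compute each variable's RMSE, scale it by that variable's observed standard deviation so the variables are comparable, and average with user-defined weights. This is an illustrative sketch, not the package implementation:

```r
set.seed(5)
fc_tas  <- rnorm(40, 285, 5);  ob_tas  <- rnorm(40, 285, 5)   # temperature, K
fc_prec <- rgamma(40, 2, 1);   ob_prec <- rgamma(40, 2, 1)    # precipitation

rmse <- function(f, o) sqrt(mean((f - o)^2))

# RMSE per variable, scaled by the observed standard deviation
scaled <- c(tas  = rmse(fc_tas,  ob_tas)  / sd(ob_tas),
            prec = rmse(fc_prec, ob_prec) / sd(ob_prec))

weights <- c(tas = 0.5, prec = 0.5)     # user-defined relative importance
multivar_rmse <- sum(weights * scaled) / sum(weights)
```

Scaling by the observed standard deviation prevents a variable with large physical units (e.g. pressure in Pa) from dominating the combined score.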

Visualization
Some of the most requested functionalities in climate services are data visualization tools that allow presenting large quantities of information in an intuitive way. All the visualization functions in CSTools can be customized through their parameters, and the resulting figures can be saved to files or displayed in a pop-up window.
PlotCombinedMap combines multiple two-dimensional datasets into a single map based on a decision function. In other words, several "maps" are provided as input, and for each "map" the function creates a colour legend. A decision function is used at each grid point to choose the value to be displayed, retaining in the process the information of which "map" it belongs to. For instance, the skill of multiple models could be compared over a region to visualize which model is best at each grid point (Mishra et al., 2019; Torralba et al., 2021). Other applications, such as comparing multiple variables, are also possible.
PlotMostLikelyQuantileMap allows different probabilities to be visualized easily. It receives as its main input (via the parameter "probs") a collection of longitude-latitude maps, each containing the probabilities (from 0 to 1) of the different grid cells belonging to a category: terciles, quantiles or others (Lledó et al., 2020a; Torralba, 2019). The function plots, for each grid point, the probability of the category with the maximum probability.
PlotForecastPDF plots the probability distribution functions of several ensemble forecasts in separate panels. By default, the function plots the ensemble members, the estimated density distributions and the tercile probabilities. Probabilities for extreme categories, above (below) the 90th (10th) percentile (hereafter P90 (P10)), and observed values can also be included. This function is useful to compare changes in forecasts with different lead times, as well as to compare forecasts from different models or different modes of variability (Lledó et al., 2020a).

It can sometimes be useful to present tabular results as colours instead of numbers. For this purpose, PlotTriangles4Categories converts a three-dimensional numerical data array into a coloured grid with triangles. This function can be used to quickly compare modes of variability, skill metrics, or differences between methods or forecast systems as a function of the lead times or seasons (Torralba, 2019; Verfaillie et al., 2021; Lledó et al., 2020b).

Examples of these visualization tools, as well as other functions of the package, are shown through the example case studies provided in the next section.

Use Cases
In order to demonstrate how CSTools can be used to provide climate information to potential users, we present three case studies which rely on CSTools for data post-processing. The first case study assesses whether seasonal forecasts could have anticipated the very strong near-surface winds over the Iberian Peninsula in March 2018, so as to provide useful information to the energy sector. In the second case, precipitation seasonal forecasts are post-processed following the requirements to use them to drive a model of snowpack depth at high mountain sites. Finally, we provide an example of how seasonal forecasts of rainfall and near-surface temperature can be post-processed to drive a hydrological model. Because of its strong impact on the market, there is a lot of interest in the energy sector in anticipating this type of event.

The use case presented here shows whether the 2018 event could have been anticipated by ECMWF System 5 a few months in advance. We note that the code could be adapted to other regions, time periods and variables, and a detailed description of the code is provided below for users interested in modifying the necessary parameters. First, the seasonal forecasts initialized in the three preceding months are considered.

For both the hindcasts and the forecasts, we use monthly means of 10 m wind speed from the ECMWF SEAS5 system, obtained from C3S at 1° spatial resolution. For the observational reference, we use monthly mean 100 m wind speeds from the ERA5 reanalysis (Hersbach et al., 2020) at 0.25° (around 30 km) spatial resolution. Winds at 100 m height are of relevance for energy applications and, although this variable is not available directly from the seasonal prediction system, the bias adjustment procedure will convert 10 m to 100 m winds by assuming a logarithmic wind profile (Drechsel et al., 2012). The different variable names must be specified in the CST_Load call through the parameter var, since the function needs to read the correct variable written in the NetCDF files in the data storage. Therefore, the var parameter is set to "sfcWind" when retrieving hindcasts and forecasts, while for the reference dataset it is set to "windagl100". Given the difference in spatial resolution, a regridding of the reference dataset is also requested via the parameter grid. The paths pointing to the simulations and the reference are also passed to the CST_Load function through the parameters exp and obs, respectively. Notice that the labels $STORE_FREQ$, $VAR_NAME$, $START_DATE$, $YEAR$ and $MONTH$ are used when defining the paths. These labels will be interpreted and substituted by the function following the information provided by the other parameters of CST_Load.
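The logarithmic extrapolation from 10 m to 100 m winds mentioned above can be sketched as follows. This is a plain Python illustration of the neutral log-profile formula, not the CSTools implementation; the roughness length value z0 is an illustrative assumption.

```python
import math

def extrapolate_wind(u10, z_ref=10.0, z_target=100.0, z0=0.0002):
    """Extrapolate wind speed from z_ref to z_target assuming a
    neutral logarithmic profile with roughness length z0 (metres).
    z0 = 0.0002 m is an illustrative open-water value, NOT the value
    used internally by CSTools."""
    return u10 * math.log(z_target / z0) / math.log(z_ref / z0)

# an 8 m/s wind at 10 m becomes a somewhat stronger wind at 100 m
u100 = extrapolate_wind(8.0)
```

Under this profile the 100 m wind is always stronger than the 10 m wind, which is why the bias adjustment against ERA5 100 m winds can absorb both the height conversion and the model bias in one step.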
An index mm indicating the number of preceding months is introduced to loop over the three start dates in order to simplify the code. When mm is 1, the bias adjustment for March with the forecasts initialized one month in advance (i.e. the February start date) is computed. The target year is set in the "year" variable as 2018. The start dates of the simulations to be loaded are created and stored in the "hcst_sdates" and "fcst_sdates" variables, which correspond to a vector of dates for the 1st of February from 1993 to 2017 and the 1st of February 2018, respectively. For the February start date, lead time two (i.e. mm + 1) corresponds to the forecast for March, which is selected through the leadtimemin and leadtimemax parameters.
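The construction of the start-date vectors described above can be sketched in a few lines. This is a Python illustration of the "YYYYMM01" strings the text describes, not the actual R code of the use case:

```python
def start_dates(month, first_year, last_year):
    """Build 'YYYYMM01' start-date strings for the 1st of a given
    month over an inclusive range of years."""
    return [f"{y}{month:02d}01" for y in range(first_year, last_year + 1)]

mm = 1                    # forecasts issued one month ahead of March
init_month = 3 - mm       # February initialization
hcst_sdates = start_dates(init_month, 1993, 2017)  # hindcast start dates
fcst_sdates = start_dates(init_month, 2018, 2018)  # forecast start date
lead_time = mm + 1        # lead time 2 selects March from a February start
```

Looping mm over 1, 2 and 3 reproduces the February, January and December initializations discussed in the use case.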
Individual ensemble members typically also suggest much weaker wind speed anomalies than observed, except for one member in the February initialization, indicating that in this case the prediction system anticipated this situation as a potential outcome.
https://doi.org/10.5194/gmd-2021-368 Preprint. Discussion started: 6 December 2021 © Author(s) 2021. CC BY 4.0 License.

Figure 3: Seasonal forecasts of wind speed at 100 m height, averaged over 10 ºW-4 ºE and 36-44 ºN for March 2018. Each panel corresponds to forecasts launched 3 to 1 month(s) ahead (from left to right). Methodology: simple bias correction with ERA5 observations, based on previous hindcasts since 1993. An asterisk indicates the tercile with the highest probabilities.
The spatial distribution of the tercile probabilities can be displayed with the PlotMostLikelyQuantileMap function (Fig. 4).
An extra layer has been included to mark with crosses the grid points where observations agree on the most likely tercile indicated by the forecast. Three months in advance, most of the region shows that the tercile of highest probability is the below-normal category. One and two month(s) in advance, the colours shift towards the normal and above-normal categories. In the January simulation, the eastern region presents more above-normal probability of high wind speed values than the western region. In the February simulation, the above-normal probability class is widespread over the whole Iberian Peninsula.
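The tercile probabilities behind maps like Fig. 4 amount to counting ensemble members in each category. The following sketch is illustrative only (the tercile thresholds would come from the hindcast climatology); it is not the CSTools implementation:

```python
def tercile_probabilities(members, lower, upper):
    """Fraction of ensemble members falling below, between and above
    the two tercile thresholds of the hindcast climatology."""
    n = len(members)
    below = sum(m < lower for m in members) / n
    above = sum(m > upper for m in members) / n
    return below, 1.0 - below - above, above

# six hypothetical members against illustrative tercile bounds
probs = tercile_probabilities([1.2, 2.5, 3.1, 0.8, 2.9, 3.5],
                              lower=1.5, upper=3.0)
# a most-likely tercile would only be coloured on the map if
# max(probs) exceeds 0.4, matching the white grid points of Fig. 4
```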

Figure 4: Probabilities of the most likely tercile for the March 2018 100 m wind speeds, as indicated by the forecasts issued 3 to 1 month(s) ahead (top to bottom). The crosses indicate that the observations fell into the most likely tercile displayed by the forecast. White grid points indicate that no tercile category has more than 40 % of probability.
Users who can benefit from climate information, such as stakeholders (e.g. energy system planners), are usually not familiar with probabilistic forecasts and the added value they could potentially bring to their planning. In order to become more autonomous in their decision making, a learning process could be started based on relevant show-case climate events such as the one provided here. Therefore, such a use case could be of interest for climate services developers who need to post-process a seasonal forecast variable and present the results in a concise yet user-friendly manner with a reduced number of images and tables.

When an unsatisfactory outcome happens because of unfavourable atmospheric conditions, even when a seasonal forecast was included in the decision strategy, this code could be used to evaluate whether the seasonal forecast included the possibility of an unsatisfactory outcome, whether other variables better captured the situation in that case, or whether a different bias correction method would improve the skill of the seasonal forecast. Furthermore, it is possible to easily modify this code to compare the results provided by different models.

Use case 2: Seasonal forecast of snow depth and snow water equivalent in high-elevation sites
Snowpack in the mountains represents an essential water reservoir that is fed by snowfall during the cold season and then released in late spring and summer. Mountain meltwater is essential for several economic activities including hydropower generation, agriculture and industry, and meltwater shortage can induce strong economic losses. Therefore, reliable seasonal forecasts of snow resources that, at the beginning of the snow season (November), estimate the snow accumulation towards the end of spring (April-May) are highly sought after. These would allow water management authorities and hydropower companies to implement early water management plans several months ahead of a water-demand peak and mitigate the effects of a possible water shortage. To support this need, a modelling chain driven by seasonal forecasts of meteorological variables from the C3S seasonal forecasting systems was developed, employing the physical 1-dimensional snow model SNOWPACK (Bartelt and Lehning, 2002), to estimate snow depth and snow water equivalent at selected high-elevation sites in the north-western Italian Alps. A general scheme of this application is shown in Fig. 5. Important decisions include the downscaling method, the target region, the simulation and observation datasets as well as the season of interest. In this case, the region is the Alpine mountain range in central Europe (42-49 °N, 4-11 °E), including the high-elevation stations for which the SNOWPACK model is run.

The RainFARM downscaling method incorporated within CSTools is employed to downscale precipitation, which is then used as input for the SNOWPACK model. This method allows taking into account the orographic effects on the precipitation distribution and generates a user-defined number of stochastic downscaling realizations for each member of the original seasonal forecast simulations. For each ensemble member of the seasonal forecast model, we generate 10 stochastic downscaling realizations. In the following subsection, we present the method applied to the SEAS5 model providing 25 ensemble members, such that at the end of the downscaling procedure we obtain a total of 250 fine-scale precipitation fields.
Since the RainFARM downscaling relies on the estimation of the spatial power spectrum of precipitation fields, a square domain is required. Moreover, this domain has to be larger than the target study area to avoid artifacts/border effects within the target area. The target season is winter, so the 1st November start date simulations available for the period 1993-2018 are considered. Daily precipitation data of SEAS5 are downscaled from the original 1° spatial resolution. The reference datasets employed are i) the ERA5 daily precipitation reanalysis at 0.25° (around 30 km) spatial resolution (Hersbach et al., 2020) for the bias correction and for the estimation of the spectral slopes, and ii) the WorldClim2 monthly climatology at 1 km spatial resolution (Fick and Hijmans, 2017) for generating the precipitation weights to obtain a more realistic distribution. In the scheme (Fig. 5), three steps should be carried out before applying the RainFARM method. In steps 1 and 3, the necessary parameters are computed: the spectral slopes and the orographic weights; in step 2, quantile mapping correction is applied to the seasonal forecast in order to correct the bias of the model; and the downscaling itself is computed in step 4.
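The quantile mapping of step 2 maps each forecast value onto the observed value of the same climatological rank. The following is a minimal empirical sketch, assuming sorted climatologies of equal length; the actual CSTools function is more general:

```python
def quantile_map(value, model_clim, obs_clim):
    """Empirical quantile mapping: find the rank of `value` in the
    model climatology and return the observed value of the same rank.
    Illustrative sketch only, not the CSTools implementation."""
    model_sorted = sorted(model_clim)
    obs_sorted = sorted(obs_clim)
    rank = sum(m <= value for m in model_sorted) - 1  # 0-based rank
    rank = max(0, min(rank, len(obs_sorted) - 1))     # clamp to range
    return obs_sorted[rank]

# a model climatology that is systematically 2 units too wet
corrected = quantile_map(5.0, model_clim=[3, 4, 5, 6, 7],
                         obs_clim=[1, 2, 3, 4, 5])
```

Because the mapping is rank-based, it corrects the whole distribution rather than just the mean, which matters for precipitation.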
All computations performed in the first step only require the CSTools package. As mentioned above, the spectral slope is calculated using ERA5 at its original resolution and over a larger domain than the target region (37.5-53.25 °N, 2.5-18.25 °E). The path pattern to the data is defined using labels: $STORE_FREQ$, $VAR_NAME$, $YEAR$ and $MONTH$.
These labels will be interpreted by CST_Load. For instance, $VAR_NAME$ will be substituted by the information passed by the parameter var, which in this case is "prlr" (precipitation rate), and $YEAR$ and $MONTH$ will be interpreted from the CST_Load sdates parameter, which requires a vector of dates in the format "YYYYMM01", where YYYY is the year and MM the month. Then, CST_Load retrieves the data from files and arranges it with the following dimensions: dataset, of length 1 since only ERA5 is being requested; member = 1, since this reanalysis only provides one simulation; sdate, of length 312, which corresponds to the 26 years of 12 months defined in the object "years"; and an ftime dimension of up to 31, corresponding to each day of the month. The remaining dimensions, lat and lon, correspond to the square domain requested in CST_Load. Given that CST_Load splits the time series among the sdate and ftime dimensions when specifying a forecast dataset, our ERA5 path pattern has been requested through this option. On the other hand, when specifying the ERA5 path pattern as an observational dataset (in the obs parameter), the function returns a continuous time series from 1993 to 2018, which is less convenient for our purposes here.
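The label mechanism described above amounts to template substitution. The sketch below illustrates the idea in Python with a hypothetical path template; it is not the CST_Load internals:

```python
def expand_path(template, **labels):
    """Substitute $LABEL$ placeholders in a path template.
    A sketch of the label mechanism described in the text, not the
    actual CST_Load implementation; the template below is hypothetical."""
    for key, value in labels.items():
        template = template.replace(f"${key}$", str(value))
    return template

path = expand_path(
    "/data/$STORE_FREQ$/$VAR_NAME$/$VAR_NAME$_$YEAR$$MONTH$.nc",
    STORE_FREQ="daily", VAR_NAME="prlr", YEAR=1993, MONTH="11",
)
```

CST_Load performs this substitution once per start date and variable, which is how a single path pattern covers the whole 1993-2018 archive.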

In the second step, we load the data, taking advantage of the library zeallot (Teetor, 2018), which allows us to simplify our code by using an advanced version of the assignment operator (%<-%). Again, the paths to the necessary data must be defined using labels: for the forecast data the path points to the SEAS5 dataset, while for the reference data the path points to the ERA5 reanalysis. Thanks to CST_Load, these datasets are reshaped onto a common grid, which, by default, is the grid of the first dataset provided, i.e. the SEAS5 grid. The vector "StartDates", which defines the period of study for November 1st simulations, is then assigned to the sdates parameter.

Code Step 2.
Step 3 computes the orographic weights from a fine-scale precipitation climatology. In this case, the WorldClim2 precipitation dataset at 30 arc-seconds resolution is used, although other high-resolution climatologies could be used. The WorldClim2 dataset is formatted in tiff files that can be automatically downloaded into the R session thanks to the raster library (Hijmans, 2020). The piece of code for Step 3 shows how to compute the orographic weights for all individual months at once: getting the data from the remote dataset, subsetting the Alps region with a small increment to correctly compute the interpolation (3.5-11.5 °E, 41.5-49.5 °N), and storing the data in an "s2dv_cube" object to be passed to CST_RFWeights.

The target resolution is the one most suitable for each specific application. To run the SNOWPACK model, we are interested in the local scale and we choose a target resolution of 0.01°, corresponding to about 1 km. Therefore, the weights and the RainFARM method (step 4) would be computed with a refinement factor (nf) of 100. However, such a high refinement factor implies a rather large computational load, and here we show the code using a refinement factor of 4. We recommend that users approximately calculate the expected size of the final output, as follows: the original data input to the downscaling step has 25 members, 26 start dates and 31 daily lead times over 8 months, covering a region of 8 by 8 grid points, which at 8 bytes per value amounts to ~80 MB; this size will increase by a factor of 10 for the stochastic realizations and by the square of the refinement factor, since the refinement applies to both spatial dimensions. For a refinement factor of 100 (4), the expected output is ~80 MB x 10 x 100 x 100 (~80 MB x 10 x 4 x 4), so around 8 TB (12.5 GB). Users need to consider that the data size also has implications for the computation time. The result of step 3 is an array with spatial dimensions and an extra dimension for each month, containing the weights for which values greater (lower) than 1 will amplify (reduce) the precipitation signal from the seasonal forecast (Fig. 6).
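The size estimate above can be reproduced with simple arithmetic, assuming 8-byte double-precision values:

```python
members, sdates, days, months = 25, 26, 31, 8
grid = 8 * 8                 # 8 x 8 grid points
bytes_per_value = 8          # double precision

# input to the downscaling step: roughly 80 MB
input_mb = members * sdates * days * months * grid * bytes_per_value / 1e6

def output_size_mb(input_mb, realizations, nf):
    """Output grows with the number of stochastic realizations and
    with nf^2, since the refinement applies to both spatial dimensions."""
    return input_mb * realizations * nf ** 2

small = output_size_mb(input_mb, 10, 4)    # on the order of 12-13 GB
large = output_size_mb(input_mb, 10, 100)  # on the order of 8 TB
```

This back-of-the-envelope check is worth running before launching the full nf = 100 downscaling, since the output size, not the input, dominates the resource requirements.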

Finally, the downscaling method is run in step 4 using the corrected forecast and the slope and weights computed in the previous steps. Note that this code requires high memory resources, although the computation can be split by start date and realization if necessary. Figure 6 shows the spatial resolution improvement given by RainFARM for a specific date when applying a refinement factor of 4 or 100.
• Seasonal temperature forecasts were bias-corrected using the daily annual climatology of station observations.
• All other variables are bilinearly interpolated to the coordinates of the study sites.
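The bilinear interpolation to the site coordinates mentioned above can be sketched for a single grid cell; this is an illustrative formula, not the interpolation routine used in the modelling chain:

```python
def bilinear(f00, f10, f01, f11, x, y):
    """Bilinear interpolation inside a unit grid cell: f00..f11 are
    the values at the four surrounding grid points and (x, y) the
    fractional position of the site within the cell (0..1)."""
    return (f00 * (1 - x) * (1 - y) + f10 * x * (1 - y)
            + f01 * (1 - x) * y + f11 * x * y)

# a site at the centre of the cell receives the mean of the corners
value = bilinear(1.0, 2.0, 3.0, 4.0, 0.5, 0.5)
```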

After the spatial downscaling, seasonal forecast data are interpolated to one-hour temporal resolution with different methods depending on the variable (Terzago et al., in preparation). Figure 7 shows an example of the SNOWPACK model output and, specifically, the snow depth forecasts obtained from the SEAS5 forecast initialized on the 1st of November 2006, for the area including the station of Bocchetta delle Pisse, 2410 m above sea level (a.s.l.). For each forecast, we obtain an ensemble of 250 snow depth/SWE simulations, derived from ten downscaling realizations of the precipitation field produced by each of the 25 ensemble members of the SEAS5 forecast system. SEAS5-SNOWPACK forecasts are able to reproduce the variability of the observed snow depth (Fig. 7). For 2006/2007 in particular, the median forecast is lower than the model climatology, which is consistent with the low amount of snow that was observed that winter.

In this last use case, we provide downscaled and bias-adjusted seasonal forecasts as input for the hydrological model SCHEME (Baguis et al., 2010), to simulate the river flow of the longest river in Greece, the Aliakmon. This will allow the generation of a seasonal ensemble prediction system that provides outlooks of water availability for hydropower and irrigation for the Aliakmon basin (Fig. B1).
The SCHEME hydrological model is the semi-distributed version of the daily time-step lumped model developed by Bultot and Dupriez (1976). This model was first designed for the Scheldt and the Meuse River basins in Belgium and northern France to estimate the impact of climate change on the hydrological cycle, but it has since been used for hydrological predictions at the seasonal scale (Roulin and Vannitsem, 2005) and in data assimilation of large-scale satellite soil moisture (Baguis and Roulin, 2017). In addition to daily rainfall, the SCHEME model requires daily minimum, maximum and average temperature as input to calculate the potential evapotranspiration using the Hargreaves and Samani equation (e.g. Oudin et al., 2005). These will be provided for a domain covering the Aliakmon basin and a large part of Greece.
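The Hargreaves and Samani equation mentioned above can be written out as follows. This is the standard published form of the formula, shown here for illustration; the SCHEME model's exact implementation may differ in details such as the radiation term:

```python
import math

def hargreaves_samani(t_min, t_max, t_mean, ra):
    """Potential evapotranspiration (mm/day) after Hargreaves and
    Samani. `ra` is the extraterrestrial radiation expressed in mm/day
    of evaporation equivalent; 0.0023 is the standard coefficient.
    Temperatures are in degrees Celsius."""
    return 0.0023 * ra * (t_mean + 17.8) * math.sqrt(t_max - t_min)

# an illustrative warm day with a 14 degree diurnal range
pet = hargreaves_samani(t_min=10.0, t_max=24.0, t_mean=17.0, ra=15.0)
```

The diurnal temperature range acts as a proxy for cloudiness, which is why the model needs daily minimum and maximum temperature and not just the mean.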
For this use case, post-processing the seasonal forecasts is absolutely necessary, as biases must be removed and a much higher spatial resolution is needed. Indeed, while typical seasonal forecasts are provided at a resolution of around 100 km, the spatial resolution required by the hydrological model is 5 km. Our tailored approach focuses on providing high-quality and high-resolution temperature and precipitation forecasts. An analog approach is used, which combines both the synoptic-scale pressure over Europe and the regional-scale rainfall over Greece. For the best analog day, the high-resolution fields of both precipitation and temperature are then retrieved. Fig. 8 provides the step-by-step structure of the methodology used, while Fig. 9 provides a visual representation of the post-processing chain. As shown in Fig. 8, the overall methodology can be separated into a bias adjustment phase (steps 1-2) and a downscaling phase (steps 3-5) that uses an analog approach. The bias adjustment phase starts by loading (using CST_Load) the daily forecast and observational rainfall data over Greece at 1° resolution. As forecast data, we take the 25 members of SEAS5 from 1993-2019 initialized in May, while the reference is the CHIRPS dataset upscaled to 1° resolution for the same time period. In step 2, these daily rainfall forecasts are bias-adjusted (using CST_Calibration with method bias) against the CHIRPS dataset. The bias adjustment is done per month of lead time using a leave-one-out (cross-validation) approach. Breaking up the data per month was done using the CST_SplitDim function.
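The leave-one-out scheme described above can be sketched for a simple mean-bias adjustment. This Python illustration is not the CST_Calibration implementation; it only shows why the corrected year never sees its own observation:

```python
def loo_bias_adjust(forecasts, observations):
    """Leave-one-out mean-bias adjustment: each year's forecast is
    corrected with the mean bias computed from all OTHER years, so
    the correction is cross-validated (the target year's observation
    never enters its own correction)."""
    n = len(forecasts)
    adjusted = []
    for i in range(n):
        others = [k for k in range(n) if k != i]
        bias = sum(forecasts[k] - observations[k] for k in others) / (n - 1)
        adjusted.append(forecasts[i] - bias)
    return adjusted

# a forecast that is consistently 1 unit too wet is corrected exactly
adj = loo_bias_adjust([3.0, 4.0, 5.0], [2.0, 3.0, 4.0])
```

In the use case this correction is applied independently for each month of lead time, so seasonally varying biases are handled separately.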

The downscaling phase starts in step 3 (see Fig. 8) by loading the sea-level pressure (SLP) of the forecast and reference at 1° resolution over Europe. The reference dataset in this case is ERA5 (Hersbach et al., 2020), and upscaling is applied to obtain the 1° resolution. In step 4, both the bias-adjusted precipitation fields over Greece and the SLP anomaly fields over Europe of a particular forecast day are selected. These fields are then compared to all fields of a large climatological reference dataset in order to find the best analog (using the Analogs function). This dataset covers the period 1993-2019 but includes only days in the same month as the selected day (and excludes the selected day). Separating the data per month was again done using the CST_SplitDim function. The criterion to find the best analog is called Local_dist and minimizes the Euclidean distances of the large-scale SLP and the local-scale rainfall patterns, both at 1° resolution. Finally, for the day corresponding to the best analog, the CHIRPS precipitation field at 0.05° resolution is considered as the bias-adjusted and downscaled precipitation forecast (Fig. 9c and 9d). In order to obtain the temperature, the ERA5-Land dataset over Greece is considered for the day of the best analog. More specifically, the ERA5-Land daily minimum, maximum and average temperature at 0.1° resolution are downscaled to a resolution of 0.05° using lapse-rate height corrections. Finally, the downscaling procedure (steps 4-5) is iterated over all days and all ensemble members of the seasonal forecast in order to obtain a fully bias-corrected and downscaled seasonal forecast over Greece.
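The analog selection of step 4 can be sketched as a combined distance minimization. The following is an illustrative Python version of a Local_dist-style criterion with flattened fields and toy data; it is not the CSTools Analogs implementation:

```python
import math

def euclid(a, b):
    """Euclidean distance between two flattened fields."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_analog(slp_day, rain_day, slp_archive, rain_archive):
    """Index of the archive day minimizing the sum of the Euclidean
    distances of the SLP and rainfall patterns (a sketch of a
    Local_dist-style criterion, not the CSTools implementation)."""
    dists = [euclid(slp_day, s) + euclid(rain_day, r)
             for s, r in zip(slp_archive, rain_archive)]
    return dists.index(min(dists))

# toy fields of two grid points each: day 1 of the archive matches best
idx = best_analog([1.0, 2.0], [0.5, 0.5],
                  [[9.0, 9.0], [1.1, 2.1]],
                  [[5.0, 5.0], [0.4, 0.6]])
```

Once the best analog day is found, its index is all that is needed to pull the corresponding high-resolution CHIRPS and ERA5-Land fields.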

Conclusions
CSTools contains state-of-the-art methods to post-process seasonal forecast datasets, especially focusing on statistical correction and downscaling methods as well as classification methods. These methods are highly valued in the community given the need to correct intrinsic systematic model errors and the need of many final applications for forecasts at a higher resolution than that originally provided by the forecast systems. On the other hand, the visualization tools tailored for probabilistic forecasts are able to summarize the results in a concise yet user-friendly manner (see e.g. Fig. 4).

Three use cases showcased the ability of the CSTools R package to successfully post-process seasonal forecasts in the context of scientifically advanced impact modelling. The final users potentially interested in these three use cases represent a classical current-day sample of the users (and disciplines) that can benefit from CSTools. The energy sector can see the utility of seasonal forecasts post-processed with CSTools in all the use cases presented: the first one showed the potential of seasonal forecasts to anticipate high wind speed events in the Iberian Peninsula and the impact they had on energy production and prices; the second and third cases could be of high interest for the hydropower sector, since foreseeing months in advance the snow depth at high altitudes and the stream flow in catchments may allow hydropower managers to plan their activity. Similarly, these use cases are relevant for the risk management of high wind speed, coastal and flooding events.
Two aspects of the CSTools design are highly valuable: the data loading step, in which the user can get the forecast and the reference dataset in a common structure (i.e. the "s2dv_cube" object), simplifying the subsequent data manipulation steps, and the internal use of the multiApply package in the data processing functions, making them flexible to work with any number of dimensions and allowing parallel computation.
The development guidelines are a fundamental piece of documentation for the future extension of the package when new state-of-the-art methods are required or become available. These guidelines are already being adopted by another R package