CORDEX-WRF v 1 . 3 : Development of a module for the Weather Research and Forecasting ( WRF ) model to support the CORDEX community

The ‘Coordinated Regional Climate Downscaling Experiment’ (CORDEX) is a scientific effort of the World Climate Research Program (WRCP) for the coordination of regional climate initiatives. In order to accept an experiment, CORDEX provides experiment guidelines, specifications of regional domains and data access/archiving. CORDEX experiments are important to study climate at the regional scale, and at the same time, they also have a very prominent goal in providing regional climate data of high quality. Data requirements are intended to cover all the possible needs of stake holders, and scientists work5 ing on climate change mitigation and adaptation policies in various scientific communities. The required data and diagnostics are grouped into different levels of frequency, priority, and some of them even have to be provided as statistics (minimum, maximum, mean) over different time periods. Most commonly, scientists need to post-process the raw output of regional climate models, since the latter was not originally designed to meet the specific CORDEX data requirements. This post-processing procedure includes the computation of diagnostics, statistics, and final homogenization of the data, which is often computa10 tionally costly and time consuming. Therefore, the development of specialized software and/or code is required. The current paper presents the development of a specialized module (version 1.3) for the Weather Research and Forecasting (WRF) model, capable of outputting the required CORDEX variables. Additional diagnostic variables not required by CORDEX, but of potential interest to the regional climate modeling community, are also included in the module. ’Generic’ definitions of variables are adopted in order to overcome model and/or physics parameterization dependence of certain diagnostics and variables, thus 15 facilitating a robust comparison among simulations. The module is computationally optimized, and the output is divided in different priority levels following CORDEX specifications (Core, Tier1, and additional) by selecting pre-compilation flags. 1 Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2018-241 Manuscript under review for journal Geosci. Model Dev. Discussion started: 1 November 2018 c © Author(s) 2018. CC BY 4.0 License.


Introduction
Regional climate downscaling pursues the use of limited area models (LAMs) to perform climate studies and analysis (Giorgi and Mearns, 1991).It is based on the premise that, by using LAMs, modelers can simulate the climate over a region at higher resolution compared to global climate models (GCMs).Therefore, certain aspects of the climate system can be better represented due to the higher resolution and higher complexity of the parameterizations (inherent in LAMs) used to simulate physical processes, which cannot be explicitly resolved (e.g., shortwave and longwave radiation, turbulence, dynamics of water species).This methodology has been widely used for studying climate features, connections, and processes (Jaeger and Seneviratne, 2011;Knist et al., 2014;Kotlarski et al., 2017) and to produce climate data within the scope of continental, national, or regional climate change studies.The Coordinated Regional Climate Downscaling Experiment (CORDEX; http://www.cordex.org/,last access: 18 March 2019) of the World Climate Research Program (WRCP) aims to organize different initiatives devoted to regional climate all around the globe following a similar experimental design (Giorgi et al., 2009;Giorgi and Gutowski, 2015).CORDEX, with the second phase currently under discussion, attempts to establish a series of criteria for dynamical downscaling experiments, which includes setting common domain specifications and horizontal resolutions in order to make sure that all the continental areas of the Earth are under study (e.g., in 2010 Africa was a priority and researchers worldwide volunteered to contribute with their own simulations).Furthermore, CORDEX sets a series of model configurations (e.g., GCM forcing, greenhouse gas (GHG) evolution) to ensure that model simulations are carried out under similar conditions and are therefore intercomparable.At the same time, CORDEX requires a list of variables necessary for the later use of model data in multimodel analysis and other climate-related research activities like climate change mitigation, adaptation, and stakeholder decision-making policies.In order to maximize and facilitate data access (mostly made available by the Earth System Grid Federation, ESGF; https://esgf.llnl.gov/,last access: 18 March 2019), these data also have to be provided following a series of homogenization criteria known as climate and forecast (CF) compliant (http://cfconventions.org/,last access: 18 March 2019), which come from the Coupled Model Intercomparison Project (CMIP) exercises.The list of variables required by CORDEX consists of standard model fields and some diagnostics in certain frequencies, as well as statistical aggregations such as minimum, maximum, or mean, for a given period.These variables are grouped into different priority levels (Core, Tier 1, and Tier 2), with Core being the mandatory list of variables (see Appendix A for more details).
The production of these datasets is not a simple task and usually represents a big issue for the modeling community.Regional climate experiments tend to produce large amounts of data, since scientists simulate long time periods at high resolutions.Modelers have to code software at least capable of (1) computing a series of diagnostics, (2) concatenating model output, (3) performing statistical temporal computations, and (4) producing data following CF-compliant (i.e., cmorization) criteria in NetCDF format (NetCDF stands for Network Common Data Form, a binary self-describing and machine-independent file format; https://www.unidata.ucar.edu/software/netcdf/, last access: 18 March 2019).Aside from being time-consuming due to complexity and process management, this codification also implies certain duplication of huge datasets and additional consumption of computational resources.
Several tools (e.g., NetCDF operators -NCOs, climate data operators -CDOs) exist to facilitate the manipulation of NetCDF files (extract, concatenate, average, join, etc.), and there are also other post-processing initiatives that have been made available, especially to the Weather Research and Forecasting (WRF; http://www.mmm.ucar.edu/wrf/users/,last access: 18 March 2019; Skamarock et al., 2008)  WRF is a popular model for regional climate downscaling experiments.It is used worldwide in different CORDEX domains (Fu et al., 2005;Mearns et al., 2009;Nikulin et al., 2012;Domínguez et al., 2013;Vautard et al., 2013;Evans et al., 2014;Katragkou et al., 2015;Ruti et al., 2016).The model was initially designed for short-term simulations at high resolutions, but a series of modifications that have been introduced to the model code so far have enhanced its capabilities and made it appropriate for climate experiments (Fita et al., 2010).Since WRF does not directly provide most of the required variables for CORDEX and due to the complexity of the post-processing procedures, many existing WRF climate simulations are not publicly available to the community.
This new module comes to complement the modifications introduced in CLimate WRF (clWRF; http://www.meteo.unican.es/wiki/cordexwrf/SoftwareTools/ClWrf,last access: 18 March 2019; Fita et al., 2010).In clWRF climate statistical values (such as minimum, maximum, and mean values) of certain surface variables were introduced into the model.At the same time, the evolution of greenhouse gases (CO 2 , N 2 O, CH 4 , CFC-11, CFC-12) can be selected from an ASCII file instead of being hard-coded.Before these modifications, WRF users could only retrieve statistical values by post-processing the standard output of the model (at a certain frequency).With the clWRF modifications (incorporated into the WRF source code since version 3.5) statistical values are directly computed during model integration.This new CORDEX module proposes one step further by incorporating a series of new variables and diagnostics that are important for climate studies and WRF users can currently only obtain by post-processing the standard model output.At the same time, additional variables have been added into the WRF capabilities of output at pressure levels.In the current module version if the "adaptive time-step" option is enabled in WRF, some diagnostics related to time-step selection (e.g., precipitation, sunshine duration, etc.) will not be calculated properly because there is no proper adaptation.
We present a series of modifications to the model code and a new module (version 1.3) that will enable climate researchers using WRF to get almost all the CORDEX variables directly in the model output.With the use of this module, production of the data for regional climate purposes will become easier and faster.These modifications directly provide the required fields and variables (Core and almost all Tier 1; see Appendix A for more details) during model integration and aim to avoid the post-processing of the WRF output up to a certain level.However, in this version, they do not cover all the previously mentioned aspects of the task, such as computation of statistics and the cmorization of the data.
New variables and diagnostics will be provided at the user-selected output frequency.The user still needs to postprocess the data in order to obtain the different statistics required by CORDEX at daily, monthly, and seasonal periods.The data cmorization can be defined as a series of processes that need to be applied to the model output in order to meet the standards provided under the CF guidelines (which follows the CMOR standard; https://pcmdi.github.io/cmor-site/,last access: 18 March 2019).These guidelines are designed to facilitate comparison between climate models, and they represent the standard for the Coupled Model Intercomparison Project (CMIP; https://cmip.llnl.gov/,last access: 18 March 2019).This standardization includes the file names, variable names, and metadata (units, standard name, and long name), specification of geographical projections, and time axis.In order to achieve a complete CF standardization of WRF output in complete agreement with the CF requirements, substantial changes to the WRF input/output (I/O) tools would be required.This would affect backward compatibility and it has been decided to pursue this in upcoming module updates.Therefore, the users of the CORDEX-WRF module will still need to perform part of the standardization.This includes joining and/or concatenating WRF files, making use of standard names and attributes of the variables and file names, and finally providing the right variable with the standard attributes to describe the time coordinate.
The module also aims to establish a series of homogenizations for certain diagnostics.These diagnostics can be computed following different methodologies, and consequently they may be model and/or even physical parameterization dependent.In order to avoid dependency on the model configuration (mainly sensitivity to the choice of the various available physical schemes), and to allow for a fair comparison between different simulations, a series of additional "generic" definitions of some diagnostics are presented when possible.
The modification of the WRF model code was initiated within the development of the regional climate simulation platform from the Institute Pierre Simone Laplace (IPSL) (https://sourcesup.renater.fr/wiki/morcemed/Home,last access: 18 March 2019) and the CORDEX Flagship Pilot Study (CORDEX-FPS), "Europe+Mediterranean; Convective phenomena at high resolution over Europe and the Mediterranean" (Coppola et al., 2018), in order to obtain the variables required for the CORDEX experiment (available at https://www.hymex.org/cordexfps-convection/wiki/doku.php?id=protocol, last access: 18 March 2019) and share the code among WRF users of the CORDEX-FPS experiment.
In this work the complete module is presented, its capabilities are demonstrated, and the results of several diagnostics are shown in order to illustrate the accuracy of the implementation.The initial section of the paper describes the modifications that have been introduced into the code, followed by a description of the variables required by CORDEX.The following section demonstrates the performance tests and gives a description of aspects that are currently missing, but will be added in the upcoming module versions.The paper finishes with a discussion and outlook.

The CORDEX module
Here we present the module and explain the modifications introduced.The steps necessary in order to compile and use the module are provided as well.For a complete and detailed description of the steps to follow, the reader is referred to the Wiki page of the module: http://wiki.cima.fcen.uba.ar/mediawiki/index.php/CDXWRF(last access: 18 March 2019).The most common README file provided with the module is labeled README.cordex.The module has been implemented following the standards of modularity, which facilitates the upgrading and introduction of new variables to it.cessing System) for the preparation of the initial and boundary conditions and the model itself.The source of the code is not fully provided.A pre-compilation process is carried out in order to automatically write certain parts of the code according to a series of ASCII files, and activation of certain parts of the code rely on pre-compilation flags.With the pre-compilation flags users can determine optional aspects of the model related to technical aspects of the compilation and the use of certain components like the incorporation of the Community Land Model version 4 (Oleson et al., 2010;Lawrence et al., 2011).Large parts of the code that are automatically written are related to the input/output of the model.There are a series of ASCII files provided in the Registry folder of the model called registry.These files contain the characteristics of the variables, mainly the name of the variable during execution, the rank and dimensions of the variable, assigned output file, the name of the variable in the output file, description of the variable, and units.The WRF model keeps all the variables in a Fortran pointer-derived type (called grid).At the same time, the WRF model setup is managed though the use of a Fortran namelist statement that reads the ASCII file called namelist.input,which has different sections.WRF manages the output via different streams (usually up to 23) with the standard output (wrfout+ files) being the number 0. The WRF model integrates the atmosphere using η as a vertical variable (see more detail in Skamarock et al., 2008) defined in Eq. ( 1) (where p surf is surface pressure, p top is pressure at top, p is hydrostatic pressure, η = 1 at the surface, and η = 0 at the top of the atmosphere).WRF uses three horizontal grids (two C grids staggered for winds) and two sets of vertical coordinates (one staggered known as the "full" η levels, with the un-staggered as the "half"-η levels).

Module implementation
The module is accompanied by a new registry file called Registry/registry.cordex in which the variables and namelist parameters related to the module are defined.The specific setup of the module is managed in the WRF namelist in a new section called cordex.Aside from the modifications of the code of the WRF model, the complete module currently consists of two new modules: phys/module_diag_cordex.F, the main module that manages the calls to the variables and performs the necessary accumulations for the calculations of statistical values (e.g., mean, maximum, minimum); and phys/module_diagvar_cordex.F, the module that contains the calculations of all the CORDEX variables separated into individual and independent 1-D Fortran subroutines.
A list of detailed information on the modifications introduced is given below.
1.The main call of the CORDEX module (module_diag_cordex.F) has been added to phys/module_diagnostics_driver.F, which accounts for the management of diagnostics, and it has been modified in order to introduce the new pressure-interpolated variables.
2. An input line to the registry.cordexhas been added into the general Registry/Registry.EM_COMMON.

The complementary pressure-interpolated vari-
ables have been introduced in the related registry Registry/registry.diags.
4. The complementary interpolated variables have been added in the module that performs the pressure interpolation (phys/module_diag_pld.F).
5. The initialization of the modified pressure interpolation has been added in dyn_em/start_em.F.

6.
Modifications have been introduced in the main/depend.commonand phys/Makefile files to get the module compiled.
7. Specific changes for the inclusion of the water budget variables have been introduced in the dyn_em/solve_em.F module in order to get the advection terms of all water species.
8.An ASCII file called README.cordex with the description and synthesized instructions for compilation and use is provided as well.
The model output is grouped into a single file (WRF's auxiliary history output or stream no.9) with a proposed file name (auxhist9_outname namelist parameter in the &history section): wrfcordex_d<domain>_<date>, regulated with the standard WRF namelist parameters of output frequency (auxhist9_interval), number of time steps per file (frames_per_auxhist9), and format (io_form_auxhist9).Additional CORDEX variables required at pressure levels have been included in the WRF auxiliary output file number 23.These introduced CORDEX variables follow the file setup via the currently existing namelist section called diags&.

Module use
Before the execution of WRF some preprocessing steps are necessary by the user that encompass the compilation of the code and its specific setup to be used during the execution time of the model.These are described in the following subsections.

Compilation
Pre-compilation flags need to be defined by the user, depending on his or her requirements.It is necessary to keep in mind that this is done due to efficiency constrains (see below in Sect.6), although it is not a common procedure in the standard use of WRF.Usually WRF has almost all options available from a single compilation, switching options via the namelist.
Using the pre-compilation flag CORDEXDIAG, the CORDEX Core variables will be produced.The Tier 1 and additional groups of variables can be selected via the additional pre-compilation flag CDXWRF (CDXWRF=1 for Tier 1 and CDXWRF=2 for Tier 2 and the additionals).The reader is referred to Appendix B for more details about the groups of CORDEX variables associated with each option.The registry file (registry.cordex)has to be manually modified accordingly to the selected pre-compilation flag (uncomment the associated lines).
In order to adapt this derived type to the preselected compilation, it is also necessary to modify the module's specific register file (register.cordex)according to the chosen value given to the additional pre-compilation CDXWRF flag (if used).This is done in a way to control the size of a gridderived type, which has a positive impact on the model performance (see below).For a complete and detailed description of these steps, the reader is referred to the Wiki page of the module at http://wiki.cima.fcen.uba.ar/mediawiki/index. php/CDXWRF.
According to the value given to the pre-compilation CDXWRF flag, a different amount of variables is written out to the wrfcdx output file (see more detail in Appendix B).
-Using CORDEXDIAG and without CDXWRF, all the CORDEX Core variables will be calculated.
- Moreover, the code also accounts for instantaneous CORDEX variables provided as statistics (e.g., capemean, tdsmax, or all the water budget variables).In order to get them, the user must follow certain modifications of the code (and recompilation) in phys/module_diag_cordex.F, phys/module_diagnostics_driver.F, and in the registry file registry.cordex.

Usage
Modifications of the module include two main sets of variables: (1) new variables and diagnostics and (2) additional variables interpolated at pressure levels.These two sets of variables are provided in two separated files.A new auxiliary output file in the ninth stream provides all the new variables and diagnostics required by CORDEX.Additional pressureinterpolated variables are included in the 23rd stream.Each of these files has to be set up in the namelist in the same way as with the standard WRF output files.
A new section labeled cordex has to be added into the WRF's namelist, which allows users to choose or set up different options of the module.The description of all the available options is provided in Table 1.In this section the user is required to choose the implementation of the diagnostics to use, provide values to some parameters for certain diagnostics, and activate or deactivate some of the most computationally costly diagnostics.Default values for all the options are provided in order to facilitate the use of the module.
This module has been tested under different highperformance computing (HPC) environments and compilations.It has been compiled with two compilers: GFortran and IFORT.Different parallelization paradigms include serial, distributed memory, and hybrid (distributed and shared), as well as the parallelized version of the NetCDF libraries.
The tests have been performed mainly using two-nested domains with the second one being at convection-permitting resolution (no cumulus scheme activated).Under all these circumstances the module worked as expected.
Since the current version (v1.3) of the module, a text message with the version of the module has been printed in the standard output at the first time step of the model run in order to facilitate the detection of the module version that is being used.

CORDEX variables
CORDEX requires a series of mandatory variables grouped into the Core level and additional variables grouped into Tier 1 and Tier 2 levels.Furthermore, CORDEX also requires statistical values of specific variables, besides the instanta- wind gust following heavy precipitation method 1 output_wb = 0 deactivation of the computation of water budget variables (Fita and Flaounas, 2018) 0 output_wb = 1 activation of the computation of water budget variables 0 wsz100_diag = 1 wind extrapolation at z100m_wind using power-law method 1 b wsz100_diag = 2 wind extrapolation at z100m_wind using logarithmic-law method 1 wsz100_diag = 3 wind extrapolation at z100m_wind using Monin-Obukhov method 1 z100m_wind = 100.height (m) to extrapolate winds for wsz100_diag 100 zmlagen_dqv = 0.1 percentage of variation in mixing ratio to determine mixed-layer depth used in zmlagen computation (Nielsen-Gammon et al., 2008) 0.1 b zmlagen_dtheta = 1.5 increment in K of potential temperature from its minimum within the MLD used in zmlagen computation 1.5 b potevap_diag = 1 potential evapotranspiration using bulk computation (Manabe, 1969) 2 b potevap_diag = 2 potential evapotranspiration using Milly92 correction (Milly, 1992) 2 convxtrm_diag = 0 deactivation of diagnostic of extremes from convection indices 0 convxtrm_diag = 1 activation of diagnostic of extremes from convection indices 0 fogvisibility_diag = 1 diagnostic of visibility inside fog (Kunkel, 1984) 3 b fogvisibility_diag = 2 Rapid update cycle (RUC) method (Smirnova et al., 2000) 3 fogvisibility_diag = 3 FRAML 50 % prob (Gultepe and Milbrandt, 2010) 3 neous values.To meet the CORDEX specifications, regional climate models have to provide three kind of variables.
-Instantaneous: values obtained at each model integration time step.An instantaneous value represents the field at the given instant of time over the given space encompassed within the grid point.
-Statistics: values obtained as statistics of consecutive instantaneous values along a given period of time.The statistical computation could be minimum, maximum, mean, or accumulated values, as well as the flux.Thus, one statistical variable represents the temporal statistics of the field for a given period of time over the given space encompassed within the grid point.CORDEX guidelines also require different temporal aggregations: 3-hourly, daily, monthly, and seasonal.
-Fixed: values that do not have an evolution in time.These fields are fixed over the simulation.
The WRF I/O file managing system provides an infrastructure for more than 20 different output files (called streams) at the same time.Each file is independently managed, and therefore in the namelist a user has to set up mainly two different options for each output stream: the frequency of an output (frequency of writing out the variables to an output file in minutes, e.g., 30, 60) and the number of frames per file (e.g., for 3-hourly frequency eight frames per file will give a daily output).Variables can be written in multiple streams (selected via the registry + files).During the model integration, at the given time step corresponding to the defined output frequency, data will be written out to the output file.When a given file reaches the selected amount of frames, it is closed and a new one is opened.The file name usually follows the criteria of a given header name (e.g., wrfcdx for this module) and the current date of the simulation, which is also set up in the namelist.
The CORDEX-WRF module is designed to provide the variables using the ninth stream, without reducing any of the capabilities of the model.Following this criterion, the module uses the same structure and components of the model designed to manage its I/O.This means that the statistical values are directly provided using the internal values between output frequencies.This ensures that, for example, a minimum value would be exactly the minimum value that the model simulated between output times.These variables are re-initialized after each stream output time (see Fig. 1).The WRF model is used in a myriad of applications and regions, and thus it was decided that statistical values will be provided at the selected frequency of the ninth stream.This gives more flexibility, allowing a user to get, e.g., highfrequency outputs.However, this will require the user to perform a post-processed aggregation of the output files in order to provide the required CORDEX statistics at the 3-hourly, daily, monthly, and/or seasonal periods.Users are strongly encouraged to use the output frequencies for the ninth stream, which are easy to combine in order to retrieve the required CORDEX statistics.It is necessary to highlight the fact that the statistics for a given period contained in the ninth stream correspond just to the instant in time of writing the field into the file (e.g., at a 3 h output frequency the value inside the file at [HH]+3:00:00 represents the statistics from [HH]:00:00 to [HH]+2:59:59).The WRF I/O does not allow us to produce static or fixed fields, and therefore this group of variables is not provided by the module.

Generic methodology
The list of variables requested by the CORDEX experiment (see the tables in Appendix B) is intended to be useful for the climate change mitigation, adaptation, and decision-making communities.Note that CORDEX-FPS might require other variables or require some of them at a different frequency of output in comparison to a standard CORDEX requested list of variables.Taking into account the performance of the model, variables are computed at specific frequencies: (1) at all time steps when a statistical value (accumulation and/or flux, minimum, maximum, and/or mean) of the variable is required, (2) at the given time step when a variable that is used for the diagnostic is updated following the configuration from the namelist (e.g., cloud-derived variables and the frequency of activation of the radiation scheme), and (3) instantaneous values that are computed only at the time step when the output is written out (see Fig. 1 for more details).
The list of added variables to the existing direct level pressure-interpolated output is provided in Table 2 and gathered in the 23rd auxiliary WRF output file with the standard name wrfpres_d<domain>_<date> (in WRF's namelist notation).At the same time, in order to avoid overloading the execution of WRF, the section of the code with  the pressure interpolation has also been modified.Now the interpolation is only computed at the time step coincident with the output frequency (also selected in the namelist along with the characteristics of the pressure interpolation).
The different statistical values are initialized (as shown in Fig. 1) at the first time step after output time.More details on how certain diagnostic variables have been integrated and implemented in WRF are provided in the following sections.Furthermore, a series of plots accompanying different definitions of the diagnostic variables are presented as well.The intention of these figures is to illustrate the consistency of the implemented diagnostics.These preliminary outcomes are not for validation purposes, but rather to show that the diagnostic variables have been correctly introduced.A complete analysis in order to find the most accurate methodology for the calculation of a certain diagnostic would require devoted climate simulations and enough observations to validate them.Such a validation is out of the scope of this study.We do select "more appropriate" options based on the experience within the scientific community or according to the CORDEX specifications.These options are set as the "default" options within the namelist.
One should be aware that certain diagnostics use variables for their calculation that might only be available when specific physical schemes are selected.When this happens, zero values are returned.This undesired outcome is, when possible, fixed by using a generic definition of the diagnostics.

Core variables
These are the basic variables required by CORDEX.Most of them are standard fields and therefore tend to require simple calculations from the currently available variables from the WRF model.These variables are obtained by setting the precompilation flag CORDEXDIAG and will appear in two different files: 3-D variables at pressure levels (the WRF model internally interpolates them since it uses the η coordinate in the vertical) that will appear in the output file with the 23rd stream (mainly called wrfpress) and the 2-D variables in the module's output file wrfcdx.

3-D at pressure levels
These are the additional variables that have been added into the WRF pressure-level integration module.Their values will be written in the 23rd output stream in addition to the ones currently available.All of them are instantaneous values.

hus: humidity
3-D atmospheric specific humidity (hus)1 and relative humidity (hur) are computed at the un-staggered model (half-)η levels.Specific humidity is simply obtained from the water vapor mixing ratio using Eq.(2) (named QVAPOR in WRF).Relative humidity can be obtained following the Clausius-Clapeyron formula and its approximation from the wellknown August-Roche-Magnus formula for saturated water vapor pressure e s (Eqs. 3,4,and 5).

press: air pressure
The WRF model integrates the perturbation of the pressure field from a reference one.Thus, to obtain the full pressure at un-staggered model η levels, the user is required to combine two different fields as shown in Eq. ( 6): where PB is WRF base pressure (Pa), and P is WRF perturbation pressure (Pa).

ta: air temperature
This variable stands for the 3-D atmospheric temperature on un-staggered model η levels.WRF model equations are based on the perturbation of potential temperature, and therefore a conversion to actual temperature is required, which is performed as indicated by Eq. ( 7): ta = (T + 300) where T is WRF 3-D temperature output (as potential temperature perturbation from the base value, which in WRF equals 300 K), and p0 is the pressure reference (100 000 Pa).
ua/va: Earth-rotated wind components These variables stand for the 3-D atmospheric wind components following Earth coordinates on un-staggered model η levels.WRF model equations use the Arakawa C horizontally staggered grid with wind components following the grid direction.In order to get actual winds following Earth geographical coordinates, a transformation shown in Eq. ( 8) is required: where ), cos a is the local cosine of map rotation (1), and sin a is the local sine of map rotation (1).

zg: geopotential height
As in the case of air pressure, the WRF model also integrates the perturbation of the geopotential field from a reference or base one.Thus, to obtain the full geopotential height on staggered model η levels, the user is required to combine the two WRF fields, and it is also de-staggered as shown in Eq. ( 9): where PHB is WRF base geopotential height ( and zg is un-staggered geopotential height k = [1, dimz].

Two-dimensional
Here we provide a list of the added two-dimensional CORDEX variables.Some of them are diagnosed as a combination of three-dimensional variables, some are required as instantaneous values, and others as statistics.The fact that the module provides 2-D variables online using 3-D fields shows one more key advantage of the module related to disk space.Thanks to these online calculations when using the module, a user no longer needs to store large amounts of 3-D data from the model in order to post-process them.This reduces the requirements of disk space by a factor of around 2.
pr, prc, prl, prsh, prsn: precipitation fluxes The total precipitation flux (pr) is computed as the sum of all types of precipitation fields in the model accumulated along the ninth stream output frequency (9freq) divided by this period of time (9freq), as shown in Eq. ( 10): where RAINCV is instantaneous precipitation from the cumulus scheme (kg m −2 ), RAINNCV is instantaneous precipitation from the microphysics scheme (kg m −2 ), RAINSHV is instantaneous precipitation from the shallow-cumulus scheme (kg m −2 ), Nsteps is the number of time steps, and δt is the time-step length (s) to achieve the ninth stream frequency output time (9freq = Nsteps × δt).
In this version, the computation of the accumulated values does not take into account configurations of the model with an adaptive time step.When an adaptive time step is used, we strongly discourage the use of these variables.
Each individual precipitation flux is also provided as follows.
Solid precipitation flux (prsn) only accounts for frozen precipitation.Depending on the selected microphysics scheme chosen in the namelist, this variable might account for the precipitation of snow, graupel, and hail.It is computed as shown in Eq. ( 12): where prins is instantaneous total precipitation (kg m −2 , previously obtained), and SR is the fraction of solid precipitation (%, variable provided by WRF).

Radiative flux
Surface upwelling shortwave radiation flux (rsus, W g m −2 ) and surface upwelling longwave radiation flux (rlus, W g m −2 ) are understood as the shortwave and longwave radiation from Earth's surface.They are directly provided by the radiation schemes CAM (Community Atmosphere Model) and RRTMG (Rapid Radiative Transfer Model) (sw_ra_scheme = 3,4) as instantaneous variables swupb and slupb.When there is no use of such schemes, it is recommended to use the generic definition instead (rsusgen, rlusgen; see next section).Statistical retrieval for the surface fluxes follows the same methodology as for the precipitation fluxes.
Outgoing radiative fluxes at the top of the atmosphere are also provided as "rsut" for mean top-of-atmosphere (TOA) outgoing shortwave radiation (in W g m −2 ) and "rlut" for longwave.However, there is no generic implementation of these variables.

sund: duration of sunshine
This variable accounts for the sum of the time for which the direct solar irradiance (downwelling shortwave radiation, rsds) exceeds 120 W m −2 (WMO, 2010a); it is implemented following Eq.( 13) and provided in seconds.In order to provide an example of the correct implementation of this diagnostic, preliminary results are shown in Fig. 2. The figure shows the "sund" values and compares them with the incoming solar radiation.It is shown how the "sund" values vary where SWDOWN is downward shortwave radiation (W m −2 ), and δt is time-step length (s).

tauuv: surface downward wind stress
Instantaneous surface downward wind stress at 10 m accounts for the force that winds exert on the Earth's surface.
It is implemented following Eq.( 14): where C D is the drag coefficient (1), uas is Earth-rotated eastward 10 m surface wind (m s −1 ), and vas is Earth-rotated northward 10 m surface wind (m s −1 ).The drag coefficient is nonzero only for certain options of the surface-layer physics (sf_sfclay_physics parameter in the namelist): 1 (MM5-similarity) or 5 (MYNN surface layer).A generic formulation has been introduced when these schemes are not used.

psl: sea level pressure
This variable accounts for the instantaneous pressure extrapolated to the sea level.It represents the value of the pres-sure without the presence of orography.In order to provide a framework ready to implement different methodologies, three different methods have already been implemented.The choice of method can be controlled by a new namelist.inputparameter labeled psl_diag in the cordex section.The implemented methods are the following: the hydrostatic Shuell method (Stackpole and Cooley, 1970) already implemented in the module phys/module_diag_afwa.F, assuming a constant lapse rate of −6.5 K km −1 , selected when in the WRF CORDEX namelist section setting the parameter [psl_diag = 1]; the ptarget method (Benjamin and Miller, 1990) that uses smoothed surface pressure and a target upperlevel pressure, already implemented in the WRF postprocessing tool called p_interp.F90 [psl_diag = 2]; and the ECMWF method (Yesad, 2015) taken from the Laboratoire de Météorologie Dynamique GCM (LMDZ; Hourdin et al., 2006) from the module pppmer.F90, following the methodology by Mats Hamrud and Philippe Courtier from ECMWF [psl_diag = 3].
According to the CORDEX specifications, the default method is the ECMWF method.When choosing the ptarget method (psl_diag = 2), the degree of smoothing of the surface using the surrounding nine-point average can also be chosen by selecting a number of smoothing passes (psmooth, default 5), and the upper pressure that has to be used as the target (ptarget, default 700 hPa).
Figure 3 shows the different outcomes when applying each method.There are some problems with the ptarget method in both psl estimates (mountain ranges can still be inferred) and borders for each parallel process (lines in figures showing differences among methods) when the spatial smoothing is applied.Lines showing the limits of the parallel processes appear because one cannot obtain the proper values from outside the correspondent tile of the domain associated with each individual parallel process.

Cloud-derived variables
Four cloud-derived variables are required by CORDEX: the total cloudiness (clt) and the cloudiness for each grid point at three different vertical layers aboveground (low: p ≥ 680 hPa, labeled cll; medium: 680 < p ≥ 400 hPa, clm; high: p < 400 hPa, clh).These cloud diagnostics are provided as mean values.
The module computes these variables taking the cloud fraction of a given grid cell and level as input.The cloud fraction in WRF is computed by the radiative scheme, and it is called at a frequency given by the radt parameter in the WRF namelist.Due to the large computational cost of the radiative scheme, radt is usually larger than the time step of the simulation.This determines when cloud fraction is also actualized to meet the evolved atmospheric conditions.Cloud fractions can be computed in the model using different methodologies.It would be possible to make available these methodologies as another choice in the namelist section and then compute the cloud fraction at each time step.However, in order to be consistent with the radiative cloud effects that the simulation is experiencing, this method was discarded.Thus, the cloud values provided by the module follow the same frequency of refreshing rate as the one set for radiation in the namelist level (radt value).
The most common implementation of clt found in other models (in particular most GCMs) assumes "random overlapping".Random overlapping assumes that adjacent cloud layers are from the same system and are hence overlapped at a maximum (Geleyn and Hollingsworth, 1979).In the module, the methodology from the GCM LMDZ was implemented.In this GCM, calculation of the total cloudiness and different layers' cloudiness is done inside the subroutine newmicro.f90.The method basically consists of a product of the consecutive nonzero values of cloud fraction as shown in Eq. ( 15): , where CLDFRAC is the cloud fraction (1) at each vertical level, zclear is the clear-sky value (1), zcloud is the cloudsky value (1), and ZEPSEC is a value for a very tiny number.The same methodology as in Eq. ( 15) is applied for the diagnostic of clh, clm, and cll but splitting by corresponding pressure layers.Figure 4 illustrates the result of the implementation and compares the results with the actual values of the cloud fraction (panels a and b), as well as the different cloud distributions (panels c to f).

Wind-derived variables
CORDEX requires two wind-derived diagnostics: the daily maximum near-surface wind speed of gusts (wsgsmax) and the daily maximum wind speed of gusts at 100 m aboveground (wsgsmax100).These variables cannot be retrieved by post-processing the standard output since they require the combination of different variables (some of them are not available from the model output) and have to be produced as a maximum value.The module provides different ways to compute them under certain limitations.
wsgsmax: maximum near-surface wind speed of gusts.The wind gust accounts for the wind from upper levels that is projected to the surface due to instability within the planetary boundary layer.In the current version of the module two complementary methods of diagnosing the variable have been implemented (resultant winds are Earth-rotated).The choice between the two methods is done by the parameter labeled wsgs_diag (in cordex section), with the default value set to 1.The implemented methods are the following (in italics).
-The Brasseur method [wsgs_diag = 1].This is an implementation of wind gust considering turbulent kinetic energy (TKE) estimates and stability defined by virtual temperature (θ v ) as indicated in Eq. ( 16) following Brasseur (2001).Implementation is adapted from a version already introduced in the CLimate WRF (clWRF; http://www.meteo.unican.es/wiki/cordexwrf/SoftwareTools/ClWrf; Fita et al., 2010): where TKE is turbulent kinetic energy (m 2 s −2 ), and θ v is virtual potential temperature (K).z p is the height of the considered parcel (m, maximum height that satisfies Eq. 16), and θ v (z) is the variation in θ v over a given layer (K m −1 ).
-The AFWA method [wsgs_diag = 2].This is an implementation adopted from the WRF module module_diag_afwa.F that calculates the wind gust   that only occurs as a blending of upper-level winds, zagl (around 1 km aboveground; zagl(k 1000 ) ≥ 1000 m; see Eq. 17), when precipitation intensity per hour is above a given maximum value (prate mm h ≥ 50 mm h −1 ): va 1 km = va(k 1000 − 1) where va is air wind (m s −1 ), zagl is height aboveground (m), and k 1000 is the vertical level at which zagl is equal to or above 1000 m; prate mm h is the hourly precipitation rate (mm h −1 ), and va blend is blended wind (m s −1 ).
The two methods provide wind-gust estimation (WGE) from two different perspectives: mechanic and convective.In order to take into account both perspectives, additional variables are used: totwsgsmax (total maximum wind-gust speed at the surface), totugsmax (total maximum wind-gust eastward speed at the surface), totvgsmax (total maximum windgust northward speed at the surface), and totwsgspercen (percentage of time steps along 9freq in which a grid point got a wind gust in %). Figure 5 shows the outcomes when applying each method.It is shown how wind gust is mainly originated by turbulence, with a minor impact of heavy precipitation events at the depicted time.Furthermore, in the bottom panels it is shown how wind gusts are highly frequent above the sea in comparison to the low frequency above continental flat areas (the Andes mountain range exhibits a high occurrence of wind gust).
wsgsmax100: daily maximum wind speed of gusts at 100 m.The calculation of wind gusts at 100 m should follow a similar implementation as used for calculating the wsgsmax, but at 100 m.An extrapolation of such turbulent phenomena would require a completely new set of equations that have not been established yet.However, it could be considered as a first approach to take the same wind gust as the one at the surface (when it is deflected from above 100 m).
The assumption would be that the wind gust at 100 m corresponds to the deflected wind on its "way" to the surface.Instead, as a way to complement this, the estimation of maximum wind speed at 100 m is provided.Provided wind components are also Earth-rotated.Three different methods have been implemented: two following assumed vertical wind profiles (after the PhD thesis of Jourdier, 2015) and a third one following Monin-Obukhov theory to estimate the wind components aboveground.These three methods are chosen by the namelist parameter labeled wsz100_diag.Its default value is 1.The implemented methods are as follows.
following Monin-Obukhov theory.The user should keep in mind that this method is not useful for heights larger than a few decimeters (z > 80 m).The wind at a given height is extrapolated following turbulent mechanisms.As shown in Eq. ( 20), the surface wind speed is used as a surrogate to estimate the 100 m wind direction (θ 10 = tan −1 (uas, vas), without considering Eckman pumping or other effects on wind direction).In this implementation u * in similarity theory is taken because the WRF estimates UST and Monin-Obukhov length (L O ) as the WRF values for RMOL and roughness length (z 0 ), with the WRF thermal time-varying roughness length ZNT: θ 10 = atan V 10 U 10 va 100 = ua 100 = wss 100 cos(θ 10 ) va 100 = wss 100 sin(θ 10 ) , where wss 100 is wind speed at 100 m (m s −1 ), M is the stability function after Businger et al. (1971), UST is u * in similarity theory (m s −1 ), z 0 is the roughness length (m), U 10 and V 10 represent 10 m wind speed (m s −1 ), L. Fita et al.: WRF module for CORDEX output θ 10 is the 10 m wind speed direction (rad), ua 100 is the 100 m eastward wind speed (m s −1 ), and va 100 is the 100 m northward wind speed (m s −1 ).Note the absence of correction in wind direction due to Ekman pumping or other turbulence effects.
The user can also select the height at which the estimation is computed throughout the namelist parameter z100m_wind (100 m as the default value).Figure 6 shows different preliminary results using the three different approximations.It is illustrated (panel a) how wind gusts are larger than the 10 m diagnostic winds, and the difference is also larger when using the Monin-Obukhov method compared to the two other methods.Certain problems (too-small Monin-Obukhov length) are recognized when applying Monin-Obukhov for extrapolating wind at 100 m, which is shown in panel (b), where wind gusts appear to be as strong as 80 m s −1 .Therefore, users are advised to adopt this method with care.

Vertically integrated variables
The instantaneous vertically integrated amount of water vapor (prw), liquid condensed water species (clwvi), ice species (clivi), graupel (clgvi), and hail (clhvi) are the vertically integrated amounts of each species along the vertical column (density weighted) over each grid point.They are provided using the same implementation as those in the p_interp.F WRF tool for vertical interpolation.The general equation following WRF standard variables is where clvivar is the column vertically integrated variable's CF-compliant name (prw, clwvi, clivi, clgvi, or clhvi), MU is the perturbation dry air mass in the column (Pa), MUB is the base-state dry air mass in the column (Pa), g is gravity (m s −2 ), e_vert represents the total number of vertical levels, WRFVAR is the water species mixing ratio at each sigma level (kg kg −1 ), and DNW is the difference between two consecutive full η levels (-).Table 3 indicates the WRFVAR names associated with the clvivar names.Note that clgvi and clhvi are part of the Tier 1 level and are only accessible if the pre-compilation variable CDXWRF is set to 1. See Sect.2.3 for more detail.In order to provide an example of the correct computation of the diagnostics, results at a given grid point are shown in Fig. 7.It is shown that the total precipitable water (prw) correctly corresponds to the density-weighted vertical integration of the water content along the column of air.

evspsblpot: potential evapotranspiration
This variable represents the evaporative demand of the atmosphere.It is computed following the standard method already implemented in most GCMs.One of the first proposed methods was provided by Manabe (1969).Some corrections have been proposed to the initial methodology in order to overcome its deficiencies (see, e.g., Barella-Ortiz et al., 2013, for an intercomparison among different methods).It is provided as an averaged flux.Calculation of the potential evapotranspiration can be activated with the namelist input parameter potevap_diag (number 2 is the default option).
It basically consists of a difference between a supposed saturated air at the surface temperature and the humidity of the atmosphere as depicted in Eq. ( 22): where ws(ts) is saturated air at ts (kg kg −1 ), qc is the surface drag coefficient (m s −1 ), TSK is surface temperature (K), ws(ts) is saturated air by surface temperature (kg kg −1 ) based on the August-Roche-Magnus approximation, U 10, V 10 represent the 10 m wind components (m s −1 ), QVAPOR is the 3-D water vapor mixing ratio (kg kg −1 ), and C D is the drag coefficient (-, only available from MM5 similarity and MYNN surface-layer schemes; otherwise, it is zero).
-The Milly92 method [potevap_diag = 2] makes a correction of the bulk diagnostic by introducing a Milly correction parameter ξ , which accounts for other atmosphere-related phenomena (Milly, 1992).It is explained in Eq. ( 23), and its implementation is similar to the one present in the ORCHIDEE model (Organising Carbon and Hydrology in Dynamic Ecosystems; http: //orchidee.ipsl.fr/,last access: 18 March 2019; de Rosnay et al., 2002).The implementation is retrieved from the module src_sechiba/enerbil.f90: where β is the moisture availability function, sfcevap is QFX surface evaporation (kg m −2 s −1 ) from QFX, which is the surface moisture flux (kg m −2 s −1 ), L is the latent heat of vaporization, EMISS is emissivity (1), CtBoltzmann is the Stefan-Boltzmann constant, Cp is the specific heat of air, and ∂ T ws(T ) is the derivative of saturated air at the temperature of the first atmospheric   layer (kg kg −1 ) using numerical first-order approximation.
See Fig. 8 for an example of the differences between the two implementations.It shows how important the correction introduced by Milly is and its strong effect on the diurnal cycle.Basically, the correction permits potential evapotranspiration during nighttime and reduces its strength at noon (18:00 UTC corresponds to 12:00 local time).There is a generic diagnostic to overcome boundary layer scheme dependency in the calculation of the drag coefficient (see the sections below on generic variables).

Generic variables
Some of the diagnostics required by CORDEX depend on the approximations, equations, methodologies, and observations used to compute them.This makes model intercomparison very difficult, since values might differ from one implementation to another.In order to overcome this problem, a series of variables is also provided in a generic form (when possible), meaning that they are obtained directly from standard variables.Thus, these generic forms of the diagnostics become "independent" of the model's implementation.

cdgen: generic surface drag coefficient
Computation of the instantaneous drag coefficient at the surface depends on the selected surface scheme.In order to avoid this scheme dependency, a generic calculation of the coefficient has been introduced as in Eq. ( 24) following Garratt (1992): with UST u * being friction velocity from the similarity theory (m s −1 ), and wss = U 10 2 + V 10 2 the 10 m wind speed (m s −1 ).
tauuvgen: generic surface downward wind stress Generic surface downward wind stress at 10 m is calculated following Eq.( 25), which uses the generic diagnostics of the drag coefficient: where C gen D is the generic drag coefficient (-, see Eq. 24), uas is the Earth-rotated eastward 10 m wind component, and vas is the Earth-rotated northward 10 m wind component.

rsusgen: surface upwelling shortwave radiation
Surface upwelling shortwave radiation is the shortwave radiation directed from the surface.It is calculated in a generic way as the reflected shortwave radiation depending on the surface albedo as shown in Eq. ( 26): where ALBEDO is albedo (1), and SWDOWN is downward shortwave radiation at the surface (W m −2 ).

rlusgen: surface upwelling longwave radiation
Surface upwelling longwave radiation is the longwave radiation coming from the surface.It is calculated in a generic way as the longwave radiation from a black body due to surface temperature following the Stefan-Boltzmann law as given in Eq. ( 27): where CtBoltzmann is the Stefan-Boltzmann constant (5.67051 × 10 −8 W m −2 K −4 ), EMISS is surface emissivity (1), and TSK is skin temperature (K).

evspsblpotgen: generic potential evapotranspiration
This variable corresponds to the generic definition of potential evapotranspiration (evspsblpot).The same two methodologies as in the regular diagnostic have been implemented.The only difference is that, in this case, the generic estimation of the drag coefficient "cdgen" is used (see Eq. 24) instead of the one given by the model.

Tier 1 variables
These diagnostics are required by CORDEX, but they are not mandatory.They have also been included as a way to fulfill all the CORDEX requirements.These variables require the setting of the pre-compilation flag CDXWRF to 1 and performing some complementary modifications in the module's registry file registry.cordex.See Sect.2.3 for more details.Instantaneous planetary boundary layer (PBL) height is a clear example of model dependence and even scheme dependence of how a diagnostic is computed.Each PBL scheme has its own assumptions, and "zmla" is computed in a scheme-dependent specific way.
In order to overcome the model and/or scheme dependence, we implemented a generic formulation for calculating the PBL height as was done in García-Díez et al. (2013) after Nielsen-Gammon et al. (2008).The method consists of defining the height of the PBL as the first level in the mixed layer (ML) where potential temperature exceeds the minimum potential ML temperature by more than 1.5 K.It has been implemented using the definitions given below.
1. Mixed-layer depth (MLD) is defined as the model level (k MLD ) starting from the second model level at which the variation of the mixing ratio (QVAPOR(k), normalized with its value at the first level) exceeds some predefined threshold value (QVAPOR(1)): > δQVAPOR (here applied as δQVAPOR = 0.1).

Within the MLD the value with the minimum potential temperature is taken as θmin
3. The level of the PBL height (k zmla ) is the level at which the maximum variation of potential temperature within the MLD exceeds some predefined threshold value: θ (k zmla ) − θmin MLD > δθ , (here δθ = 1.5 K).
4. The PBL height (zmla) is obtained using the geopotential height zg at the calculated k zmla level above the ground (zagl): zmla = zagl(k zmla ) = zg(k zmla )/g − HGT, with HGT being the surface elevation height above sea level.
No general rule has been applied to provide the correct value of δqv used to determine MLD.It can be determined by the namelist parameters zmlagen_dqv for δqv (default value 0.1) and zmlagen_dtheta for δθ (default value 1.5 K).Comparison of this implementation with the zmla directly provided by WRF's Mellor-Yamada-Nakanishi-Niino Level 2.5 PBL scheme (MYNN2.5 Nakanishi and Niino, 2006) is shown in Fig. 9.In general the generic estimation produces a higher PBL (panel a) with lower values during night (panel b).Spatial distributions between the two diagnostics are similar.

Convective diagnostics
The diagnostics related to convective activity are convective available potential energy (CAPE), which accounts for all the energy that might be released convectively, convective inhibition (CIN), which accounts for processes that inhibit the convection, the height of the level of free convection (ZLFC), the pressure at the level of free convection (PLFC), and the lifted index (LI), which accounts for the temperature difference between the environmental temperature at some higher level in the troposphere and the temperature that a parcel would have if adiabatically lifted at that level.CORDEX requires these values as statistics between output times (9freq in this case).
Since v3.6 of WRF, these variables can already be calculated with the module module_diag_afwa.F via the Buoyancy function.In this version of the module, this is the only available implementation.These vertically integrated diagnostics have a high computational cost.In order to minimize it, they are only computed at the output time step by default.However, if a user requires them as statistics (such as capemin, capemax, capemean), then these diagnostics are computed at all time steps.This behavior of the module is regulated by the namelist parameter convxtrm_diag (the default value is 0, meaning no computation) and by setting the pre-compilation flag CDXWRF to 1 and performing some complementary modifications in the module's registry file registry.cordex.See Sect.2.3 for more detail.

Additional variables
Some variables not required by CORDEX but that may be interesting and useful to the community for a wide variety of purposes have also been added.These variables will be obtained if the pre-compilation flag CDXWRF is set to 2 and some additional modifications are made in the module's registry file registry.cordex.See Sect.2.3 for more details.

tds: dew point temperature
The dew point temperature (cooler temperature at which air would saturate due to its current moisture content) is calculated following the August-Roche-Magnus approximation as shown in Eq. ( 28): where T 2 is 2 m temperature (K), hurs is 2 m relative humidity (%, previously computed), b = 17.625, and c = 243.04.This variable is provided as a statistic for the minimum, maximum, and mean in the output.

Atmospheric water budget
The water budget accounts for all the dynamics of the water in the atmosphere.This budget is divided into different terms (dynamical and source and/or sink) accounting for the total mass of water.It can be computed independently for each water species.The equation for any given water species is as given in Eq. ( 29): TEN q = HOR q + VER q + MP q (29) ∂q q ∂t = −V h ∇q q − w ∂q q ∂z + SO q − SI q , where q stands for one of the six water species (vapor, snow, ice, rain, liquid, graupel, or hail) concentrations (kg kg −1 ), V h stands for horizontal wind speed (m s −1 ), w stands for the vertical wind speed (m s −1 ), and MP for the loss or gain of water due to cloud microphysical processes.The term on the left-hand side of the equation represents the water species tendency (TEN or "PW"), referring to the difference between q at the model's previous time step and at the actual time step, divided by the time step.TEN equals the horizontal advection (HOR or F , first term on the right-hand side of the equation), the vertical advection (VER or Z, second term on the right-hand side), and the source (SO) or sink (SI) of atmospheric water due to MP.All terms are expressed in kg kg −1 s −1 .However, SO and SI cannot be provided because they are microphysics dependent and this makes it difficult to provide a generic formula for them.
In order to obtain the total column mass of water due to each term (in millimeters), an integration following Eq.( 30) is applied to each term of Eq. ( 29) (similarly as in Eq. 21).
Following the methodology of Huang et al. (2014) and Yang et al. (2011), Fita andFlaounas (2018) implemented a new module in WRF in order to allow for the computation of the water budget terms during model integration.This implementation is provided with the CORDEX module, but these variables are only provided as temporal accumulations (within 9freq) and vertical integrations in two forms: total column values and divided by the same layers as the cloud diagnostics (low, medium, high).The accumulation of diabatic heating from the microphysics scheme is provided as a proxy for the sink and/or source due to microphysics effects.
Preliminary results for all water species are shown in Figs. 10 and 11.Water vapor exhibits the largest values in both total tendency and horizontal advection.Dynamics of the other water species seems to be highly correlated with the presence of a storm system (lower right corner in the maps) or due to orthographic influences (the existence of the Andes mountain range can be inferred).
Figures 12 to 15 show temporal evolution and accumulated maps at a given time for all the water budget terms, decomposed for vapor (qv) and snow (qs).Accumulated maps  are grouped into vertical levels as is done with the clouds: p ≥ 68 000 Pa, 40 000 ≤ p < 68 000 Pa, p < 40 000 Pa.The largest amounts of the budget terms are mainly found in low (high) levels for water vapor (snow); temporal evolution at a given point shows the complexity of the water dynamics, with the terms compensating for each other.Figures 12 to 15 also show how the contribution to the total diabatic term is large at low levels over the ocean (showing the role of evaporation) and larger at high levels above the continent.

fogvis: visibility inside fog
Fog is one of the major causes of transportation disruption.
The horizontal resolutions of state-of-the-art CORDEX activities like FPS_Alps (3 km) open the possibility to explore phenomena such as fog, which was impossible to analyze in previous reginal climate experiments.In order to contribute to the analysis of fog phenomena, three different methods to calculate visibility have been introduced.Visibility is used to determine the presence of fog at a given moment.In order to provide a quantity with the density of the fog, only the visibility during a fog event is kept.The three methods are as follows (in italics).
-K84 [visibility_diag = 1].Visibility is computed using liquid water (QCLOUD) and ice (QICE) concentrations.Following Bergot et al. (2007), fog appears when there are liquid and/or ice water species at the lowest model level present.Visibility is computed using Eq. ( 31) as in Kunkel (1984): where QCLOUD is the liquid water (cloud) mixing ratio (kg kg −1 ), and QICE is the ice mixing ratio (kg kg −1 ).Visibility values are in kilometers.
-RUC [visibility_diag = 2].Visibility is computed using relative humidity (hur) as implemented in the RUC model (see Eq. 32 from Smirnova et al., 2000): www.geosci-model-dev.net/12/1029/2019/where hur is relative humidity (1, previously computed) and can be from the 2 m diagnostics or the first model layer.Visibility values are in kilometers.
-FRAM-L [visibility_diag = 3] (default).Visibility is computed using relative humidity (hur) after Gultepe and Milbrandt (2010, see their Eq. 33).In this work, a probabilistic approach is proposed to compute the visibility in three different bins: 95 %, 50 %, and 5 % probability to get certain visibility (for hurs > 30 %).As a matter of compromise in the module, the calculation with 50 % probability has been chosen as the preferred one.Therefore, this method provides the visibility that may occur with 50 % probability: The provided values of visibility during a fog event are the minimum, maximum, and mean values within output time steps (9freq) when fog occurred.Different choices are controlled through namelist variables: visibility_diag determines the method used to compute visibility, and fogvars determines the source of the relative humidity to be used as input in the visibility method.Users can choose between the relative humidity from the first model layer (hur) fogvars=1 (default value) or from the 2 m diagnostics (hurs) fogvars=2.Some preliminary results for an extreme fog episode in central Argentina are provided in Fig. 16.Results strongly differ among fog implementations.
The best agreement with a satellite-visible channel picture for a given time of the event is obtained when the default setting is used (FRAM-L method with hur values as input).
It is known that certain methods for calculating visibility rely on numerical adjustments to observational data taken under certain circumstances and at specific places (e.g., for FRAM-L, adjusted values come from observations from a Canadian airport).It would be desirable to provide a more generic "all places and purposes" approach (if possible).It is recommended to use this variable with care.

tfog: time of presence of fog
Fog can be diagnosed when the visibility is lower than 1 km (WMO, 2010b), and tfog accounts for the period during which the grid point has visibility lower than 1 km during 9freq (see Eq. 34): where vis is visibility (km) below 1 km and δt is the model time step (s).

Optimization
Regional climate dynamical downscaling experiments like the ones under the scope of CORDEX require long continuous simulations that consume larger amounts of HPC resources for a long period of time.Therefore, a series of tests were carried out in order to investigate the impact on the time of integration when the module is activated.The first version of the module (v1.0) was known to introduce about a 40 % decrease in time-step speed of integration (highly dependent on HPC, model configuration, and domain specifications).In order to improve model performance when the module is activated, the module was upgraded to version 1.1.Since this new version, a series of optimizations of the code and precompilation flags have been activated (CDXWRF).Following this implementation (instead of via regular WRF namelist options), two main goals were achieved: (1) the amount of variables kept in memory during model execution was reduced and (2) the number of conditions (mainly avoiding IF statements) to be checked and calculations at each execution step of integration was reduced as well.
A domain as shown in Fig. 17 has been set up to perform short runs (5 days) to check the changes in performance of the WRF model when the module is activated in its different possible configurations.In order to avoid nonhomogeneous communication among the cluster nodes (which would affect the analysis), all the simulations were executed on a single node and with the WRF model compiled with Intel and GNU Fortran compilers.Tests are performed at the HPC Fram (https://www.sigma2.no/Fram,last access: 18 March 2019) from the Norwegian academic HPC infrastructure.Fram is based on Lenovo NeXtScale nx360, constituted by the CPU types Intel E5-2683v4 2.1 GHz and Intel E7-4850v4 2.1 GHz.
The execution time is calculated as the mean elapsed time used during the 5-day model integration.The elapsed time necessary for each simulation step is available from the standard output of the model run (rsl.error.0000WRF ASCII file).In this file, WRF users can get the elapsed time for all the time steps of the model and domains of simulation.Different peaks of slower time steps (see in Fig. 18) coincide with input/output file operations, differences between day and night regimes, and when different physical schemes (mainly the radiative scheme) are activated at a given frequency (e.g., radt).For a simulation covering 5 days with a time step of 15 s, one obtains 28 800 time steps.The sample of 28 800 time steps is considered to be large enough to be representative for the mean time step of the whole simulation.Table 4 describes the different configurations and namelist options used in this performance test.The first simulation (labeled v381orig), which is used as a reference, is the simulation with the original version of the model (here version 3.8.1)without the module.The other simulations are activation of the module (setting CORDEXDIAG) without setting the pre-compilation parameter CDXWRF (labeled NOCDXWRF), with pre-compilation parameter CDXWRF=1 (CDXWRF1), with pre-compilation parameter CDXWRF=2, and computing all extra calculations (CDXWRF2).Finally, three more simulations (with CDXWRF=2) are made: without any extra calculation (CDXWRF2_00), without calculation of extra water budget terms (CDXWRF2_01), and without extremes from convection indices (CDXWRF2_10).Results might present some inconsistencies due to the fact that certain computations of diagnostics depend on the stability at each grid point, which might vary from run to run and depending on the workload of the HPC.
Results show that all the simulations (except CDXWRF1 with GCC) in which the module has been activated are slower than the simulation with the original version of the code (v381orig, < t>= 2.4248[IFORT], 3.5174[GCC] s).Simulation with version 1.3 of the module without pre-compilation flag CDXWRF (NOCDXWRF, < t>= 2.5058, 3.6486 s) is the second fastest.Simulation becomes slower when all the extra calculations are performed (CDXWRF2 < t>= 4.8296, 5.9958 s).The heaviest part of the module is related to the water budget computation (wb_diag=1), since when comparing to the simulation without extra calculations (CDXWRF2_00, < t>= 4.2038, 5.0736 s) there is an increase of only about 1, or 9 % (< t step > IFORT CDXWRF2_01 / < t step > IFORT CDXWRF2_00 , www.geosci-model-dev.net/12/1029/2019/Geosci.Model Dev., 12, 1029-1066, 2019 , of the mean time step when only statistics for extreme convective indices are activated (CDXWRF2_01, < t>= 4.2388, 5.4120 s) and 27, or 19 % (< t step > CDXWRF2_10 / < t step > CDXWRF2_00 | IFORT,GCC ), when only the water budget terms are included (CDXWRF2_10, < t>= 4.8510, 5.7534 s).Reduction of the time step for CDXWRF1 with GCC must be related to a moment during which HPC Fram experiences a period of very low working load.
These results are not conclusive (optimization of the module should be tested in other HPC resources, domains, and compilers), but they provide a first insight on how the number of variables included during integration (in the derived grid) has an important effect on model performance by reducing or increasing the required amount of memory.CDXWRF1 and CDXWRF2_00 perform the same amount of computations of diagnostics, but the mean time step in CDXWRF2_00 is almost doubled (because in the CDXWRF2_00 case all the additional variables are defined but not diagnosed).

Summary and outlook
A series of modifications have been introduced into the WRF model in order to make it more suitable for CORDEX experiments.These modifications include the creation of a new module for the model and the modification of certain sections of the code.With the module implementation, users have the option to retrieve directly from the WRF output all the Core and almost all the Tier 1 CORDEX variables.Moreover, a series of additional variables not required by CORDEX, which may be of high interest to the regional climate modeling community, have been included.Additionally, a series of "generic" variables are provided that are independent of the model setup, which should allow for more robust intercomparisons.
A WRF user participating in a CORDEX regional climate experiment will strongly benefit when activating the module presented here, since it avoids most of the complex and time-consuming post-processing efforts necessary to obtain diagnostics required by CORDEX.Furthermore, the module makes use of the WRF online interpolation to pressure levels of certain three-dimensional variables, which is an expensive computational task.In addition, statistical values (like minimum fluxes or accumulation) are computed using the actual values obtained during the model integration.Since the use of the module avoids the post-processing of the model output, there is no need to keep large amounts of additional fields, which drastically reduces the requirements of disk storage (roughly 50 % less disk storage).
In order to reduce the impact of the module during model integration, the setup of the module is managed before the compilation of the model via pre-compilation flags.This reduces the requirements of memory since the calculation of only required variables is activated.Different tests showed that the model performance is significantly improved when the code and the variables are constrained to the required ones (and not all of those available) and managed via precompilation flags (and via conditional statements from the model namelist as is usually done in the WRF model).
The module also establishes a first attempt to provide and implement generic definitions of certain diagnostics directly from regular model fields.This specific effort is intended to provide diagnostics that would not depend on the model configuration and/or the model itself, which will make intercomparisons between models more accurate, generic, and trustworthy.If there is also a certain coordination (e.g., within the CORDEX community) of the definition of these diagnostics, climate studies will benefit.Common definitions of diagnostics in a coordinated way across different modeling platforms will ensure the robustness of model intercomparisons.
There are certain variables from CORDEX Tier 1 protocol, which are not yet introduced but required.These are (1) snowmelt, (2) the daily maximum wind speed of gusts at 100 m (wsgsmax100), which is not yet introduced because of a lack of an appropriate method, and (3) intracloud, ground, and total lightning flashes (ic_lightning, cg_lightning, tot_lightning), even though a lightning scheme is already implemented in WRF.However, this implementa-tion is not appropriate for small domain patches used during parallelization.It has been found (Lluís Fita and Milagros Álvarez, CIMA personal communication, 2018) that lightning flash rates become patchy when the size of the parallel domain partitions are smaller than the actual extent of the cell.Lightning methods (Price and Rind, 1992;Wong et al., 2013) require encompassing the whole convective cell to analyze the updraft within the cell.When patches do not cover the whole cell, different values from the same cell at each parallel fraction of the domain are provided, from which spatial inconsistencies arise.
The module provides almost all the required CORDEX variables.However, users still need to perform some postprocessing of the output data in order to meet CORDEX standards, mainly the following: computation of the required different statistical values as daily, monthly and seasonal extremes (minimum, maximum, accumulations, means); and cmorization of the output understood as one file per variable, the right metadata and attributes, and general CF-compliant standard specifications.
There are some additional features that would make the module more useful for the climate regional community, which are also required by the CORDEX specifications.However, in order to prioritize the production of a basic, fully working, and useful version of the module as soon as possible, the following aspects are planned for future iterations and updates.
1. Flexibility enhancement is needed in terms of the capacity to provide a more flexible module that is easy to maintain and update with new requirements from CORDEX or WRF model updates.
2. Introduce new variables with a special focus on the implementation of more generic variable definitions 3. CF-compliant and cmorization output is needed because WRF output does not fully follow CF conventions.Thus, a huge coding effort is needed to provide fully CF-compliant output directly from it.Users still need to process the output of the model in order to provide data following all the CORDEX guidelines.Due to uncovered steps of the CF standard, a user of the WRF model still needs to concatenate files, change the names and attributes of variables, calculate temporal statistics over different periods (daily, monthly, seasonal), and provide the right time variables in order to fully reach the CF standard followed by CORDEX.However, these steps are computationally lighter and easier to perform in comparison to the computation of the different diagnostics and the vertical pressure interpolation already introduced in the module.
The incompatibility between WRF output and CF conventions can be overcome with the use of a complementary dedicated I/O library.This has been done, for example, on the RegIPSL platform (https://sourcesup.renater.fr/wiki/morcemed/Home, which uses WRF as an atmospheric model), which uses XIOS (http://forge.ipsl.jussieu.fr/ioserver, last access: 18 March 2019) libraries to manage the I/O.
The module is currently being implemented in the repository of the code in a dedicated branch.Once the module is fully implemented in the latest version of the model, and some additional tests are conducted, it might be possible that the model developing team decides to include it in the main source of the code.

First
, we provide a short description of the WRF code characteristics.The WRF model is written in Fortran 90.It is open access.It consists mainly of two parts: WPS (WRF Prepro-1032 L. Fita et al.: WRF module for CORDEX output (hurs)  to diagnose fog (not available for fogvisibility_diag = 1) 1 a Preferred by CORDEX.b No preference is specified by CORDEX.

Figure 1 .
Figure 1.Calculation of diagnostics according to the time step for each kind of variable (see tables in Appendix B).

Figure 4 .
Figure 4. Vertical distribution of cloud fraction and the different cloud layers on 9 December 2012 at 15:00 UTC at 62 • 4 38.00 S, 4 • 58 55.51 W (a): cloud fraction (cldfra, full circles with line in blue), mean total cloud fraction (clt, vertical dashed line), mean lowlevel cloud fraction (cll p ≥ 680 hPa, dark green hexagon), mean mid-level cloud fraction (clm 680 < p ≥ 440 hPa, green hexagon), and mean high-level cloud fraction (clh p < 440 hPa, clear green hexagon).Temporal evolution of cloud layers at the given point (b).Map of clt with colored topography beneath to show cloud extent (c); map of clh (d), map of clm (e), and map of cll (f).

Figure 5 .
Figure 5. Near-surface wind-gust estimates on 9 December 2012 at 15:00 UTC. 3 h maximum total wind-gust strength (wsgsmax tot , a), percentage of wsgsmax tot following the Brasseur method (wsgsmax b01 , b), percentage following the AFWA heavy precipitation implementation (wsgsmax hp , c), percentage of time steps for which a grid point got a total wind gust (d), percentage of time steps for which a grid point got wsgsmax b01 (e), and percentage due to wsgsmax hp (f).

Figure 6 .
Figure 6. 100 m wind estimates.Comparison between upper-level winds and estimation on 9 December 2012 at 15:00 UTC at 62 • 4 38.00 S, 4 • 58 55.51 W. (a) 3 h maximum eastward wind (red) at 100 m by power law (uzmax pl , star marker), Monin-Obukhov theory (uzmax mo , cross), by logarithmic law (uzmax ll , sum), 10 m wind value (uas, filled triangle), and upper-level winds (ua, filled circles with line), also for the northward component (green).Temporal evolution of wind speed (b) with all approximations and upper-level winds at the closest vertical level at 100 m (on a log y scale, z = 107.1 m on average).Maps of three estimations: power law (c), Monin-Obukhov (d), and logarithmic law (e), with the blue cross showing the point of previous figures.Vertical distribution of winds at the given point in a wind-rose-like representation (f).

Figure 7 .
Figure 7. On 9 December 2012 at 15:00 UTC at 62 • 4 38.00 S, 4 • 58 55.51 W: (a) water path (prw, vertical straight line in millimeters top x axis), vertical profile of water vapor (qv, line with full circles in kg kg −1 bottom x axis), and water path at each level (line with crosses).Map of water path (b) on 9 December 2012 at 15:00 UTC; red cross shows where the vertical accumulation is retrieved.

Figure 10 .
Figure 10.Normalized (with the spatial standard deviation of the mapped values, σ ) water budget 3 h accumulated vertically integrated total tendency PW at a given time, for water vapor (qv, a), cloud (qc, b), rain (qr, c), water condensed species (qc+qr, d), snow (qs, e), ice (qi, f), water solid species (qs+qi+qg, g), graupel (qg, h), and hail (qh, i).Numbers in the lower left corner of the panels correspond to the standard deviation (σ in millimeters) value used for the normalization.

Figure 11 .Figure 12 .
Figure11.As in Fig.10, but for water budget 3 h accumulated vertically integrated horizontal advection F at a given time.

Figure 13 .
Figure13.Water vapor budget maps of each component and diabatic heating from microphysics at a given time and the percentage contribution at each different vertically integrated layer with respect to the total.Total tendency PW (∂ t qv, first column), horizontal advection F (adv h qv, second column), vertical advection Z (adv z qv, third column), and diabatic heating from microphysics (Q d , fourth column).Percentage contribution of high-level (p < 440 hPa) integration to the total (second row), percentage for mid-level (680 > p ≥ 440 hPa) integration to the total (third row), and percentage of low-level (p ≥ 680 hPa) integration (bottom row).

Figure 14 .
Figure14.The same as in Fig.12, but for snow.

Figure 15 .
Figure15.The same as in Fig.13, but for snow.

Figure 17 .
Figure 17.Two-nested domain WRF3.8.1 configuration in which different performance tests were carried out.

Figure 18 .
Figure 18.Elapsed times for each individual time-step integration on nested domain d02 (time steps from number 3800, simulating date 1 January 2014 15:35:00 UTC, to 4050 for 1 January 2014 16:37:30 UTC).Model was run with different module configurations.See text for more details.Larger time steps are related to activation of the shortwave and longwave radiation scheme (every 5 min).For WRF compilation using IFORT (a) and GCC compilers (b).

Table 1 .
Setup parameters of the module_diag_cordex module for the WRF namelist contained in the cordex section.See the sections on variables for more details on the meaning of each methodology.The methodologies preferred by CORDEX are marked by a , and the ones without preference by CORDEX are marked by b ; in these cases, users can select the method according to their experience.

Table 2 .
Description of CORDEX additional pressure-interpolated variables provided with the module.

Table 3 .
Mixing ratio associated with column-integrated variables.

Table 4 .
Mean elapsed time step of 5-day simulation and difference in time with respect to the original version of the code (v381orig) for different model configurations.See text for more details.

Table A2 .
As in TableA1, but for the additional variables requested by the CORDEX FPS_Alps experiment.

Table B1 .
Description of CORDEX Core variables provided with the module."Kind" specifies when the variable is computed; a: computed all time steps, o: only at output time, t: according to a frequency in the namelist, s: statistic value from internal integration values and initialized after each output time step.
a Needed for other variables.b Unmasked to sea points.

Table B2 .
As in TableB1, but for the description of CORDEX Tier 1 variables provided with the module.

Table B3 .
As in TableB1, but for the description of additional variables provided with the module.
a Needed for other variables.b Variables will be computed if the namelist parameter output_wb is set to 1.

Table B4 .
As in TableB1, but for the description of CORDEX "instantaneous" variables provided with the module.