Interactive comment on “Hindcast regional climate simulations within EURO-CORDEX: evaluation of a WRF multi-physics ensemble” by E.

This is an interesting paper on the validation of a small ensemble of WRF simulations performed with different settings of the model physics by groups participating in the EURO-CORDEX experiments. The validation is well written and elaborated, and it aims to show the importance of analysing parameters other than the standard ones, such as mean temperature and precipitation, and eventually to provide methodologies for model improvement. While the validation itself is really well done and the biases and problems are well described, I would expect, in addition to the comments on process interaction effects in the radiation discussion, a somewhat deeper discussion, especially regarding the temperature issues. In my opinion, again especially with respect to the connections between radiation and temperature, this would naturally lead to the involvement of parameters other than the standard ones,


Introduction
Climate models are the primary tools for investigating the response of the climate system to various forcings, making climate predictions on seasonal to decadal time scales and projections of future climate. Regional climate models (RCMs) are applied over limited-area domains with boundary conditions either from global reanalysis or global climate model output. The use of RCMs for dynamical downscaling has grown, their resolution has increased, process-descriptions have developed further, new components have been added, and coordinated ensemble experiments have become more widespread (Rummukainen 2010; Flato et al. 2013). A significant constraint in a comprehensive evaluation of regional downscaling is that available studies often employ different methods, regions, periods and observational data for evaluation. Thus, evaluation results are difficult to generalize. The Coordinated Regional Climate Downscaling Experiment (CORDEX) initiative provides a platform for a joint evaluation of model performance, along with a solid scientific basis for impact assessments and other uses of downscaled climate information (Giorgi et al. 2009).
Published work within CORDEX focusing on the European domain (EURO-CORDEX) for present climate indicates strengths and deficiencies of the state-of-the-art modeling tools already used to downscale the Coupled Model Intercomparison Project Phase 5 (CMIP5) global model results (Taylor et al., 2012). Kotlarski et al. (2014), in a joint evaluation based on the EURO-CORDEX RCM ensemble, reported bias ranges for temperatures and precipitation corresponding to those of the ENSEMBLES simulations (van der Linden et al. 2009), with some improvements identified and a strong influence of specific choices of model configuration on model performance. Vautard et al. (2013), focusing on European heatwaves with the EURO-CORDEX ensemble, found that high temperatures are primarily sensitive to convection and microphysics. Giorgi et al. (2012) highlighted the significant sensitivity of model performance to different parameterization schemes and parameter settings in a RegCM4 model study over different CORDEX domains including Europe.
These findings indicate that combining model evaluation with sensitivity studies is necessary in order to investigate recurring and persistent biases, list potential sources of their origin, dissuade modelers from (or encourage them towards) using specific configurations, depending on whether these are responsible for systematic errors over specific regions, and suggest tracks for model development. Since large model ensemble spreads and present climate biases are potentially linked with future climate uncertainties (Boberg and Christensen, 2012), it is important to understand the contributions of individual processes to the present European climate in order to be able to interpret future climate projections with greater confidence and possibly constrain these projections (Hall and Qu 2006; Stegehuis et al., 2013).
In the current work we analyze hindcast simulations of the Weather Research and Forecasting model (WRF) multi-physics ensemble performed within the framework of EURO-CORDEX.
Recent research has demonstrated the ability to use WRF (Skamarock et al., 2008) to refine global climate modeling results to higher spatial resolutions in Europe (e.g. Soares et al., 2012; Cardoso et al., 2013; Warrach-Sagi et al., 2013). The aim of this study is to identify systematic biases and areas of large uncertainties in present European climate and relate them to specific physical processes (e.g. cloud-radiation or land-atmosphere interactions). This analysis contributes towards a better understanding of WRF as a dynamical downscaling tool for RCM modeling studies and its optimization for this specific region.

Observations
To evaluate the model simulations we use daily mean, minimum and maximum temperature and precipitation values from E-OBS version 9.0 (hereafter E-OBS9) covering the area 25-75N and 40W-75E, available on a 0.44° rotated pole grid (Haylock et al., 2008). Model cloudiness was validated against the well-established cloud product from ISCCP, obtained from operational sensors aboard geostationary and polar-orbiting satellites (Rossow and Schiffer, 1999). Single pixel observations in the visible (0.6 μm and 1 km resolution) and infrared (11 μm and 1-4 km resolution depending on the instrument) spectral bands are used. Pixels appearing to be colder and/or brighter than clear sky are characterized as cloudy. Pixel-level retrievals are spatially aggregated on an equal-area grid with a resolution of 280 km x 280 km, available 8 times per day. The ISCCP cloud product is in good agreement with the MODIS cloud mask product (Pincus et al., 2012).
An additional, higher resolution, satellite dataset was also used for model validation, in order to confirm the robustness of the validation findings with ISCCP. Shortwave downward radiation at the surface was additionally obtained from Satellite Application Facilities for Climate Monitoring (CMSAF), which is part of the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT). The spatial resolution of the data is 0.03° x 0.03° while the temporal resolution is 15 min. There are a total of six MFG satellites (Meteosat-2 to 7), providing SSR data from 1983 to 2005. This dataset has been validated against homogenized ground-based observations from the Global Energy Balance Archive (GEBA) (Sanchez-Lorenzo et al., 2013) and from the Baseline Surface Radiation Network (BSRN) (Posselt et al., 2012). In this study, seasonal mean solar surface radiation data from CMSAF were re-gridded to the E-OBS 0.44° resolution in order to be compared with the WRF simulations for the time period 1990-2005. Since this dataset does not exactly overlap with the hindcast timeslice (1990-2008), we used the higher resolution dataset only as auxiliary material to support the major findings of the model comparison with the coarser ISCCP satellite retrievals.

Models
In this work we present EURO-CORDEX hindcast climate simulations performed with the WRF/ARW (version 3.3.1) model. The simulations cover the EURO-CORDEX domain with a resolution of 0.44°. Some settings are common to all the simulations. The Noah Land Surface Model (NOAH) was the commonly selected land surface model (Chen et al., 1996), the Yonsei University scheme (YSU) was the chosen Planetary Boundary Layer (PBL) scheme (Hong et al., 2006) and MM5 similarity was the surface layer option. All simulations were forced by the ERA-Interim reanalysis dataset (Dee et al., 2011) at 6-hourly intervals with a spatial resolution of 0.75°. The way the forcing fields were pre-processed and implemented in the simulations (relaxation zone, method etc.), the setting of vertical layering, land use databases, and sea surface temperatures were decided by each group separately.
In the current ensemble, five different WRF configurations are applied (Table 1). Three different convection schemes were used, namely the Kain-Fritsch (KF, Kain 2004), the Grell-Devenyi (GD, Grell and Devenyi, 2002) and the Betts-Miller-Janjic ensemble (BMJ, Janjic, 2000). The radiation physics options tested were the newer version of the Rapid Radiative Transfer Model (RRTMG, Iacono et al. 2008) and the CAM scheme (Collins et al. 2004). The selected microphysics options were the WRF Single-Moment 3- and 5-class schemes (WSM3/WSM5, Hong et al., 2004) and the WRF Single-Moment 6-class scheme (WSM6, Hong and Lim 2006). The number of points in the relaxation zone and the type of relaxation are provided in the last column of Table 1. The WRF-A configuration is simulated twice with different SSTs (WRF-A and WRF-A_SST). In WRF-A_SST, the SST field was interpolated as provided in the standard 3.3.1 release (METGRID.TBL). This option yields coarse-resolution SSTs, which produce a strong temperature perturbation along the European coastline. In other configurations, either a finer interpolation method is used or the SST fields are replaced by skin temperature.
Five meteorological variables are evaluated, namely surface temperature, precipitation, total cloud cover, and the short- and longwave downward radiation at the surface. Temperature and precipitation fields were interpolated to the 0.44° E-OBS grid and an elevation correction (standard lapse rate of 6 °C/km) was applied to the simulated temperature to account for the difference between E-OBS9 and model orography. Radiation and cloud data were interpolated to a common ISCCP 2.5° grid for comparison to the satellite dataset.
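The elevation correction described above can be sketched as follows. This is a minimal illustration, not the authors' post-processing code: the function name, its arguments and the sign convention (a model cell lying above the observational orography is warmed) are assumptions, and only the 6 °C/km lapse rate is taken from the text.

```python
LAPSE_RATE_C_PER_KM = 6.0  # standard lapse rate quoted in the text


def correct_to_obs_elevation(t_model_c, z_model_m, z_obs_m,
                             lapse_rate=LAPSE_RATE_C_PER_KM):
    """Shift a model temperature (deg C) from model to observation orography.

    Assumed convention: temperature decreases with height, so a model grid
    cell that lies higher than the observational grid cell is warmed.
    """
    dz_km = (z_model_m - z_obs_m) / 1000.0
    return t_model_c + lapse_rate * dz_km


# Hypothetical example: model cell 500 m above the observational orography.
corrected = correct_to_obs_elevation(10.0, 1500.0, 1000.0)
```

With these hypothetical heights the model temperature is raised by 3 °C before it is compared with the observations.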
In the WRF output the fractional cloud cover is available at each hybrid level. To compute the total cloud cover, an assumption about the overlapping of these fractions is needed. In the present study, we post-processed the fractional cloud cover following the algorithm proposed by Sundqvist (1989). This method assumes maximum overlapping inside cloud layers and random overlapping between them, which is usually summarized as maximum/random overlapping. Radiation parameterizations make their own assumptions to compute cloud effects on radiative fluxes. The overlapping methodology of the Community Atmosphere Model (CAM) radiation parameterization is described in Collins (2001), and it is also maximum/random overlapping. The RRTMG parameterization also uses maximum/random overlapping. Therefore, apart from small differences in the algorithms, the overlapping assumptions are consistent across the parameterizations used and the postprocessor.
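To make the maximum/random assumption concrete, the sketch below uses a widely used formulation of maximum/random overlap, in which vertically adjacent cloudy levels overlap maximally and cloud blocks separated by clear air overlap randomly. It is an assumption that this matches the details of the Sundqvist (1989) algorithm used in the post-processing; the function name and the example profiles are hypothetical.

```python
def total_cloud_cover(layer_fractions, eps=1e-12):
    """Total cloud cover from per-level cloud fractions (ordered top to bottom).

    Assumed maximum/random overlap formulation:
        clear = prod_k (1 - max(c_k, c_{k-1})) / (1 - c_{k-1}),  c_0 = 0
        total = 1 - clear
    """
    clear = 1.0   # fraction of the column that is still cloud free
    prev = 0.0    # cloud fraction of the level above
    for c in layer_fractions:
        c = min(max(c, 0.0), 1.0)      # keep fractions inside [0, 1]
        if prev >= 1.0 - eps:          # an overcast level covers the column
            return 1.0
        clear *= (1.0 - max(c, prev)) / (1.0 - prev)
        prev = c
    return 1.0 - clear


# Adjacent cloudy levels overlap maximally: total equals the larger fraction.
adjacent = total_cloud_cover([0.3, 0.5])
# A clear level in between makes the two layers overlap randomly:
# total = 1 - (1 - 0.3) * (1 - 0.5).
separated = total_cloud_cover([0.3, 0.0, 0.5])
```

The two hypothetical profiles show the difference between the regimes: the same two cloud fractions give a larger total cover once a clear level separates them.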
In particular, this spin-up allows for adjustment of the soil moisture and temperature. The seasons were averaged from June to August (JJA) and December to February (DJF). All seasonal averages were calculated based on mean monthly values. The analysis is undertaken over the whole European domain and over the following sub-regions: Alps (AL), British Isles (BI), East Europe (EA), France (FR), Mid-Europe (ME), Mediterranean (MD), Iberian Peninsula (IP) and Scandinavian Peninsula (SC). These sub-domains are described in Christensen and Christensen (2007).
Taylor diagrams are used to provide a concise statistical summary of how well observed and simulated patterns match each other in terms of their correlation R and normalized standard deviation (NSD) (Taylor, 2001). On a Taylor diagram, R and NSD are both indicated by a single point on a two-dimensional polar coordinate plot. The radial distance from the origin corresponds to NSD while the azimuthal position corresponds to R. The reference point, which has R and NSD equal to one, is also displayed in the Taylor diagrams. Thus it is easy to identify locations and analysis regions for which the model performs relatively well, as they lie close to the reference point. Furthermore, in case of deviations from the reference, it is easy to distinguish between errors due to poor simulation of variance and errors due to incorrect phasing (low correlation).
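The two Taylor-diagram statistics can be computed as in the following sketch, which is illustrative only: the function name and the sample series are hypothetical, and population (1/n) statistics are assumed. The third returned value is the distance to the reference point, which in the geometry of Taylor (2001) equals the normalized centred RMS difference.

```python
import math


def taylor_stats(model, obs):
    """Return (R, NSD, distance to the reference point) for two equal-length series."""
    n = len(obs)
    if n != len(model) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    var_m = sum((m - mean_m) ** 2 for m in model) / n
    var_o = sum((o - mean_o) ** 2 for o in obs) / n
    if var_m == 0.0 or var_o == 0.0:
        raise ValueError("constant series have no defined correlation")
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs)) / n
    r = cov / math.sqrt(var_m * var_o)   # correlation: azimuthal position
    nsd = math.sqrt(var_m / var_o)       # normalized std dev: radial distance
    # Law of cosines on the diagram; the reference point has R = NSD = 1.
    dist = math.sqrt(max(0.0, 1.0 + nsd ** 2 - 2.0 * nsd * r))
    return r, nsd, dist


# Hypothetical series: the model doubles the observed variability in phase.
r, nsd, dist = taylor_stats([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
```

In this example the phasing is perfect (R = 1) but the amplitude is doubled (NSD = 2), so the point lies on the horizontal axis at a distance of one unit from the reference point, which illustrates an error in variance rather than in phasing.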
Q-q plots compare the probability distributions of two variables by plotting, on a Cartesian plane, selected quantiles of one variable against those of another variable or of a theoretical distribution. In this work, following the methodology of Garcia-Diez et al. (2012), we compared the distribution of simulated mean temperature and precipitation (y-axis) against the observations (x-axis) using 19 quantiles (i.e. taking a quantile every 5%). These representations allow one to easily identify deviations in the probability distribution (as departures from a straight diagonal line), biases (as shifts), differences in the variability (as straight lines with a different slope) or asymmetries (as curved lines).
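The construction of the q-q points can be sketched with the Python standard library as follows. The function name, the choice of the "inclusive" interpolation method and the sample data are assumptions made for illustration; the text only specifies that one quantile is taken every 5%.

```python
import statistics


def qq_points(obs, model, n_bins=20):
    """Return (observed, simulated) quantile pairs for a q-q plot.

    n_bins=20 yields 19 cut points, i.e. the 5%, 10%, ..., 95% quantiles.
    """
    q_obs = statistics.quantiles(obs, n=n_bins, method="inclusive")
    q_mod = statistics.quantiles(model, n=n_bins, method="inclusive")
    return list(zip(q_obs, q_mod))  # x-axis: observations, y-axis: model


# Hypothetical data: the model is uniformly 2 units warmer than observed.
obs = [float(v) for v in range(101)]
model = [v + 2.0 for v in obs]
points = qq_points(obs, model)
```

For this synthetic case every point lies 2 units above the diagonal, which is the "shift" signature of a pure bias described above; a change in slope or curvature would instead indicate differences in variability or asymmetries.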
In order to test the statistical significance of differences between models and observations we calculate the quantity t (two-independent-sample t-test):

$$t = \frac{\bar{X}_m - \bar{X}_o}{\sqrt{(\sigma_m^2 + \sigma_o^2)/n}}$$

where $\bar{X}_m$ and $\bar{X}_o$ are the arithmetic means of the n = 57 monthly values for one season in the 19-year time slice, and $\sigma_m$ and $\sigma_o$ are the standard deviations of the n values. The modelled and observed values are significantly different at the 95% level if |t| > 1.98.
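A minimal sketch of this significance test is given below. It assumes the standard two-independent-sample statistic for equal sample sizes, t = (mean_m - mean_o) / sqrt((s_m^2 + s_o^2) / n), with sample standard deviations; the function names and the example values are hypothetical, and the default threshold of 1.98 is only appropriate for samples of about n = 57 as in the text.

```python
import math
import statistics


def two_sample_t(model, obs):
    """Two-independent-sample t statistic for samples of equal size n."""
    n = len(model)
    if n != len(obs) or n < 2:
        raise ValueError("samples must have the same size (at least 2)")
    mean_m = statistics.fmean(model)
    mean_o = statistics.fmean(obs)
    s_m = statistics.stdev(model)   # sample standard deviation (n - 1)
    s_o = statistics.stdev(obs)
    # Undefined (division by zero) if both samples are constant.
    return (mean_m - mean_o) / math.sqrt((s_m ** 2 + s_o ** 2) / n)


def significantly_different(model, obs, t_crit=1.98):
    """True if |t| exceeds the critical value (1.98 in the text for n = 57)."""
    return abs(two_sample_t(model, obs)) > t_crit


# Hypothetical monthly means: the model is 1 unit colder than observed.
t_value = two_sample_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0])
```

The critical value depends on the degrees of freedom, so the threshold should be adjusted when the test is applied to samples of a different size.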

Bias
The mean climatological patterns and the annual cycle of temperature are captured quite well by all model configurations, following the spatial characteristics of E-OBS9. This supports the view that the major processes governing the surface temperature climatology are represented reasonably by all model configurations. Figure 1 shows the summer and winter mean surface temperature bias with respect to E-OBS9.
Nearly all WRF configurations underestimate surface temperatures over the different European sub-regions for both seasons. Only the upper quantiles of JJA mean temperature are overestimated, mainly in southern Europe (MD, IP), as indicated by the q-q plots (Fig S1a).
Otherwise, the bias remains systematically negative for all configurations, with no obvious asymmetries or differences in variability, except for the behaviour of WRF-G in summer and WRF-A_SST in winter, which are discussed thoroughly in the following sections.
A large negative temperature bias over north-east Europe in winter is also indicated for maximum temperatures (-9 °C) (Fig. S2) in WRF-A_SST and is also apparent in all other configurations. This feature is more persistent in minimum temperatures (Fig. S3), ranging from -2 °C (WRF-F) to -13 °C (WRF-A_SST).
Of all our WRF simulations, WRF-G has the largest cold bias in summer (-2.1 °C mean over all European sub-regions). WRF-G uses the GD convective scheme, which may explain the larger cold bias, since the other configuration using the same microphysics (WSM6) and radiation (CAM) as WRF-G but a different convective scheme (WRF-A with the KF scheme) has a smaller bias (-0.3 °C). Analysis of the short- and longwave radiation components further supports this interpretation, as shown below.
In winter a negative temperature bias is apparent for all model configurations, especially over the north-eastern part of Europe, as indicated by the winter mean temperature q-q plots (Fig S1b  2014) show also in their 5-year-long multi-physics EURO-CORDEX ensemble that surface albedo is overestimated over the snow-covered European regions (the Alps and north-east Europe), which may be among the sources of bias.
WRF-A_SST has an even colder bias for both seasons in comparison to WRF-A, despite using the same primary parameterizations. This disagreement can be attributed to the SST implementation (coarse resolution along the coastline). This perturbation of SSTs considerably affects the inner part of the domain in winter by lowering the surface temperature, as indicated by additional 1-year-long sensitivity studies with the WRF-A_SST modelling system (). In the 19-year hindcast simulations, this effect is not so pronounced in summer.
The southern part of the Scandinavian Peninsula, the UK and Italy are the areas with the highest temperature differences in winter.This increases the spread in these areas even more, and thus uncertainty in winter temperature, which has already been shown to be large above north-east Europe in winter.
The causal link between SSTs and land surface temperature is not easy to establish, as each may influence the other and third factors may influence both at the same time. A similar behaviour is also reported by Cattiaux et al. (2011)

Bias
All models depict observed climatological features, namely the major precipitation maxima over the Alps (smaller in winter) and western Norway and the dry regions over the Mediterranean in summer (Fig S6). Precipitation is overestimated for both seasons over all subregions, except for the British Isles in winter (-5 to -15% relative bias depending on the configuration) (Table 3). The precipitation bias is larger in summer, ranging from 25 to 55% for the different model configurations, than in winter (15 to 30%).
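For reference, the relative bias percentages quoted here can be reproduced with a calculation of the following kind. The definition used, (model minus observation) divided by observation, is an assumption consistent with the "model minus E-OBS9" convention used for the absolute bias; the function name and the precipitation values are hypothetical.

```python
def relative_bias_percent(model_mean, obs_mean):
    """Relative bias of a seasonal mean in percent: 100 * (model - obs) / obs."""
    if obs_mean == 0:
        raise ValueError("relative bias is undefined for a zero observed mean")
    return 100.0 * (model_mean - obs_mean) / obs_mean


# Hypothetical seasonal means in mm/day.
wet_bias = relative_bias_percent(3.0, 2.0)   # model wetter than observed
dry_bias = relative_bias_percent(1.7, 2.0)   # model drier than observed
```

A positive value indicates a wet bias and a negative value a dry bias; because the observed mean is in the denominator, the same absolute error gives a larger relative bias in dry regions and seasons.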
Figure 3 shows the mean bias in precipitation for all model configurations. The difference between modelled and observed values is statistically significant for all configurations over most subregions. The models show the largest deviation from observations for summer precipitation magnitudes in the Mediterranean area, especially if the KF convective scheme is selected. Convective precipitation along the Dinaric Alps is overestimated in the WRF-C and WRF-A configurations such that the model precipitation is almost double that of the observations. The issue of unrealistically high summer convective precipitation over mountainous regions is also discussed by Torma et al. (2011) and Zanis et al. (2014), indicating that the bias improves in higher resolution simulations by optimizing the convection scheme.
Higher precipitation rates (upper quantiles) are overestimated over all subregions for all model configurations (Fig. S7a). Herwehe et al. (2014), in their study over North America, also reported a large overestimation of the larger summertime precipitation amounts (>2.54 cm), attributed to deep cumulus convection. This large overestimation was reduced considerably when subgrid-scale cloud-radiation interactions were introduced into the KF convection scheme of the WRF model (Alapaty et al., 2012).
The lowest summer precipitation bias is noted when the GD convective scheme is used (about 25-30% on average), followed by the BMJ (about 35%). The KF scheme is related to the highest positive precipitation bias over all European sub-regions but the Scandinavian Peninsula (50-55% in summer and 20-30% in winter). Results are more comparable in winter: the most problematic area with respect to bias appears to be Eastern Europe (50-65% for different model options), while for all other European sub-regions the bias is considerably lower (20-30%). A number of WRF ensemble studies (Evans et al., 2012; Ji et al., 2014; Di Luca et al., 2014) have also reported that the cumulus and PBL schemes exhibit the strongest influence on precipitation. Evans et al. (2012), in a WRF ensemble study over southeast Australia, reported that the YSU PBL scheme tends to induce more convection in the KF scheme and leads to an overestimation of precipitation.
Precipitation overestimation is not an uncommon feature in WRF simulations (Garcia-Diez et al., 2014), and often becomes more pronounced at higher resolutions. This systematic error may reflect an unbalanced hydrological cycle, returning moisture from land and/or water bodies to the atmosphere too quickly. Kotlarski et al. (2014) suggest that the wintertime wet bias of WRF is closely related to the distinct negative bias of mean sea-level pressure, indicating a too high intensity of low pressure systems passing over the continent. However, some sensitivity studies performed with WRF-F using spectral nudging for upper air winds, thereby avoiding this problem, showed little change in bias amplitude (Vautard, personal communication). Sensitivity tests of alternative choices for convective parameterizations and cloud microphysics are also usually not conclusive, and generally none of the options decisively improves the general picture at higher resolutions (Bullock et al., 2014).
The perturbed SSTs in the WRF-A_SST simulation result in a drier climate throughout the year. The physical reason for this colder and drier climate can be traced to the water holding capacity of the atmosphere, which limits precipitation amounts in colder conditions, assuming a small change in the average relative humidity. Depending on the energetic constraints of a region and its water limitations, this relation is modulated accordingly for each season and subregion (Trenberth and Shea, 2005). It should be noted that the reduced precipitation in the WRF-A_SST simulations considerably improves the precipitation bias (Table 2) to about 15% on average for both seasons. However, this is just a case of error compensation, given the basic WRF feature of predominant overestimation of precipitation.

Temporal and spatial agreement
Following the same methodology described above for temperature, we proceed with the analysis for precipitation. The temporal Taylor plots are based on mean monthly values, thus indicating interannual variability, and are averaged over all European subregions. The temporal Taylor plot for precipitation (Fig. 5, upper panel) shows that the average JJA temporal correlation is 0.8 for all configurations, with amplitudes of variability being close to unity for WRF-F/WRF-G (GD convection) and somewhat higher for all other configurations. The impact of the selection of convective scheme is clearly seen in the summer season but not in winter. For DJF precipitation, the metrics improve somewhat in comparison to those during the warm period (0.8<R<0.9 and σ_norm ~1); it therefore seems that WRF captures the temporal variability better in winter than in summer, apart from having a lower wet bias. In the sub-regional analysis, the temporal correlation is lowest over the Alps (0.3<R<0.6) and highest over the Scandinavian Peninsula (0.9 in winter and 0.6-0.8 in summer).
With respect to precipitation spatial agreement with observations (Fig 5, bottom), it seems that DJF WRF results are coherent, and that the different model parameterizations do not impact much on the average winter spatial pattern. The average spatial correlation is about 0.7 and the amplitude of variability 1.1 to 1.2. In summer results are more dispersed, with spatial correlations ranging from 0.8 to 0.9 and higher amplitudes of variability (1.2-1.5), indicating that the amplitude of JJA spatial variation is overestimated. This is a common finding among regional climate model studies, which report summer precipitation to be mostly controlled by internal convective processes, and winter patterns to be most likely linked to the large-scale circulation and thus the forcing fields (e.g. Rauscher et al. 2010). On a subregional level, the highest spatial correlations are seen over the Scandinavian Peninsula and the British Isles (R=0.9) in winter and the lowest over France and Mid-Europe in summer (R=0.4). The amplitude of variability is exaggerated by all model configurations in summer (1.5<σ_norm<2), with the exception of the British Isles (σ_norm close to unity).

Radiation
The primary driver of latitudinal and seasonal variations in temperature is the seasonally varying pattern of incident sunlight, and a fundamental driver of the circulation of the atmosphere is the local-to-planetary scale imbalance between the shortwave (SW) and longwave (LW) radiation. The impact of the distribution of insolation on temperature can be strongly modified by the distribution of clouds and surface characteristics. In this section we evaluate two radiation components of the WRF model simulations, namely the surface downwelling SW and LW, which are compared to available ISCCP satellite retrievals. The comparison was also performed with the CMSAF satellite dataset, available at a higher spatial resolution, but only between 1997-2003.
The SW and LW biases are generally anti-correlated, in such a way that regions with a positive SW bias exhibit a negative LW bias and vice versa. If the magnitudes of the biases were the same, the radiation biases would cancel and a better agreement with observed temperature would be expected. However, this is not the case.
For the WRF-A and WRF-C configurations, which use the KF convection and CAM radiation schemes, there is a strong surplus in downward radiation (SW bias + LW bias > 0) over central and southern Europe, leading to a lower cold bias or even small warm biases in southern Europe in
Precipitation overestimation is reported as a typical WRF behaviour, which remains or even worsens at higher spatial resolutions (Kotlarski et al., 2014). Our current findings are along the same lines, with the KF convective scheme being related to the highest bias over the Mediterranean in summer. All ensemble members capture winter precipitation better than summer precipitation, the latter being controlled locally rather than by the large scale. There is no specific configuration that totally alleviates the wet bias of WRF, either here or according to the literature.
This issue points, among other things, towards weaknesses in the convective schemes.
Different model domain configurations and datasets seemingly contribute to the precipitation spread. Our study identifies the implementation of SSTs as one important contributing factor.
Unit is degree Celsius.

E-OBS9
The E-OBS dataset is based on the ECA&D (European Climate Assessment and Data) station dataset and other stations from different archives. Short- and longwave downwelling radiation fluxes at the surface and cloud fraction were evaluated with the International Satellite Cloud Climatology Project (ISCCP) Flux Dataset. The ISCCP radiation fluxes comprise a satellite-derived product including shortwave (0.2-5 μm) and longwave (5.0-200 μm) radiation at the Earth's surface. The radiation estimates come from the synergistic use of the ISCCP cloud dataset, satellite data (TOMS, TOVS and SAGE-II), models (NCEP reanalysis, GISS climate model) and climatologies of various tropospheric and stratospheric parameters (aerosols, water vapour, etc.). The dataset spans from July 1983 to December 2009, with a temporal resolution of 3 h and a spatial resolution of 280 km x 280 km (~2.5° x 2.5°). Zhang et al. (2004) estimated the uncertainty of the dataset at 10-15 W/m² compared with the ERBE (Earth Radiation Budget Experiment) and CERES (Clouds and the Earth's Radiant Energy System) datasets. Since the ISCCP radiation data emerge from the use of a complete radiative transfer model from the GISS global climate model with observations of ISCCP surface, atmosphere and cloud physical properties as input, the radiation and cloud datasets are considered fully compatible. For the current analysis, seasonal averages of the ISCCP variables were calculated for the time period 1990-2008 and were compared to the WRF surface downward short- and longwave radiation, after bilinear interpolation to the 2.5° x 2.5° ISCCP grid.
2 m temperature bias with respect to E-OBS9 over Europe averaged over the time slice 1990-2008. Stippling indicates areas where the biases are not statistically significant; over all other regions the models and observations are significantly different at the 95% level. Table 2 summarizes the E-OBS9 mean seasonal averages of surface temperature over the different subregions, the absolute model bias (model minus E-OBS9) of all simulations and the ERA-Interim comparison with E-OBS (ERA-Interim minus E-OBS9). The forcing fields (ERAi) are somewhat warmer (~0.5 °C) over the whole European domain compared to E-OBS9 data.
from -2 °C (WRF-F) to -13 °C (WRF-A_SST). In summer, maximum temperatures are reasonably reproduced in most configurations, with biases becoming positive over central and eastern Europe. Only the WRF-G configuration exhibits the same persistent feature of a strong negative bias over north Europe. Minimum temperatures in summer are relatively well reproduced, with some positive bias mostly seen in WRF-F (<3 °C). Mooney et al. (2013), in a WRF multi-physics ensemble forced by ERA-Interim, reported that summer surface temperature is mostly controlled by the selection of the land surface model (LSM). In their study the NOAH and Rapid Update Cycle (RUC) LSMs were tested, and the use of NOAH yielded more accurate surface temperatures than the use of RUC; however, the temperature distributions were shifted towards lower values, especially when combined with the CAM radiation scheme. Our current findings can neither support nor contradict this finding, since all models use the NOAH LSM. We could, however, tentatively regard the combination of the NOAH LSM with the CAM radiation scheme as one possible explanation contributing to the general tendency towards cold biases in the WRF ensemble.
in a North-Atlantic SST sensitivity experiment of the fall and winter 2006/2007 with a climatic (colder) SST dataset. A similar response in land surface temperature above Europe was showcased, in which anomalous SSTs affected land temperature through the upper-air advection of heat and water vapor, interacting with radiative fluxes over the continent. This mechanism was also found to be more pronounced in autumn and winter, when SST anomalies and upper-air advection are more efficient.

3.1.2 Temporal and spatial agreement
We use Taylor plots (Taylor 2001) to investigate the temporal agreement between the simulated and observed fields, i.e. the reproduction of interannual variations. With area-averaged temperature fields, we compare time-series of spatially averaged quantities. Figure 2 (upper panel) depicts model performance averaged over the different European sub-regions; different colours depict the different WRF configurations. The overall model performance, based on average monthly values, indicates very high temporal agreement with observations (0.95) and an amplitude of variability higher than observed (σ_norm >1). Inspection of Taylor plots for each European subregion (Fig. S5) shows that the largest amplitude of variability in summer is produced by WRF-F/WRF-G and the lowest (σ_norm slightly below unity) by WRF-C. The worst performance with respect to temporal correlations is found over the Alps for the winter and summer seasons (0.7<R<0.8), most probably due to the coarse resolution of the model set-up, which cannot accurately capture the topographic features of the area.
The spatial agreement between observations and the models is investigated by comparing the time-averaged spatial fields, i.e. two maps without a temporally varying component. The spatial agreement over the whole European domain (Figure 2, bottom) is very high (0.97-0.99), confirming that the spatial representation of surface temperature is captured well. The normalized standard deviation (σ_norm) in winter is somewhat higher than unity for all configurations. In summer results are more dispersed compared to winter, and the WRF-C configuration again gives the lowest and best (unity) σ_norm. On a sub-regional level results appear to have greater spread over inner continental regions (ME, FR, EA) in comparison to coastal areas (IP, SC, MD, BI).

Figure 4
Figure 4 depicts the annual cycles of all model configurations based on mean monthly values, over the selected subregions. The shaded area corresponds to the observational standard deviation. All configurations reproduce reasonably well the basic characteristics of the seasonal cycle, such as the dry summer of southern Europe or the summer maximum over Scandinavia. All simulations have a wet bias, mostly during spring- and summertime and to a lesser extent in autumn and winter. This fact points to smaller-scale circulations and convection being a critical component of the large positive bias in precipitation. Higher correlations of the modelled with the observed annual cycles are seen over the Mediterranean, the Iberian and the Scandinavian Peninsulas, despite the large positive bias. Results are more dispersed and less correlated for the Alps and the Mid-European regions. In a few cases the models have difficulty correctly capturing the seasonal cycle over France (WRF-C, WRF-G, WRF-F).
3.3.1 Downward shortwave radiation at the surface

Seasonal average 1990-2008 downward SW radiation components from WRF and ISCCP satellite data are compared over the European domain. Satellite observations exhibit a south-north gradient in summer, with a maximum over the Mediterranean (up to 400 W/m²) and minima over northern Europe (about 200 W/m² on average). All model configurations exhibit this south-north gradient, however with different characteristics: in some configurations (WRF-A/WRF-C with KF or WRF-D with BMJ convection) the SW radiation gradient is less steep towards the north compared to the satellite data, leading to a general positive SW bias over Europe except Scandinavia, with a maximum over central Europe within the range of 40-60% (Fig. 6a). For WRF-F and WRF-G (GD convection) the SW radiation decreases very steeply near 40-45°N, leading to a negative bias of SW radiation over northern Europe. This can explain, at least partially, the larger negative summer mean temperature bias over mid- and north Europe for WRF-G and WRF-F, compared to the other configurations. The SW radiation bias pattern also resembles the bias pattern of maximum surface temperature (Fig. S2a), indicating a strong dependence of maximum temperatures on the SW radiation component. For the WRF-G configuration maximum temperatures are underestimated by up to 8 °C over northern Europe, while biases in minimum temperatures are generally smaller (Fig. S3a) and less correlated with SW radiation. Interestingly, Garcia-Diez et al. (2014) showed that the negative SW radiation bias over central and north Europe in summer in the WRF-G configuration is not reproduced in a 5-year long simulation, when the model simulation restarts daily from the ERA-Interim forcing fields with 12 hours of spin-up. Thus, it appears that this radiation bias is related to internal physical mechanisms, and eventually feedbacks, which develop in a years-long climate simulation. As will be shown later, the underestimation of SW downward radiation at the surface with GD convection can be linked to a 40-50% overestimation of cloudiness.

In winter the observational data indicate maxima of SW radiation of about 160 W/m² over the southern part of the domain, decreasing gradually towards the north. The same spatial pattern is reproduced by all model configurations; however, there is mostly a positive SW radiation bias over the domain, except for the Iberian Peninsula and the north European coasts of France and Benelux (Fig. 6b). The positive bias increases towards the northern and eastern parts of the domain, where it reaches up to 70-80%. WRF-C, with different microphysics (WSM3), has the additional feature of a higher positive SW radiation bias over Mid- and East-Europe (~70%).
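The percentage biases quoted in this subsection are relative to the satellite reference. A minimal sketch of that calculation is given below, assuming the usual definition 100 × (model − observation) / observation applied to seasonal mean fluxes; the function name and the flux values are hypothetical and do not come from the study.

```python
def relative_bias_percent(model, obs):
    """Relative bias in percent for paired seasonal mean fluxes (W/m^2).

    Cells where the observed flux is zero are returned as None, because
    the relative bias is undefined there.
    """
    if len(model) != len(obs):
        raise ValueError("model and obs must have the same length")
    return [
        None if o == 0 else 100.0 * (m - o) / o
        for m, o in zip(model, obs)
    ]

# Hypothetical seasonal mean downward SW fluxes for three grid cells.
bias = relative_bias_percent([300.0, 150.0, 40.0], [200.0, 200.0, 0.0])
```

Because the observed winter fluxes at high latitudes are small, a modest absolute difference there translates into a large relative bias, which is worth keeping in mind when comparing winter and summer percentages.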
3.3.2 Downward longwave radiation at the surface

Downward LW radiation in summer is higher over southern Europe and decreases towards the north. Comparison with the ISCCP satellite data indicates a negative bias over southern Europe of about 20% (more pronounced for the KF convective scheme), becoming positive in northern Europe, with a larger positive bias for the GD convective scheme (10%) (Fig. 7a). Comparison of Fig. 6a and 7a (SW and LW components) shows that summer SW and LW [...] scheme.

The summer cold bias is even more pronounced in maximum temperatures, which are largely controlled by cloud coverage and SW radiation. The strong positive SW bias in summer in southern Europe, mostly induced by the KF or BMJ convective schemes, contributes to a lessening of the systematic cold bias of WRF. When a convective scheme does not suffer from a positive SW bias, temperatures are grossly underestimated (in our case the WRF-G configuration with GD convection). Winter surface temperatures are affected in snow-covered areas in north-east Europe, as a result of a too-strong response of temperature to snow cover. This underestimation is even more pronounced in minimum temperatures, which exhibit a bias of up to -9 °C over north-east Europe in winter and are obviously sensitive to land-atmosphere interactions. The negative sign of the sum of the LW and SW biases over north Europe contributes to the cold bias problem of the region. The winter cold bias is reduced with the application of the RRTMG rather than the CAM radiation scheme. Note also that ERA-Interim has a small (0.4 °C) positive bias in comparison to our reference E-OBS9 climatology. If the driving fields suffer from a persistent cold bias, they can deteriorate model performance even further.
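The sign of the combined LW+SW bias has to be evaluated in absolute units (W/m²), since percentage biases of the two components refer to different reference fluxes and cannot simply be added. The sketch below illustrates this under stated assumptions: the function name and all flux values are hypothetical, chosen only to show that a large positive relative SW bias can coexist with a negative net bias.

```python
def net_downward_bias(sw_model, sw_obs, lw_model, lw_obs):
    """Return (sw_bias, lw_bias, net_bias) in W/m^2 for one grid cell.

    Each bias is model minus observation; the net bias is their sum.
    """
    sw_bias = sw_model - sw_obs
    lw_bias = lw_model - lw_obs
    return sw_bias, lw_bias, sw_bias + lw_bias

# Hypothetical winter fluxes for a northern grid cell (W/m^2):
# SW is overestimated by 50% (+20 W/m^2), LW is underestimated by
# about 14% (-40 W/m^2), so the net downward bias is negative.
sw_bias, lw_bias, net_bias = net_downward_bias(60.0, 40.0, 250.0, 290.0)
```

In this toy case the surface receives 20 W/m² too little in total despite the positive SW bias, which is the kind of situation that would favour a cold bias.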
Erroneously, a coarser resolution of the implemented SSTs (WRF-A_SST) seemingly "corrects" the average WRF wet bias, by shifting the average climatology towards a colder and drier winter climate regime.

In conclusion, we stress the importance of such coordinated evaluation exercises, which aim to highlight systematic biases in model performance and to identify the underlying physical mechanisms. The current work concentrates only on the surface components of the radiation balance and leaves other components, such as the top-of-atmosphere fluxes, the sensible and latent heat fluxes and cloud properties, for future analysis. Analysis including these parameters is necessary for a more complete understanding of the physical mechanisms involved in the appearance of temperature and precipitation biases. This work is ongoing within the EURO-CORDEX WRF groups.

Figure 1a. Mean summer 1990-2008 surface temperature bias (model minus E-OBS9). Stippling indicates areas where the biases are not statistically significant.

Figure 1b. Mean winter 1990-2008 surface temperature bias (model minus E-OBS9). Stippling indicates areas where the biases are not statistically significant. Note the differences in colour scales.

Figure 2. Temporal (upper panel) and spatial (bottom panel) Taylor plots for surface temperature averaged over Europe for summer and winter 1990-2008. Updated plot.

Figure 3b. Mean winter 1990-2008 precipitation bias (model minus E-OBS9), expressed in mm/day. Stippling indicates areas where the biases are not statistically significant.

Figure 4. Mean precipitation annual cycle. The grey area indicates the observational standard deviation. Updated plot.

Figure 5. Temporal (upper panel) and spatial (bottom panel) Taylor plots for precipitation averaged over Europe for summer and winter 1990-2008. Updated plot.

Mooney et al. (2013) reported that the radiation scheme (especially the long wave component) has a large impact on winter surface temperature, the CAM option being related to greater negative bias over north and east Europe in comparison to RRTMG. Our simulations do support this finding, since WRF-D and WRF-F using the RRTMG radiation scheme exhibit the smallest bias in winter over the [...]

[...]on mostly concerns the lower quantiles of the distribution. This finding is not uncommon among different climate simulations, including global modelling studies within CMIP5 (e.g. Cattiaux et al. 2013).

[...] (°C) with a northeast-southwest gradient. This spatial pattern of higher uncertainty (spread) over north-east Europe has also been reported in future climate projections for winter temperature, and is related to the role of snow cover in cooling down the surface through snow albedo and snow emissivity feedbacks (Deque et al. 2007). Another issue for consideration is that the working WRF version has known problems in treating surface temperature in snow covered areas 1. Garcia-Diez et al. (

Table 1. WRF configurations participating in the study.

Table 2b. Same as Table 2a for winter.

Table 3a. Mean (Mobs) of summer (JJA) precipitation for observations (E-OBS9) over 1990-2008 and the European subregions, units in mm/day.

Table 3b. Same as Table 3a for winter.