A regional atmosphere-ocean climate system model (CCLMv5.0clm7-NEMOv3.3-NEMOv3.6) over Europe including three marginal seas: on its stability and performance

Cristina Primo1, Fanni D. Kelemen1, Hendrik Feldmann2, Bodo Ahrens1
1Institute for Atmospheric and Environmental Sciences, Goethe University, Frankfurt am Main, Germany
2Institute of Meteorology and Climate Research, Karlsruhe Institute of Technology, Karlsruhe, Germany
Correspondence to: Cristina Primo (primoram@iau.uni-frankfurt.de)

1 Introduction
Regional climate directly affects human lives and socio-economic conditions. The natural variability of the climate system impacts local weather. Given the recent changes in the frequency and intensity of local extreme events such as storms or heavy rainfall (Tebaldi et al., 2006; Hartmann et al., 2013; Casanueva et al., 2014), we aim at a better understanding of the climate system dynamics. The main components of the Earth climate system are the atmosphere, land, ocean and rivers. To better represent the interactions between the atmosphere and the other components of the Earth climate system, it would be necessary to couple models representing all components. However, this is highly complex, since it requires combining different numerical models, which may not only introduce instabilities but also implies high computational costs. Therefore, current coupled climate systems focus on only a reduced number of these components. Since the oceans are the main boundary of the atmosphere (they cover 71% of the Earth's surface) and play a critical role in regulating energy flows (they have an enormous heat storage and transport capacity), coupled ocean-atmosphere models have been developed to better understand the interactions between ocean and atmosphere. For example, the WCRP's Working Group on Coupled Modelling (WGCM) established the Coupled Model Intercomparison Project (CMIP) as a standardized experimental protocol for studying the output of coupled atmosphere-ocean general circulation models (AOGCMs) [https://cmip.llnl.gov/index.html]. However, the coarse resolution of these models does not resolve important physical processes that take place at local and regional scales and that are relevant to understanding extreme events, such as changes in warming and precipitation trends. For example, marginal seas are not well represented in general circulation models (Somot et al., 2008; Li et al., 2006).
In addition, it has been demonstrated that the simulated Sea Surface Temperature (SST) has a large spread across an ensemble of AOGCMs (Dommenget, 2012) and that GCM simulations tend to underestimate high precipitation intensities (Sun et al., 2006). On the other hand, there are very high-resolution process-oriented models, like those used to forecast fog or winter storms (e.g. the Weather Research and Forecasting model, WRF, the High Resolution Limited Area Model, HIRLAM, or the High-Resolution Window Forecast System, HIRESW), that resolve specific smaller-scale physical processes. However, their computational cost is unaffordable for long simulations, or they lack interactive coupling with some climate system compartments (especially the marginal seas). Therefore, Regional Climate System Models (RCSMs) present themselves as an appropriate tool to improve the spatial scale compared to global models while keeping an affordable computational cost compared to high-resolution process-oriented models.
Within the European region, different atmosphere-ocean-ice coupled RCSMs have already been run for shorter periods (a few decades). For example, Schrum et al. (2003) coupled the regional model REMO (Jacob and Podzun, 1997) and the ocean model HAMSOM (Schrum, 1997) to analyse the North and Baltic Seas, showing improvements compared to running the uncoupled HAMSOM version. Pham et al. (2014) coupled the regional model COSMO-CLM (Rockel et al., 2008) to the ocean model NEMO (Madec, 2011) for the Baltic and North Seas to evaluate the impact of these seas on the climate of Europe. They showed that the 2m air temperature biases with respect to observations were of the same magnitude as in other COSMO-CLM studies and smaller than for the uncoupled version. Sevault et al. (2014) described and evaluated a fully coupled regional climate system model (CNRM-RCSM4) dedicated to studying the Mediterranean climate variability over the period 1980 to 2012, showing good agreement between the model and observations (e.g. the seasonal cycle and the interannual variability of SST, sea level, water budget, etc.). In a recent study, Obermann et al. (2018) coupled CCLM with the NEMO setup for the Mediterranean (NEMO-MED12) over the Med-CORDEX domain with ERA-Interim as the driving data. They showed that the coupled system was mostly able to simulate Mistral and Tramontane events with smaller biases than ERA-Interim. Akhtar et al. (2017) used that system to show the impact of the horizontal grid resolution and the dynamic ocean coupling of NEMO-MED in climate simulations with COSMO-CLM during the period from 1979 to 2009. However, all these studies cover only a few decades, whereas extreme events have long return periods, so long-term simulations are more appropriate to represent and analyse them. So far, no long-term simulation of more than one hundred years with a regional coupled climate system is available.
Hence, one of the goals of this work is to fill this gap.
Our aim is to improve our understanding of regional climate change in Europe and of the added value of coupling three marginal seas (the Mediterranean, the North and the Baltic Seas). Therefore, this work presents an atmosphere-ocean RCSM over Europe with an atmospheric horizontal grid resolution of about 25km, and tests its stability and performance with a simulation of more than one hundred years. The added value of the coupling is analysed by comparing our simulation with a centennial atmosphere-only model run. A description of the extra costs due to the coupling compared to an atmosphere-only system is also included. We have a particular interest in better understanding changes in extreme events, like heat/cold waves and extreme precipitation. Therefore, special focus is placed on analysing how well the system represents extremes compared to the atmosphere-only model.
The paper is structured as follows: Section 2 presents the regional climate system models used in this work, namely an atmosphere-only model and an atmosphere-ocean coupled model. Section 3 presents the methods and reference data used to show the stability and performance of the coupled model. Section 4 evaluates the models, distinguishing the impact of the coupling over the ocean and the European continent. Special attention is given to describing the evolution of climate change indices during the last century. Finally, Section 5 includes a summary with the main conclusions of the study.

Regional climate system models
This work presents an atmosphere-ocean coupled RCSM and compares it with an atmosphere-only version. This section describes the different components of the RCSM: the atmospheric model, the ocean model, their set-up (lateral and boundary conditions) and how the coupling in the atmosphere-ocean system was done.

Atmospheric model
The three-dimensional non-hydrostatic limited-area atmospheric prediction model COSMO of the German Weather Service has a climate version, the COSMO-CLM (CCLM; Rockel et al., 2008). This land-atmosphere regional climate model is based on the primitive equations and accounts for a variety of physical processes through parameterization schemes (see Doms et al., 2011). In our experiment, we used the Tegen et al. (1997) aerosol climatology, the Ritter and Geleyn (1992) radiation scheme, a turbulent kinetic energy (TKE) scheme for vertical turbulence (Raschendorfer, 2001), a reduced one-moment cloud scheme following Seifert and Beheng (2001) and a convection parameterization following Tiedtke (1989). COSMO-CLM includes the soil and vegetation model TERRA, which provides soil temperature and water content (Schrodin and Heise, 2002). The atmospheric model version used in this study is CCLM v5.0 clm7, with a numerical time step of 150 seconds and a third-order Runge-Kutta numerical integration scheme. A sub-grid scale sea ice mask was implemented in the coupled CCLM configuration over the North and Baltic Seas to better represent the sea ice by accounting for grid boxes that are only partially covered by sea ice.
In this study's set-up, the atmospheric lateral and top boundary conditions were provided by a simulation with the Earth system model of the Max Planck Institute (MPI-ESM, version 6.1, Stevens et al., 2013). This MPI-ESM simulation was nudged (via ocean temperature and salinity) to a simulation with MPI-ESM's ocean component MPIOM (Jungclaus et al., 2013), which was forced by NOAA's atmospheric 20th Century Reanalysis (20CRv2, Compo et al., 2011) as described in Müller et al. (2015). Müller et al. ran three members, and we considered the first one (as20ncep08_r1i1p1-LR). This indirect nesting of CCLM into the 20th Century Reanalysis was necessary because consistent lateral boundary data are needed to force the marginal seas in the RCSM.
This work compares the atmosphere-only CCLM model with an atmosphere-ocean RCSM. In the coupled version, the prescribed SST of the CCLM over the regional oceans, as well as the fraction of sea ice in the Baltic and North Seas, were replaced by the SST and sea ice fraction simulated by the coupled marginal ocean models presented in the following section. In turn, the ocean models received information from the atmospheric model about the momentum and freshwater (evaporation minus precipitation) fluxes, winds, solar energy and non-solar heat flux. The MPI-ESM simulation drove both the atmosphere-only and the atmosphere-ocean RCSMs. Hence, the SST of the atmosphere-only system was prescribed with the nudged MPI-ESM SST. There was no tuning in the coupled version, so the configuration of the atmospheric model was the same in the coupled and uncoupled versions.
Within the Coordinated Regional Downscaling Experiment (CORDEX), a set of domains covering the land areas of the world was defined. Our study aimed to better understand the regional climate of central Europe. Therefore, our simulations applied the so-called EURO-CORDEX domain (http://www.cordex.org/domains/cordex-region-euro-cordex/, see Figure 1 for a representation), with a horizontal grid spacing of 0.22° x 0.22° (~25km, 226x232 = 52432 grid points) and 40 vertical levels.

Ocean model
The Nucleus for European Modelling of the Ocean (NEMO) is a flexible tool for studying the interactions of the ocean with the atmosphere over a wide range of space and time scales. Within NEMO, the ocean is interfaced with a sea-ice model (LIM or CICE) and with passive tracer and biogeochemical models (TOP). High-resolution configurations are available for the regional oceans in the European domain. For example, Beuvier et al. (2012) developed MED12, a regional version of the NEMO ocean engine for the Mediterranean Sea. In our simulation we used NEMO-MED12, based on NEMO version 3.6, with a resolution of 1/12° (~0.083°, ~9km, 264x567 = 149688 grid points), 75 vertical levels and a numerical time step of 720s. The initial conditions for three-dimensional potential temperature and salinity were provided by the MEDATLAS-II (Rixen, 2012) mean monthly climatology in the Mediterranean Sea. The sea model was spun up in coupled mode for 20 years, driven by randomly resampled MPI-ESM years from the period 1900-1910. The Black Sea and river runoff water input were prescribed from the climatological average of interannual data from Ludwig et al. (2009). Water exchange between the Mediterranean basin, which is closed to a good approximation, and the Atlantic Ocean was relaxed to the Levitus et al. (2005) climatology prescribed in the buffer zone.
[...] structure, of the Baltic and North Sea basins. This is the so-called NEMO-NORDIC (Hordoir et al., 2018), whose ocean component is coupled to the sea ice model LIM3 (Vancoppenolle et al., 2009). In our study we used a NEMO-NORDIC version based on NEMO 3.3 (Dieterich et al., 2019; Gröger et al., 2019), including the LIM3 sea ice model, with a resolution of 2' (~0.03°, ~3km, 523x619 = 323737 grid points), 56 vertical levels and a numerical time step of 180s. The initial conditions for three-dimensional potential temperature and salinity were provided by Janssen et al. (1999) and further balanced by a spin-up simulation of the period 1900-1905.
The lateral boundary conditions in the North Sea were derived from the MPI-ESM simulation. Freshwater river inflow was provided from daily time series of the E-HYPE model output (Lindström et al., 2010).
After the balanced initialization, no drift in the surface variables SST and sea surface salinity was detectable in either the NEMO-MED12 or the NEMO-NORDIC simulation. Figure 1 presents the domains where the NEMO-MED and NEMO-NORDIC models run.

Coupling
In the atmosphere-ocean climate system, the atmospheric model CCLM was coupled with two configurations of NEMO: one adapted to the Mediterranean Sea (NEMO-MED12) and one to the Baltic/North Seas (NEMO-NORDIC). The coupling was done every three hours through fully parallel communication between the parallel models, executed via the Model Coupling Toolkit library (MCT; Jacob et al., 2005) in the form of the OASIS3 Model Coupling Toolkit (OASIS3-MCT; Craig et al., 2017). This library has already been used successfully to couple the CCLM model with NEMO (Will et al., 2017). It is an interface included in CCLM and based on the Message Passing Interface (MPI). Including this library has been shown to improve the performance significantly over the previous version OASIS3, because the bottleneck caused by the sequential separate coupler is entirely removed (Gasper et al., 2014). During the coupling, the data on the ocean coupling grids were interpolated to the CCLM grid. At runtime, all CCLM ocean grid points located inside the interpolated area were filled with values interpolated from the ocean model, and all CCLM ocean grid points located outside the interpolated area were filled with the same external forcing data as in the uncoupled system. In the coupled set-up, CCLM sent information to NEMO about the solar energy, non-solar heat, momentum and freshwater fluxes, and received SST from NEMO. In addition, CCLM sent the sea level pressure to NEMO-NORDIC and received the sea ice fraction. A more detailed description of the coupling strategy and its implementation can be found in Will et al. (2017) and Akhtar et al. (2019).
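The runtime filling rule described above can be sketched in a few lines. This is a minimal illustration, assuming the ocean SST has already been interpolated to the CCLM grid. The function name `fill_sst`, the boolean mask and the toy values are hypothetical and are not part of CCLM, NEMO or OASIS3-MCT.

```python
# Illustrative sketch only: names and values are assumptions,
# not code from CCLM, NEMO or OASIS3-MCT.

def fill_sst(inside_coupled_area, sst_from_ocean, sst_from_forcing):
    """Merge two SST fields given on the atmospheric grid.

    Grid points inside the interpolated (coupled) area take the value
    interpolated from the ocean model; all other ocean points keep the
    external forcing value that the uncoupled system would use.
    """
    merged = []
    for mask_row, ocean_row, forcing_row in zip(
        inside_coupled_area, sst_from_ocean, sst_from_forcing
    ):
        merged.append(
            [o if inside else f
             for inside, o, f in zip(mask_row, ocean_row, forcing_row)]
        )
    return merged


# Toy 2x3 grid (values in kelvin, invented for the example).
mask = [[True, True, False], [False, True, False]]
ocean = [[285.0, 286.0, 0.0], [0.0, 287.0, 0.0]]
forcing = [[284.0, 284.5, 283.0], [282.0, 286.5, 281.0]]
sst = fill_sst(mask, ocean, forcing)
```

In this toy case, the three points flagged as coupled carry the ocean-model values and the remaining points carry the forcing values, so the atmosphere always sees a complete SST field.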
In addition, OASIS3-MCT offers a performance analysis tool, the LUCIA tool (Maisonnave and Caubel, 2014), that measures how much time each system component spends on its own calculations (incl. send and receive operations, as well as the time needed for interpolation of fields) and how much time it waits for information coming from the other components. This tool allows the computing resources and the scaling of each model of the coupled system to be optimized. We used the LUCIA tool to find an optimal distribution of the available number of cores, bearing in mind that the model with the highest number of grid points in our system is NEMO-NORDIC (8.6 times more grid points than CCLM). Figure 2 shows an example of a configuration using 11 nodes with 36 CPUs each of MISTRAL, the High-Performance Computing system for Earth system research (HLRE3) at the German High-Performance Computing Centre for Climate and Earth System Research, Germany. We assigned three nodes to CCLM, seven to NEMO-NORDIC and one to NEMO-MED. Therefore, we assigned 3.6 times more compute resources to the coupled system than to the non-coupled system. In this way, only NEMO-MED had to wait for the exchange with the other models, while the other two models required similar times for their calculations. Figure 2 refers to the time used for send/receive operations and interpolation. To obtain a broader picture of the costs due to the coupling, we calculated how long it took to run just one day (saving the same list of CCLM variables) under two alternatives: (a) assuming that the number of compute resources was fixed and (b) assuming that more compute resources could be used for the coupling. In the first case, both coupled and uncoupled simulations ran on 11 nodes (CCLM ran on 3 nodes in the coupled system). In this case, the coupled simulation ran in around five minutes, whereas the uncoupled simulation ran in around one minute.
In the second case, CCLM ran on 3 nodes for both coupled and uncoupled simulations. In this case, it took around two minutes to run the uncoupled system. Therefore, for this example, the coupled system was around 5 times slower given the same number of available nodes, but around 2.5 times slower when more resources were used.
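The resource figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the grid sizes, node counts and approximate timings given in the text. One reading that reproduces the quoted factor of 8.6 is to count horizontal grid points times vertical levels. This is an interpretation, not something the text states, and the variable names are hypothetical.

```python
# Cross-check of the quoted resource figures.
# Counting "grid points" as horizontal points x vertical levels is an assumption.

grids = {
    "CCLM": {"nx": 226, "ny": 232, "levels": 40},
    "NEMO-MED12": {"nx": 264, "ny": 567, "levels": 75},
    "NEMO-NORDIC": {"nx": 523, "ny": 619, "levels": 56},
}


def horizontal_points(grid):
    """Number of horizontal grid points."""
    return grid["nx"] * grid["ny"]


def total_points(grid):
    """Horizontal grid points times vertical levels."""
    return horizontal_points(grid) * grid["levels"]


ratio_2d = horizontal_points(grids["NEMO-NORDIC"]) / horizontal_points(grids["CCLM"])
ratio_3d = total_points(grids["NEMO-NORDIC"]) / total_points(grids["CCLM"])

# Node allocation of the example configuration: 3 + 7 + 1 = 11 nodes.
nodes = {"CCLM": 3, "NEMO-NORDIC": 7, "NEMO-MED": 1}
node_ratio = sum(nodes.values()) / nodes["CCLM"]

# Approximate wall-clock minutes per simulated day, as quoted in the text.
coupled_min = 5.0
uncoupled_11_nodes_min = 1.0
uncoupled_3_nodes_min = 2.0
slowdown_fixed_resources = coupled_min / uncoupled_11_nodes_min
slowdown_extra_resources = coupled_min / uncoupled_3_nodes_min

print(f"horizontal ratio: {ratio_2d:.2f}, with levels: {ratio_3d:.2f}")
print(f"node ratio: {node_ratio:.2f}")
print(f"slowdown: {slowdown_fixed_resources:.1f}x / {slowdown_extra_resources:.1f}x")
```

Under this reading, the horizontal-only ratio is about 6.2 and the ratio including vertical levels is about 8.6. The node ratio of 11/3 is about 3.7, close to the quoted 3.6.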
For the centennial simulation, we used 576 CPUs optimally distributed as follows: 24x13 CPUs were assigned to CCLM, 12x8 to NEMO-MED12 and 14x12 to NEMO-NORDIC. With this configuration, each simulated month required around one and a half hours in total, which implied 78 days to run the complete centennial simulation (110 years). Regarding the optimization of the computational performance of a coupled system, Will et al. (2017) show that the coupling method plays a greater role than the computing architecture or the individual model components.
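The CPU decomposition and the run-time estimate above can be verified with a short calculation. This is a sketch under the stated figures only. The helper names are hypothetical, and the per-month cost is treated as constant, which is a simplification.

```python
# Sanity check of the CPU decomposition and the wall-clock estimate.
# Helper names are illustrative; a constant cost per simulated month is assumed.

decomposition = {
    "CCLM": (24, 13),
    "NEMO-MED12": (12, 8),
    "NEMO-NORDIC": (14, 12),
}

cpus = {model: nx * ny for model, (nx, ny) in decomposition.items()}
total_cpus = sum(cpus.values())


def wallclock_days(years, hours_per_month):
    """Wall-clock days needed for `years` simulated years."""
    return years * 12 * hours_per_month / 24


def implied_hours_per_month(days, years):
    """Hours per simulated month implied by a total wall-clock time in days."""
    return days * 24 / (years * 12)


print(cpus, total_cpus)
print(wallclock_days(110, 1.5))
print(round(implied_hours_per_month(78, 110), 2))
```

With exactly 1.5 hours per month, 110 years would take 82.5 days. The quoted 78 days corresponds to roughly 1.42 hours per month, which is consistent with "around one and a half hours".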

Methods and reference data
Our aim was to test whether the coupled system runs stably over the whole century and whether the coupled simulation, which includes the hydrosphere component represented by the Mediterranean Sea and the North and Baltic Seas, not only improves on the global MPI-ESM-LM simulations but also performs at least as well as the atmosphere-only model.

Methods
The stability of the coupled atmosphere-ocean RCSM was tested with a spatio-temporal analysis of a centennial simulation. The analysis consisted of a study of the time series evolution, annual cycles, spatio-temporal density distributions and spatial patterns of three variables of interest: the sea surface temperature, the 2m air temperature and the total precipitation. Results were compared to the same analysis of an atmosphere-only (CCLM) simulation over the same period, run within the national research project on climate prediction MiKlip ("Mittelfristige Klimaprognosen", Marotzke et al., 2016). The time series analysis helped us to detect any bias or drift of the atmosphere-ocean simulation compared to the atmosphere-only simulation, and the spatial analysis helped us to detect whether the system behaves differently depending on the area of interest.
Regarding the quality of the coupled model, we compared our simulation with different reference datasets (see next section for more details). Rather than making a point-by-point comparison with the reference data, we wanted to know whether the system represents the value distributions of the reference well. For this purpose, we compared the density distributions and box-plots of our system with those obtained from observational datasets. We analysed the marginal seas separately, also distinguishing among seasons. Regarding the land, different relevant areas have been used in the literature for regional climate studies over Europe. For example, within the European project PRUDENCE (Prediction of Regional scenarios and Uncertainties for Defining EuropeaN Climate change Risks and Effects; Christensen, 2005) eight regions were defined: British Isles, Iberian Peninsula, France, Mid-Europe, Scandinavia, Alps, Mediterranean and Eastern Europe. Since we aim to improve our understanding of the regional climate in Germany, we show results for the Mid-Europe PRUDENCE region.
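A distribution-based comparison of the kind described above can be illustrated with the five numbers behind a box-plot. The following is a minimal sketch using Python's standard `statistics.quantiles`. The function names and the sample values are invented for illustration and are not the data or the software used in this study.

```python
# Minimal sketch of a box-plot style comparison between a model sample
# and a reference sample. All values are synthetic.
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) of a sample."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


def summary_difference(model, reference):
    """Model minus reference for each of the five summary statistics."""
    return tuple(
        m - r
        for m, r in zip(five_number_summary(model), five_number_summary(reference))
    )


reference_sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
model_sample = [0, 1, 2, 3, 4, 5, 6, 7, 8]  # same shape, shifted by -1

ref_summary = five_number_summary(reference_sample)
shift = summary_difference(model_sample, reference_sample)
```

A uniform shift of all five statistics, as in this toy case, indicates a bias with an otherwise similar distribution shape, whereas differences confined to the minimum or maximum point to the tails.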
We are also interested in high-impact phenomena: heavy precipitation, dry spells and heat waves. The joint CCl/CLIVAR/JCOMM Expert Team (ET) on Climate Change Detection and Indices (ETCCDI) suggested a list of 27 core climate change indices based on daily temperature values and daily precipitation amounts (Karl et al., 1999; Zhang et al., 2011) [...] Climate Change Canada, that, in addition to the computation of the indices, also provides simple quality control of the daily input data. We also analysed how the distributions of these indices were represented compared to the distribution of the indices obtained with the observed dataset.
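To make the kind of index concrete, three of the ETCCDI core indices can be written in a few lines: Rx1day (maximum 1-day precipitation), CDD (maximum number of consecutive days with precipitation below 1 mm) and SU (number of days with daily maximum temperature above 25°C). This is a simplified sketch on invented daily values. It is not the software used in this study and omits quality control and the handling of missing data.

```python
# Simplified sketch of three ETCCDI indices on synthetic daily data.
# No quality control or missing-value handling is included.

def rx1day(daily_precip_mm):
    """Rx1day: maximum 1-day precipitation amount (mm)."""
    return max(daily_precip_mm)


def cdd(daily_precip_mm, wet_day_threshold_mm=1.0):
    """CDD: longest run of consecutive days with precipitation < 1 mm."""
    longest = run = 0
    for amount in daily_precip_mm:
        run = run + 1 if amount < wet_day_threshold_mm else 0
        longest = max(longest, run)
    return longest


def summer_days(daily_tmax_degc, threshold_degc=25.0):
    """SU: number of days with daily maximum temperature > 25 degC."""
    return sum(1 for t in daily_tmax_degc if t > threshold_degc)


# Invented sample values.
precip = [0.0, 0.2, 5.0, 0.0, 0.0, 0.9, 12.3, 1.0]
tmax = [24.9, 25.0, 25.1, 30.0]

print(rx1day(precip), cdd(precip), summer_days(tmax))
```

Applied year by year to model and observed series, such indices yield the distributions that are compared in the evaluation.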

Reference data
Two centennial reference datasets were available: the gridded Climatic Research Unit (CRU) observation Time-series (TS) produced at the University of East Anglia for the period January 1901 - December 2016 (Harris et al., 2014), which consists of monthly data on high-resolution (0.5°x0.5°) grids. In this work, version 4.01 data (CRU, 2017) is used. Our simulations had a higher resolution (0.22°x0.22°), so the necessary upscaling prevents us from validating the high-resolution information available in the model when comparing to CRU. However, there is no higher-resolution gridded dataset covering the complete century. If we wanted to compare model data with gridded observations of similar spatial resolution, we would have to consider shorter periods. For example, the gridded E-OBS dataset (Haylock et al., 2008) is available at a spatial resolution of 0.22°. However, it covers only half of our period of interest (from 1950 onwards). It is worth noting that none of these observational datasets is perfect, and that they also differ from each other. For example, Fig. 3 shows a comparison of the monthly mean in January 1995, when a flood event happened over Germany. The figure illustrates the information about the event that is lost through upscaling compared to the 0.22° resolution. This can penalize our system when comparing it with CRU, especially for the first half of the century, for which E-OBS data are not available. Nevertheless, this does not affect the inter-comparison of the coupled and atmosphere-only models.
To compare our coupled data with centennial observations of higher quality, we considered historical daily station observations. The Climate Data Center (CDC) of the German national weather service "Deutscher Wetterdienst" (DWD) provides free access to quality-controlled observations of DWD climate stations (DWD-CDC, 2017).
We took nine stations with less than 15% of missing values, covering the complete period, and well distributed over Germany. Figure 1 shows two of these stations, with no data gap, and located at two different altitudes and distances to the seas: Potsdam (circle, altitude: 81m) and Hohenpeißenberg (triangle, altitude: 977m). For the sake of brevity, this work presents a comparison of the RCSMs only for these two stations, but similar conclusions were reached with the other seven stations.
Over the ocean, unfortunately, there is no high-resolution observed data set for the complete period. Hence, we used the Hadley Centre Sea Ice and Sea Surface Temperature data set (HadISST; Rayner et al., 2003) [...]. We used these data even though they only cover a few decades, because the sea surface temperature of the CCLM atmosphere-only simulation is not independent from the HadISST observations (they were used to obtain the MPI-ESM driving simulation). Besides the observations, we also compared the coupled system over the marginal seas with a multi-model ensemble consisting of the first member ('r1i1p1') of eight CMIP5 models.

Evaluation results
We based our analyses on three variables: the sea surface temperature, the 2m air temperature and the total precipitation. The behaviour over ocean and land is presented separately. We name the atmosphere-only model (CCLM) uncoupled and the atmosphere-ocean model (CCLM-NEMO) coupled.

Sea surface temperature
We analysed the temporal evolution of the SST over the marginal seas (Mediterranean and Baltic/North Seas) to see if there is any drift or evolving bias in the coupled system over the ocean (Figure 4). The SST of the coupled version is the SST simulated by NEMO, whereas the SST of the uncoupled version comes from the global system MPI-ESM. The long-term SST time series of our coupled system shows a stable system, although the annual mean SST values are colder than the observations (HadISST) and also colder than in the uncoupled system (from the global system) in both basins. The global MPI-ESM-LM simulation is not independent from the HadISST observations. Therefore, the NOAA OISSTv2 was additionally included in the comparison for the last three decades, for which it is available. Despite the cold bias, the regional coupled system follows the evolution of the observed SST values. In the Mediterranean, it even matches the ensemble mean of the CMIP5 global simulations, and in the Baltic, it is within the spread of this ensemble. Therefore, the SST values from the coupled system are of at least as good quality as the values of an average global model, with the advantage of a higher resolution, which improves the model results especially in the land/sea transition zone. An improvement over the averaged global model was expected, since global circulation models contain ocean models that are not well suited to shelf seas like the Baltic and North Seas.
Figure 5 shows a good representation of the annual cycle of SST for both basins. In the Mediterranean, the coupled system is colder in winter and warmer in summer than the observations and the global simulation. In the Baltic and North Sea basin, the coupled system is colder throughout the year.
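One simple way to quantify a possible drift is the linear trend of the annual mean difference between the coupled SST and a reference. The sketch below fits an ordinary least-squares slope to a synthetic series. It is an illustrative assumption about how such a check could be done, not the analysis performed in this study, and the values are invented.

```python
# Ordinary least-squares slope of an annual series, used here as a simple
# drift indicator. The series is synthetic; the approach is an assumption.

def linear_trend(years, values):
    """Least-squares slope of `values` against `years` (units per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


years = [1901, 1902, 1903, 1904]
coupled_minus_reference = [-0.8, -0.8, -0.8, -0.8]  # constant cold bias, no drift
warming_series = [10.0, 10.5, 11.0, 11.5]           # 0.5 degC per year

drift = linear_trend(years, coupled_minus_reference)
trend = linear_trend(years, warming_series)
```

A slope close to zero in the difference series, as in the first toy case, corresponds to a constant bias without drift, whereas a non-zero slope would indicate a bias that evolves over time.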
The density histograms of the three marginal seas summarize both the spatial and the temporal distribution of the SST values (Figure 6). Since the Baltic and North Sea have different climatologies in winter due to the presence of ice (the Baltic Sea is colder and less salty than the North Sea), we have analysed their histograms separately. In the Mediterranean Sea, the distributions of model data and observations have a similar shape, namely a double maximum representing summer and winter temperatures. Comparing the modelled and the observed histograms, both the coupled and uncoupled models capture the main aspects of the SST distribution. The regional coupled simulation has a wider distribution than the observations, made up of a well-fitting upper range and a shift towards cooler temperatures in the lower range. In comparison with the uncoupled version, the coupled system represents the upper extremes better but has a colder bias in the lower tail. The distribution shape of the uncoupled dataset is very similar to that of the observations, which was expected since the forcing SSTs are constrained by the 20CR reanalysis and thus by the observed SSTs. Still, the uncoupled model has a cold bias in both tails.
The SST of the coupled version in the Baltic and North Seas comes from the ocean model NEMO-NORDIC, which includes the sea ice and the freezing/melting processes via a sea ice model. This leads to an improvement in the lower tail of the distribution over the Baltic, compared to the uncoupled system, which provides much colder temperatures. Both systems have a cold bias in the upper tail. Regarding the North Sea, the coupled model shows a colder bias compared to the uncoupled model.
The spatial distribution of the regional coupled system's SST bias shows that the modelled seas, as previously seen, are cooler than observations in all seasons (Figure 7, spring and autumn are not shown). Nevertheless, during winter the basin of the Baltic Sea has a warm bias and during summer the Mediterranean Sea has a gradient in the bias field from south to north.
Explaining the SST bias of the coupled system compared to the uncoupled SST is not straightforward and is outside the scope of this study. Many factors may have an impact on the coupled SST (internal dynamics of NEMO, salinity changes, initialization of the ocean, deeper mixing layer depth, etc.). Nonetheless, given that the coupled system was not retuned, the results of the transient RCSM simulation are promising.

2m air temperature
This section analyses how the coupled system propagates the interactive SST information into the atmosphere and over land, in particular the impact on 2m air temperatures. Figure 8 shows the differences in the monthly mean 2m air temperature between the coupled and uncoupled systems, averaged over winter (a) and summer (b) for the period 1901-2009. The plots show differences of up to almost 2.4°C. The coupled system gives colder temperatures over the Mediterranean Sea during winter and warmer temperatures during summer, with the exception of the French coast and the north-east coast (regions influenced by cold wind systems like the Mistral and the Meltemi). Regarding the Baltic and North Seas, the summer and winter difference patterns are similar. The coupled model provides colder temperatures over the North Sea and western parts of the Baltic Sea, and warmer temperatures over the northern and eastern parts of the Baltic Sea. The boxplots (c) represent the distribution over the marginal seas separately (Baltic-North and Mediterranean Seas), as well as over the land, for each season, over the whole period. The plot shows that the spread of the differences over the Baltic-North is similar throughout the year, and the coupled system is mainly colder. The highest spread of the differences occurs in the Mediterranean in summer, where the coupled version is warmer. In winter the spread is smaller and the coupled system is colder. Regarding the land, in summer the median is about zero and the spread is very small, showing essentially no difference between the systems. However, there are some outliers where the coupled system shows higher temperatures (with up to 2°C difference). In winter the differences are slightly more noticeable, and the main outliers are negative, showing that the coupled simulation allows for colder temperatures. Figure 9 shows the temporal evolution of the annual 2m air temperature averaged over the marginal seas.
The figure shows that both, the coupled and uncoupled systems, represent a similar positive trend and strongly intercorrelated time series. To better understand how the 2m air temperature of the coupled system responds to changes in the SST, Kelemen et al. (2019)  10 ran a few sensitivity experiments using perturbed SST in the uncoupled system. They showed positively oriented impact of SST disturbance on 2m air temperatures. Figure 10 represents the winter and the summer precipitation differences between the coupled and the uncoupled system. The largest differences are in the Eastern Mediterranean in the winter season, with large areas with 20 to 50 mm/month less 15 precipitation in the coupled than in the uncoupled simulation. In summer, however, the coupled system gives more precipitation in most Mediterranean areas. Regarding the North Sea, the uncoupled simulation gives in general more precipitation than the coupled. The differences in the Baltic are smaller, being slightly more appreciable in summer than in winter. The precipitation differences over the seas are in concordance with the differences of 2m air temperature. Boxplots (Fig. 10c) show the monthly difference distributions over the marginal seas and land separately. The spread in the Baltic-North is higher in summer, with 20 more outliers, whereas in the Mediterranean is in winter (with generally small monthly precipitation amounts in summer), with more negative outliers (higher precipitation for the uncoupled system). The differences over land are smaller in general with large outliers in winter. The latter emerged near the Mediterranean coast, where the coupled system is drier, and in the Alpine region, where the coupled system is wetter.

Total precipitation
To better understand how the total precipitation of the coupled system responds to changes in the SST, Kelemen et al. (2019) carried out sensitivity studies, which showed a stronger response in total precipitation than in the 2m air temperature. They also showed an added value of the coupled system in the seasonal precipitation sums during winter over the eastern part of the domain.

Model-Observations comparison
This section studies the performance of the coupled system over Europe. For this purpose, we compared the model data (coupled and uncoupled systems) with the observed CRU dataset. Data from the coupled and uncoupled systems were interpolated to the 0.5°x0.5° CRU grid, and only those grid points defined in all three datasets were considered. Figure 11 shows the errors of the 2m air temperature of the coupled and uncoupled systems with respect to the CRU observations for winter and summer, the 2m air temperature distributions, and the distributions of the 2m air temperature errors. In winter, there is no clear positive or negative bias (Fig. 11a-b). In summer, however, the coupled and uncoupled systems are colder than the CRU observations, apart from the Alpine region and the south-east area, where both systems are warmer (Fig. 11c-d). Boxplots show a good representation of the observed distribution (Fig. 11e). In winter the distributions are similar; the main differences appear in summer, where the systems show slightly colder values in the low temperature range. The differences with respect to CRU are also similarly distributed for the coupled and uncoupled systems (Fig. 11f). As mentioned in Sec. 4.2, the 2m air temperature differences between the coupled and uncoupled systems are below 2.5°C; the boxplots show, however, that the differences with respect to CRU are much higher, up to 10°C in winter. In summer, more than 75% of the 2m air temperature values given by the systems are colder than the observations. In winter there is no clear bias, and the boxplots are centred around zero. Nevertheless, there are more extreme high values (a longer upper tail in winter, showing higher temperatures for the systems than for the observations). We are interested in the impact that the coupling may have on the 2m air temperature performance over Europe, in particular over the PRUDENCE region named Mid-Europe.
Figure 12 shows boxplots of the annual cycle of the monthly 2m air temperature averaged over Mid-Europe for the 20th century (a) and the distributions of the differences between the model values and the CRU observations (b). Both systems show distributions similar to the CRU data in winter, but colder distributions in summer. The differences are centred around zero in winter and below zero in summer; that is, on average winter is better represented by the regional systems than summer. However, the spread is much larger in winter than in summer; that is, when the regional systems do differ from the CRU observations, the differences are larger in winter than in summer. The spread of the distributions (box height) over Mid-Europe is smaller than over the whole domain (Fig. 11e), since temperatures within Mid-Europe do not differ as much as they do between the southern and northern parts of the whole domain. In addition, the extreme cases (points) are milder than over the whole domain, since summers are not as hot as in southern regions like the Iberian Peninsula or North Africa, and winters are not as cold as in northern regions. Fig. 12 also highlights winter outliers. For example, the coupled system gives a better estimate of the coldest January temperatures over Mid-Europe, a result that was not apparent for the whole domain. However, the boxplots do not give detailed information about the individual 2m air temperature values. To analyse the differences in the tails of the distributions in more detail, Fig. 13 shows the density histograms of the 2m air temperature of the coupled and uncoupled systems compared to the CRU data over Mid-Europe. Blue bars represent the uncoupled system, red bars the coupled system, white bars the CRU observations, and purple the intersection.
As shown in the left tail of the winter distribution, the coupled system estimates the values around -5°C well, although it overestimates those around -8°C. Overall, both systems show a good fit in winter but a cold bias in summer.
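The comparison set-up described at the start of this section (keeping only grid points defined in all three datasets, then examining the distribution of model-minus-CRU differences) can be sketched as below. This is only an illustration under assumptions: the fields are taken to be already interpolated to the common CRU grid, missing values are assumed to be stored as NaN, and the helper names `common_valid_mask` and `bias_summary` are hypothetical.

```python
import numpy as np


def common_valid_mask(*fields):
    """Boolean (lat, lon) mask, True where every field is finite at all time steps."""
    mask = np.ones(fields[0].shape[1:], dtype=bool)
    for field in fields:
        mask &= np.isfinite(field).all(axis=0)
    return mask


def bias_summary(model, obs, mask):
    """Quartiles of (model - obs) over the masked grid points and all time steps."""
    diff = (model - obs)[:, mask]
    q25, q50, q75 = np.percentile(diff, [25, 50, 75])
    return {
        "q25": float(q25),
        "median": float(q50),
        "q75": float(q75),
        "frac_colder": float((diff < 0).mean()),  # share of values colder than obs
    }


# Synthetic example on a 2x3 grid with four time steps.
base = np.arange(24, dtype=float).reshape(4, 2, 3)
cru = base.copy()
cru[:, 0, 0] = np.nan          # e.g. a sea point without observations
coupled = base - 1.0           # uniformly 1 degree colder than the observations
uncoupled = base + 0.5
uncoupled[:, 1, 2] = np.nan    # a point missing in one model dataset

mask = common_valid_mask(cru, coupled, uncoupled)
coupled_bias = bias_summary(coupled, cru, mask)
uncoupled_bias = bias_summary(uncoupled, cru, mask)
```

The `frac_colder` entry corresponds to statements such as "more than 75% of the values are colder than the observations"; the quartiles correspond to the box edges of the boxplots.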

Extreme events
Coupling the marginal seas has been shown to be useful for simulating regional climate over Europe and for studying extreme events, like Vb events during the period 1979-2014 (Akhtar et al., 2019). In this section we focus on the representation of climate indices during the whole 20th century. The monthly CRU data cannot be used to analyse extreme events like heat/cold waves or dry/wet spells. Instead, observed station data provided by the German Weather Service (DWD-CDC, 2017) were used in this study. We computed core climate change indices suggested by the ETCCDI over the 20th century for the coupled and uncoupled systems, as well as for long-term series of station data located in Germany. We chose indices that have an impact on human lives. For example, Fig. 14 shows the temporal evolution of four climate change indices related to extreme temperatures: the annual minimum temperature TNn, the annual maximum temperature TXx, information on warm spells TX90p (defined as the percentage of days when the maximum temperature is above the calendar-day 90th percentile centred on a 5-day window for the base period), and information on cold spells TN10p (defined as the percentage of days when the minimum temperature is below the calendar-day 10th percentile centred on a 5-day window for the base period). To compute the 10th and 90th percentiles for each calendar day, a bootstrap procedure was used to avoid possible inhomogeneity across the in-base and out-of-base periods (Zhang et al., 2005). Linear trends of the indices are also given. Figure 14 shows the stable evolution of the indices in the coupled version, the capturing of the trends, and the improvement over the uncoupled version for the TNn and TXx indices, especially for the higher station.
The coupled system captures the increase in temperature during the century, as well as the increase in the percentage of days with maximum temperature above the 90th percentile, and the percentage of days with minimum temperature below the 10th percentile. Figure 15 compares the distributions of these indices for the coupled and uncoupled systems against the observations, based on the quantiles. The diagonal represents the perfect case, assuming that the observations are perfect. The closer to the diagonal, the better the simulated statistics of the considered extremes. Lines parallel to the diagonal indicate distributions similar to the observed one (e.g. TNn, TXx), whereas lines that are not parallel indicate differences in the spread and tails (e.g. the upper tail of the uncoupled TN10p in Potsdam and of the coupled TN10p in Hohenpeißenberg). The panels show that the coupled system corrects the overestimation of minimum temperatures in the uncoupled system, as well as the underestimation of maximum temperatures. Therefore, the coupling has a positive impact with respect to extreme temperatures. Regarding the percentage of days above the 90th percentile and below the 10th percentile, the coupled version fits the observed distribution similarly to the uncoupled version, but improves the extreme quantiles.
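The percentile-based indices defined above (TX90p and TN10p) can be illustrated with a simplified sketch. It is deliberately not the procedure used in this study: the bootstrap of Zhang et al. (2005) is omitted, a 365-day calendar without leap days is assumed, the 5-day window wraps around within the same year, and the base period (here the first 30 years of a synthetic series) and all function names are illustrative assumptions.

```python
import numpy as np


def calendar_day_percentile(base, q, window=5):
    """Percentile q for each calendar day, pooling a centred window over all base years.

    base: array of shape (n_base_years, 365) with daily values (leap days removed).
    Simplification: the window wraps around within the same year.
    """
    half = window // 2
    n_days = base.shape[1]
    thresholds = np.empty(n_days)
    for day in range(n_days):
        idx = np.arange(day - half, day + half + 1) % n_days
        thresholds[day] = np.percentile(base[:, idx], q)
    return thresholds


def percent_days_above(daily, thresholds):
    """TX90p-like index: per year, percentage of days strictly above the thresholds."""
    return 100.0 * (daily > thresholds).mean(axis=1)


def percent_days_below(daily, thresholds):
    """TN10p-like index: per year, percentage of days strictly below the thresholds."""
    return 100.0 * (daily < thresholds).mean(axis=1)


# Synthetic daily maximum temperature: seasonal cycle plus noise, 40 years.
rng = np.random.default_rng(42)
n_years, n_days = 40, 365
seasonal = 15.0 + 10.0 * np.sin(2.0 * np.pi * (np.arange(n_days) - 110) / n_days)
tmax = seasonal + rng.normal(0.0, 3.0, size=(n_years, n_days))

thr90 = calendar_day_percentile(tmax[:30], 90)   # assumed base period: first 30 years
tx90p = percent_days_above(tmax, thr90)

# Linear trend of the index in percentage points per year.
years = np.arange(n_years, dtype=float)
trend_per_year = np.polyfit(years - years.mean(), tx90p, 1)[0]
```

Without the bootstrap, the in-base years are by construction close to 10% exceedance, which is the inhomogeneity between in-base and out-of-base periods that the procedure of Zhang et al. (2005) is designed to avoid.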
Regarding precipitation, we focused on the following annual indices (Fig. 16): total precipitation PRCPTOT, total precipitation R95p on days when the daily precipitation (RR) is above the 95th percentile of precipitation on wet days in the base period, maximum length of dry spell CDD (maximum number of consecutive days with RR < 1mm), and maximum length of wet spell CWD (maximum number of consecutive days with RR ≥ 1mm). For the precipitation indices, the uncoupled system proved in general more skilful. The coupled version overestimated the precipitation in Potsdam, but underestimated it in Hohenpeißenberg. Nevertheless, the coupled system shows a stable evolution. Figure 17 shows the quantile-quantile (q-q) plots of the precipitation indices. The distributions of the uncoupled simulation perform better than those of the coupled simulation. Note that the simulated data were not bias corrected and that the evaluation is subject to large uncertainties because of observational uncertainties and the point-to-area comparison. In this example, the lines are not as parallel to the diagonal as in Fig. 15, showing total precipitation distributions that are wider than the observed one in Potsdam but narrower than the observed one in Hohenpeißenberg. Since all time series (model simulations and observations) have the same length, observed and model percentiles represent the same number of cases. Each point of the q-q plot represents 10% of the total number of cases. Hence, we can also analyse and directly compare the frequency of total precipitation events above a particular intensity. Let us focus on the first plot (PRCPTOT in Potsdam). The number of points above a horizontal line drawn at the intensity of interest indicates the frequency of cases above this threshold given by the model.
The number of points to the right of a vertical line drawn at the intensity of interest indicates the frequency of observed cases above that threshold. The vertical and horizontal lines in the plot correspond to a threshold of 700 mm/year in Potsdam. The coupled model always gives total precipitation above this threshold, and the uncoupled model does so in 80% of the cases (8 points are above the horizontal line), whereas this was observed in only 20% of the cases (only two points of the lines are to the right of the vertical line).
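The point-counting argument above can be reproduced numerically. The sketch below assumes that the q-q plot uses ten quantile points placed at the mid-points of each 10% bin (the exact probabilities used for Fig. 17 are not stated), and it uses synthetic annual totals constructed to mimic the Potsdam example rather than the actual data. The function names are hypothetical.

```python
import numpy as np


def qq_points(obs, model, n_points=10):
    """Paired quantiles of two equally long series.

    Each point stands for 1/n_points of the cases. Placing the probabilities at
    the bin mid-points (0.05, 0.15, ...) is an assumption of this sketch.
    """
    probs = (np.arange(n_points) + 0.5) / n_points
    return np.quantile(obs, probs), np.quantile(model, probs)


def exceedance_fraction(quantiles, threshold):
    """Fraction of q-q points lying above the threshold on one axis."""
    return float((quantiles > threshold).mean())


# Synthetic annual precipitation totals (mm/year), 100 values per series.
observed = np.concatenate([np.linspace(450.0, 690.0, 80), np.linspace(710.0, 800.0, 20)])
uncoupled = np.concatenate([np.linspace(600.0, 690.0, 20), np.linspace(705.0, 850.0, 80)])
coupled = np.linspace(720.0, 900.0, 100)

threshold = 700.0  # mm/year, as in the Potsdam example

obs_q, unc_q = qq_points(observed, uncoupled)
_, cpl_q = qq_points(observed, coupled)

obs_freq = exceedance_fraction(obs_q, threshold)   # points right of the vertical line
unc_freq = exceedance_fraction(unc_q, threshold)   # points above the horizontal line
cpl_freq = exceedance_fraction(cpl_q, threshold)
```

With these synthetic series, two of the ten observed quantiles, eight of the uncoupled quantiles and all ten coupled quantiles exceed the threshold, mirroring the 20%, 80% and 100% frequencies read from the plot.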

Summary and Conclusions
To better understand how the Earth climate system evolves at local to regional scales, it is necessary to gain a better understanding of the interactions among the different components of the system. This work presents an atmosphere-ocean coupled regional climate system model (RCSM) over Europe including three marginal seas: the Mediterranean, the Baltic, and the North Seas. The coupled system was tested by evaluating a centennial simulation. The coupling was made through the OASIS3-MCT coupler. For the lateral and top boundary conditions, the regional atmosphere was forced by the global Earth System Model MPI-ESM, whose ocean was nudged to a simulation of the MPI-ESM ocean-ice component forced with the NOAA 20th Century Reanalysis (20CRv2).
Our aim was to determine whether the atmosphere-ocean coupled RCSM runs stably over one hundred years and what the costs and benefits of coupling the marginal seas are. We first analysed the computing costs (in terms of resources and time consumed) due to the coupling, showing that, compared to an atmosphere-only version (only CCLM), 3.6 times more resources are required to run the same period, or that the coupled version is 5 times slower when using the same amount of resources. To test the stability of the system during the 20th century, we analysed three variables: sea surface temperature (SST), 2m air temperature and precipitation. Results show that the system runs stably over the whole century, with neither drift nor evolving bias.
Finally, we evaluated the performance of the coupled RCSM compared to a centennial simulation of the atmosphere-only version, as well as to observations (CRU data and DWD station observations). We cannot conclude that one system is better than the other, since the results depend on the variable, area, and season of interest, as explained below. This study includes a spatiotemporal analysis of the sea surface temperature (SST) of the coupled system (provided by the NEMO ocean model) over the Mediterranean, North and Baltic Seas, as well as a comparison with the SST of the atmosphere-only version (prescribed with the MPI-ESM SST) and with SST observations (HadISST and, in the case of the Mediterranean Sea, also OISSTv2). Results show a stable and realistic evolution of the SST over the century, with a cold bias compared to observations, but performing similarly to the ensemble mean of a global coupled atmosphere-ocean ensemble system. This means that the coupled system provides SSTs at a higher resolution with the added value of preserving the spatial and temporal dynamics (the ensemble mean is not a realization of the system, but an average). The SST annual cycle is well represented, with a generally larger amplitude than in the uncoupled system. In winter, the coupled system shows a cold bias in most of the Mediterranean Sea and in the North Sea, and a warm bias in the Baltic Sea and the western part of the Mediterranean.
In summer, it shows mostly a cold bias in the three marginal seas, except in the southern part of the Mediterranean Sea, which shows a positive bias. It is not straightforward to isolate the causes of the SST biases, since many factors may affect the SST in the coupled system (e.g. internal dynamics of the ocean, mixed-layer depth, ocean initialization, etc.). Such an analysis is planned for future studies. Nevertheless, given that the oceans in the coupled simulation are not constrained to SST observations as in the uncoupled simulation, the results shown in this manuscript are very promising.
Regarding the 2m air temperature, the biases of the coupled RCSM over sea follow the SST biases. Over land, the differences are smaller on average, but larger near the coast. Compared to the 2m air temperature of the CRU data, the coupled and atmosphere-only systems show a negative bias in the summer months and a better representation in winter.
However, even though these errors are in general smaller in winter, the most extreme errors also occur in winter. Hence, the spread of the differences is higher in winter than in summer. A comparison of the 2m air temperature annual cycle and the spatiotemporal density distributions within the 20th century over the PRUDENCE area Mid-Europe is included to show that this behaviour occurs during the whole period.
Regarding the total precipitation, the same pattern as for the 2m air temperature is found: the higher the 2m air temperature, the more precipitation the systems give. Thus, the coupled system provides less precipitation over the Mediterranean Sea than the uncoupled system during winter, and more during summer. In the Baltic and North Seas, the coupled system gives in general more precipitation than the uncoupled system during both seasons. Over land, the differences are smaller than over sea, apart from those near the Mediterranean coastline in winter, where the coupled system is drier than the uncoupled one, and in the Alpine region, where the coupled system is wetter than the uncoupled one.
Special focus was given to the analysis of extreme events. Since this analysis requires a higher temporal resolution than the monthly values provided by the CRU data, DWD station observations with daily resolution were used. The evolution of some climate change indices was presented and discussed, showing that over Germany the coupled system is stable and improves the values of the climate change indices related to extreme temperatures compared to the uncoupled version. However, the precipitation extremes at the studied German stations were better represented in the uncoupled system.
To conclude, the centennial atmosphere-ocean coupled simulation presented in this work provides valuable information about the local climate in Europe. Having such a long time series from a stable atmosphere-ocean coupled system, whose spatial resolution is higher than that of the global models, helps us to improve our knowledge of local phenomena, especially of extreme events that have longer return periods. It has been shown that coupling the ocean improves the representation of heat and cold waves at some German stations. Our centennial run can also be used to investigate the interactions among different variables on a regional scale and to learn more about the atmospheric drivers that lead to extreme events. In addition, bearing in mind that the E-OBS dataset only covers the period from 1950 onwards and that there is a lack of observations during the first half of the century, these downscaled data might help us to learn more about this period, as well as to improve our knowledge about the advantages and deficiencies of our decadal predictions over Europe. Finally, the RCSM investigated here can be used to improve our knowledge about future climate change in Europe, e.g. to simulate decadal predictions or climate projection ensembles.

[Figure panel: TN10P in Potsdam. Axes: observed quantiles and system quantiles, % days Tmin < 10th percentile.]

Figure 16: Temporal evolution of four climate change indices related to precipitation at two German stations: Potsdam (81 m) and Hohenpeißenberg (977 m). PRCPTOT represents the annual total precipitation on wet days, R95p is the annual total PRCP when the daily precipitation (RR) is above the 95th percentile of precipitation on wet days in the 1961-1990 period, CDD is the maximum