Climate model configurations of the ECMWF Integrated Forecasting System (ECMWF-IFS cycle 43r1) for HighResMIP

This paper presents atmosphere-only and coupled climate model configurations of the European Centre for Medium-Range Weather Forecasts Integrated Forecasting System (ECMWF-IFS) for different combinations of ocean and atmosphere resolution. These configurations are used to perform multi-decadal ensemble experiments following the protocols of the High Resolution Model Intercomparison Project (HighResMIP) and phase 6 of the Coupled Model Intercomparison Project (CMIP6). These experiments are used to evaluate the sensitivity of major biases in the atmosphere, ocean, and cryosphere to changes in atmosphere and ocean resolution. All configurations successfully reproduce the observed long-term trends in global mean surface temperature. Furthermore, following an adjustment to account for “drift” in the subsurface ocean, coupled configurations of ECMWFIFS realistically reproduce observation-based estimates of ocean heat content change since 1950. Climatological surface biases in ECMWF-IFS are relatively insensitive to an increase in atmospheric resolution from ∼ 50 to ∼ 25 km. However, increasing the horizontal resolution of the atmosphere while maintaining the same vertical resolution enhances the magnitude of a cold bias in the lower stratosphere. In coupled configurations, there is a strong sensitivity to an increase in ocean model resolution from 1 to 0.25. However, this sensitivity to ocean resolution takes many years to fully manifest and is less apparent in the first year of integration. This result has implications for the ECMWF coupled model development strategy that typically relies on the analysis of biases in short (< 1 year) ensemble (re)forecast data sets. The impacts of increased ocean resolution are particularly evident in the North Atlantic and Arctic, where they are associated with an improved Atlantic meridional overturning circulation, increased meridional ocean heat transport, and more realistic sea-ice cover. In the tropical Pacific, increased ocean resolution is associated with improvements to the magnitude and asymmetry of El Niño–Southern Oscillation (ENSO) variability and better representation of non-linear sea surface temperature (SST)–radiation feedbacks during warm events. However, increased ocean model resolution also increases the magnitude of a warm bias in the Southern Ocean. Finally, there is tentative evidence that both ocean coupling and increased atmospheric resolution can improve teleconnections between tropical Pacific rainfall and geopotential height anomalies in the North Atlantic.


Introduction
The European Centre for Medium-Range Weather Forecasts (ECMWF) uses a global general circulation model known as the Integrated Forecasting System (IFS) to produce probabilistic ensemble forecasts at lead times of several days to 1 year ahead.Since the implementation of cycle 43r1 in November 2016, the operational version of the IFS used to make medium-range ensemble forecasts has included dynamic representations of the atmosphere, ocean, sea ice, and ocean waves (Buizza et al., 2017).
weather prediction (Rodwell and Palmer, 2007).Such errors are typically studied by examination of biases in (re)forecasts at different operational lead times, scrutiny of the analysis increments available from data assimilation systems, and evaluation of forecast reliability (Rodwell and Palmer, 2007;Palmer et al., 2008).
In order to identify errors associated with "slow" processes, it is necessary to run longer climate integrations.Such experiments can provide complementary information to that available from short-range forecasts and may provide additional constraints on the representation of physical processes that are important in forecast mode.For example, multi-decadal experiments enable the investigation of asymmetries and timescales of coupled ocean-atmosphere phenomena such as the El Niño-Southern Oscillation (ENSO) (see Sect. 4.1), which provides context for the interpretation of biases in seasonal forecasts.In addition, free-running climate integrations are an important tool for understanding the behaviour and deficiencies of multi-decadal reanalyses, particularly when they cover periods and/or regions with limited observational constraints (e.g.Hersbach et al., 2015;Poli et al., 2016).
The longest experiments performed routinely at ECMWF are 13-month integrations to assess the climatology of each new IFS cycle.However, there are efforts in the wider scientific community to evaluate the behaviour of the ECMWF model on longer timescales.In particular, the EC-Earth consortium (Hazeleger et al., 2010) developed and maintains a global climate model configuration that shares many components with the IFS.However, the latest version of the EC-Earth model has been developed starting from a previous IFS cycle that was operational several years ago, due to the different timescales involved in the preparation of new operational IFS cycles (∼ 6 months) and new EC-Earth configurations intended for climate applications (∼ 5 years).The operational and EC-Earth versions of the IFS also differ in their coupling strategy, tuning of model components, ocean and sea-ice model versions, and the treatment of ocean waves.Because of these distinctions, it is not always straightforward for lessons learned in EC-Earth experiments to be directly applied to the most recent operational IFS cycles.
In this paper, we describe climate model configurations of the IFS developed under the auspices of the European Research Council Horizon 2020 PRIMAVERA project (PRI-MAVERA website, 2017) that are built upon IFS cycle 43r1 and follow the protocols defined by the High Resolution Model Intercomparison Project (HighResMIP; Haarsma et al., 2016) and phase 6 of the Coupled Model Intercomparison Project (CMIP6; Eyring et al., 2016).Previous studies have shown that increases in ocean and atmospheric model resolution can affect many aspects of the climate system, including ocean and atmospheric biases (Roberts et al., 2009;Gent et al., 2010;Small et al., 2014), ENSO variability (Shaffrey et al., 2009;Delworth et al., 2012), ocean fronts and western boundary currents (Kirtman et al., 2012;Chas-signet and Xu, 2017), the representation of tropical cyclones (McClean et al., 2011), the nature of air-sea interaction at the ocean mesoscale (Bryan et al., 2010), and the global ocean heat budget (Griffies et al., 2015).Here, we present results from multi-decadal coupled and atmosphere-only experiments and provide an initial assessment of the impact of resolution and coupling in climate integrations that are traceable to a recent version of the ECMWF weather forecast model.
The rest of this paper is organized as follows: Sect. 2 describes the model configurations with a focus on the differences compared to the standard operational configuration of IFS cycle 43r1.Section 3 presents an initial assessment the major biases in the atmosphere, ocean, and cryosphere, and includes some comments on the impacts of coupling and resolution.Section 4 is focussed on the representation of ENSO variability and its associated asymmetries, non-linear oceanatmosphere feedbacks, and global teleconnections.Discussion and conclusions are presented in Sect. 5.

Model configuration
The physical submodels of the ECMWF forecast model (atmosphere, sea ice, ocean, land surface, ocean waves) and the infrastructure required to perform data assimilation and ensemble forecasts are collectively known as the IFS.Full documentation for the operational version of IFS cycle 43r1 is available online (ECMWF website, 2017).This section provides a brief summary of the main model components and details any differences relative to the operational configuration of IFS cycle 43r1.Following the nomenclature adopted within the PRIMAVERA project, we refer to the configuration presented here as ECMWF-IFS.Low-and high-resolution configurations are referred to as ECMWF-IFS-LR and ECMWF-IFS-HR, respectively, and experiments forced with observed sea surface temperature (SST) and sea-ice concentrations are identified with the suffix "A" (e.g.ECMWF-IFS-LRA).We also present selected results from a "mixed resolution" configuration that we term ECMWF-IFS-MR, which consists of the atmospheric model from ECMWF-IFS-LR coupled to the ocean model from ECMWF-IFS-HR.The computational cost of ECMWF-IFS is summarized in Table A1.
All configurations have 91 vertical levels in the atmosphere and use the cubic octahedral reduced Gaussian (Tco) grid such that the shortest wavelengths in spectral space are represented by four model grid points.ECMWF-IFS-LR uses the Tco199 grid, which has a grid-point resolution of ∼ 50 km, and an 1800 s time step.ECMWF-IFS-HR uses the Tco399 grid, which has a grid-point resolution of ∼ 25 km, and an 1200 s time step.The comparatively long time steps used in the IFS are enabled by the unconditionally stable semi-Lagrangian advection coupled with semiimplicit time stepping.Reducing the time step from 1200 to 900 s in the Tco399 configuration had a minimal impact on model biases and forecast skill on seasonal timescales (not shown).All computations for physical parameterizations are calculated on the atmospheric reduced Gaussian grid.Note that although we use the term "atmosphere-only" to identify model versions that are not coupled to a dynamic ocean, such configurations still retain two-way coupling with the landsurface and ocean-wave models.

Land-surface model
The land-surface component of IFS is the Hydrology Tiled ECMWF Scheme of Surface Exchanges over Land (HTES-SEL; Balsamo et al., 2009), with an updated snow scheme (Dutra et al., 2010) and carbon module (CHTESSEL; Boussetta et al., 2013b).Each model grid box is divided into nine "tiles": two vegetated fractions (high and low vegetation without snow), one bare soil fraction, three snow/ice fractions (snow on bare ground/low vegetation, high vegetation with snow beneath, and sea ice, respectively), and three water fractions (interception reservoir, ocean, lakes).Grid box surface fluxes are calculated separately for the different subgrid surface fractions leading to a separate solution of the surface energy balance equation and skin temperature for each of these tiles.
Vegetation types are derived from the Global Land Cover Characteristics data set (GLCC; Loveland et al., 2000), which provides a 1 km resolution classification of 20 landsurface types based on the Biosphere-Atmosphere Transfer Scheme (BATS; Dickinson et al., 1993).For each model grid box, the fractional coverage and vegetation model parameters are then set according to the dominant vegetation type within the GLCC data set.Seasonal variation in vegetation is included by prescribing a monthly leaf area index climatology based on MODIS satellite data (Boussetta et al., 2013a).The snow scheme is an energy-and mass-balance model that represents an additional layer on top of the upper soil layer, with independent prognostic thermal and mass contents (Dutra et al., 2010).

Wave model
The ocean-wave model is an evolved version of WAM (Komen et al., 1994) and solves a wave energy balance equation to determine the evolution of a two-dimensional wave spectrum that provides a statistical description of ocean waves over different frequencies and propagation directions (Janssen, 2004).The ocean-wave model interacts with the atmosphere and oceans through exchange of the Charnock parameter that determines the sea surface roughness (Janssen, 2004) and by introducing a sea-state-dependent modification of the turbulent kinetic energy flux used for ocean mixing (Janssen et al., 2013;Breivik et al., 2015).The ocean-wave model is run on an irregular latitude-longitude grid with a resolution of approximately 1 • in ECMWF-IFS-LR and 0.5 • in ECMWF-IFS-HR.

Ocean and sea-ice models
The ocean component of ECMWF-IFS is based on version 3.4 of the Nucleus for European Modelling of the Ocean (NEMO), which consists of a hydrostatic, finite-difference, primitive equation general circulation model (Madec, 2008) coupled to version 2 of the Louvain-la-Neuve Sea-Ice Model (LIM2; Bouillon et al., 2009;Fichefet and Maqueda, 1997).NEMO v3.4 uses an energy and enstrophy conserving vector form of momentum advection and has a linear free surface with semi-implicit time stepping for the hydrostatic pressure gradient solver.The vertical discretization is a z coordinate with 75 levels and partial cell thicknesses at the sea floor.The vertical mixing of tracers and momentum is parameterized using the turbulent kinetic energy (TKE) closure scheme (Gaspar et al., 1990).
ECMWF-IFS-HR uses the ECMWF operational configuration of NEMO, which is based on the "ORCA025" eddypermitting tripolar grid with a nominal resolution of ∼ 0.25 • .The NEMO configuration used in ECMWF-IFS-LR is not used operationally at ECMWF and is based on the "ORCA1" tripolar grid, which has a nominal horizontal resolution of ∼ 1 • and meridional refinement to ∼ 0.   and Mcwilliams (1990) parameterization yes no Isoneutral tracer diffusivity (rn_aht_0) 1000.0 m 2 s −1 300.0 m 2 s −1 Eddy-induced velocity coefficient (rn_aeiv_0) 1000.0 m 2 s −1 n/a NEMO with differences limited to resolution-dependent parameterizations (Table 1).Of particular interest is the Gent and Mcwilliams (1990) parameterization for the effect of mesoscale eddies, which is switched on in the ECMWF-IFS-LR configuration but switched off in ECMWF-IFS-HR.The LIM2 sea-ice model shares a horizontal grid with NEMO and is comprised of a three-layer thermodynamic model for the vertical conduction of heat (Fichefet and Maqueda, 1997) and a two-category (solid ice and leads) representation of dynamics based on the approach of Hibler (1979).Ice motion is driven by the ocean currents and surface wind stress, and we adopt the viscous-plastic rheology used in the operational implementation of LIM2 at ECMWF.For the computation of freshwater exchanges, sea ice is assumed to have a constant reference salinity of 6 psu.

Coupling
Energy, mass, momentum, and turbulent kinetic energy fluxes are exchanged between submodels with a coupling frequency of 1 h in all configurations.Coupling is achieved using the sequential single-executable strategy described by Mogensen et al. (2012).One coupled interaction that is not represented, due to its limited importance at operational timescales, is the link between precipitation over land and runoff into the ocean.To mitigate the impact of this missing process in multi-decadal simulations, a climatological estimate of riverine input is applied over coastal grid points, and the globally integrated freshwater input to the ocean is set to zero at each time step to prevent unwanted drifts in ocean salinity.
An important difference between the experiments described here and the operational configuration of IFS cycle 43r1 is the treatment of sea-ice coupling.In operational forecasts, prognostic sea-ice concentrations from LIM2 are coupled with the atmosphere model, but sea-ice skin temperatures are calculated using the IFS land-surface thermodynamics combined with an assumed ice thickness and an observationally derived sea-ice albedo climatology.Early experiments with a prototype version of ECMWF-IFS suffered from extreme biases in Arctic sea-ice volume.To alleviate, but unfortunately not resolve, these biases, we deviate from the operational configuration and couple skin temperatures from LIM2 and disable the IFS land-surface thermodynamics over sea ice.The IFS sea-ice albedo climatology was retained as coupling the prognostic albedo from LIM2 exacerbated existing biases.This result can be explained in part by the absence of melt pond processes in LIM2, which causes prognostic albedo values to be much higher than observed during the summer.

External climate forcings
The external radiative forcings used for transient historical experiments follow CMIP6 recommendations (CMIP6 website, 2017).Greenhouse gas concentrations are specified using the data set described in Meinshausen et al. (2017).Concentrations of CO 2 , CH 4 , N 2 O, and CFC 12 are explicitly prescribed, and the combined effect of other greenhouse gas species is included as an effective concentration of CFC 11 .Time-varying historical ozone concentrations are specified using monthly three-dimensional ozone distributions from the International Global Atmospheric Chemistry/Stratosphere-troposphere Processes And their Role in Climate Chemistry-Climate Model Initiative ozone database (IGAC/SPARC CCMI ozone data for CMIP6, 2017).Solar forcing is applied as annual means of total solar irradiance derived from the CMIP6 solar forcing described in Matthes et al. (2017).
Tropospheric aerosol forcing is specified using version 2 of the Max Planck Institute Aerosol Climatology Simple Plume model (MACv2-SP Stevens et al., 2017), which was implemented within the IFS radiation scheme specifically for this configuration.MACv2-SP enables the prescription of the anthropogenic aerosol optical properties and associated Twomey effect, and consists of nine spatial plumes related to the major anthropogenic sources.The amplitudes of each plume are scaled through time to provide regionspecific variations in tropospheric aerosol through the historical period.
Volcanic forcing is specified as a total stratospheric aerosol optical depth (SAOD) that varies with time and latitude.SAOD is derived from an offline vertical integration of extinction coefficients at 550 nm from a simplified version of the CMIP6 stratospheric aerosol data set v2 (Stratospheric aerosol data for CMIP6, 2017).Extinction coefficients below the observed climatological tropopause (we use the "dynamic" tropopause of Wilcox et al., 2012) do not contribute to estimates of SAOD.
In atmosphere-only experiments, daily mean SSTs and sea-ice concentrations are specified using a 0.25 × 0.25 • version of the observation-based HadISST2 data set (Rayner et al., 2016;Titchner and Rayner, 2014).The climate forcing from changes in land use is not prescribed.

Tuning
Approaches to climate model "tuning" vary depending on the intended scientific purpose of the model and core mission of the centre(s) responsible for model development (Hourdin et al., 2017;Schmidt et al., 2017).The aim of ECMWF-IFS was not to produce the best possible model for climate applications, but to provide a configuration that was traceable to a recent operational forecast model and also sufficiently stable to run multi-decadal experiments in coupled mode.Under this constraint, tuning was limited to several cases where leaving the model unchanged would have introduced intolerable drifts in the climate system.
The principal tuning target was global surface energy balance in atmosphere-only simulations over the period 2005-2013.Simulations were considered adequate if the global average heat flux into the ocean (relative to Earth's surface area) was within the observed range of 0.2-1.0W m −2 (Wild et al., 2013).It was found that net surface energy balance was strongly sensitive to atmospheric resolution such that the net surface heating in ECMWF-IFS-HRA was about 1 W m −2 less than in ECMWF-IFS-LRA.To account for this difference, it was necessary to tune the cloud properties in ECMWF-IFS-HR by reducing the autoconversion threshold for liquid precipitation over the ocean.The only other adjustment to the atmospheric model relative to the operational implementation of cycle 43r1 was to modify the non-orographic gravity wave drag in all configurations to improve the representation of the quasi-biennial oscillation.This change was made following recommendations from the ECMWF seasonal forecast team and will be implemented operationally in future IFS cycles.
Only one element of the system was tuned in coupled mode.Early prototype experiments suffered from a severe and monotonic trend in Arctic sea-ice volume that was not addressed by modifications to the air-ice-sea coupling.To stabilize this bias, it was necessary to scale the climatological sea-ice albedos used in calculations of sea-ice skin temperature by a constant factor of 0.95 across all spectral bands.This scaling was a pragmatic approach to partially mitigate the underlying errors in the downward radiation over the Arctic and the response of LIM2 sea-ice model.The correction is only applied to coupled integrations, and atmosphere-only integrations use the original IFS albedo climatology.

HighResMIP experiments
This section describes atmosphere-only and coupled experiments following the CMIP6 HighResMIP protocol (Haarsma et al., 2016) that were performed with ECMWF-IFS under the auspices of the PRIMAVERA project.A full list of experiments and their associated ECMWF run identifiers is provided in Table A2.

highresSST-present
The highresSST-present ensemble covers the period 1950-2014 and consists of six members from ECMWF-IFS-LRA and four members from ECMWF-IFS-HRA.These atmosphere-only integrations are forced with observed SSTs, observed sea-ice concentrations, and external radiative forcings as described in Sect.2.6.Atmosphere and land-surface models are initialized with conditions representative of 1 January 1950 using data from the ERA-20C reanalysis (Poli et al., 2016).There is no separate spin-up integration for the highresSST-present ensemble.This approach is acceptable for the majority of atmosphere and land-surface variables that adjust quickly (< 1 year) to the imposed forcings.One notable exception is the water content of the deepest soil layers (72-199 cm), which takes several years to reach a new equilibrium.Ensemble members are distinguished by different seeds to the SPPT scheme, where the seed determines the initial random two-dimensional fields that are used to perturb the parameterized model tendencies (Leutbecher et al., 2017).

spinup-1950
To provide initial conditions for the coupled experiments described below, 50-year spin-up integrations are performed with ECMWF-IFS-LR, ECMWF-IFS-MR, and ECMWF-IFS-HR using external forcings fixed at values from the year 1950.The ocean is initialized from rest using a temperature and salinity climatology representative of the 1950s derived from the EN4 objective analysis (Good et al., 2013).In regions with initial sea-surface temperatures below freezing, sea ice is initialized from rest with a uniform thickness of 3 m in the Arctic and 1 m in the Antarctic.The atmosphere and land-surface models are initialized in the same way as highresSST-present.The 50-year spin-up period follows the HighResMIP protocol and is sufficient for the near-surface ocean and sea ice to reach an approximate steady state.In contrast, the ocean interior takes many centuries to fully adjust and is still drifting at the end of the 50-year spin-up integration.

hist-1950
The coupled hist-1950 ensemble covers the period 1950-2014 and consists of six members from ECMWF-IFS-LR and four members from ECMWF-IFS-HR, and a single member from ECMWF-IFS-MR.Experiments are initialized from the end of spinup-1950, and time-varying external forcings are specified using the data sets described in Sect.2.6.As in highresSST-present, ensemble members are distinguished by different seeds to the atmospheric stochastic physics.Since some ocean properties have significant persistence on multi-annual timescales, members of hist-1950 are not completely independent for the first few years of the experiment.However, preliminary analysis of North Atlantic ocean heat content indicates that the ensemble diverges rapidly within the first few years with ensemble spread saturated after about 10-15 years.

control-1950
The control-1950 experiments consist of single-member 100year integrations initialized from the end of the associated spinup-1950 experiments with forcings fixed at 1950 values.These experiments run in parallel to hist-1950 and are useful to identify long-term trends that are unrelated to changes in radiative forcings.On shorter timescales, both coupled and atmosphere-only configurations accurately simulate the transient cooling over land associated with large volcanic eruptions, although the SST response to volcanic eruptions in ECMWF-IFS-LR and ECMWF-IFS-HR is generally too large compared to observations.Over the so-called "hiatus" period of the early 21st century, global SST anomalies in ECMWF-IFS-LR and ECMWF-IFS-HR increase faster than observed, particularly when compared to an earlier version of the HadISST data set (Rayner et al., 2003) rather than the HadISST2 forcing data set recommended by HighResMIP.The discrepancy in the rate of warming between models and observations during this period is also common in CMIP5 models and has been attributed to differences in the phase of observed and simulated modes of multi-decadal variability (e.g.Meehl et al., 2013;Roberts et al., 2015).

Planetary energy budget
Table 2 shows radiative and turbulent components of the global mean net surface heat flux (F sfc ) for the period 2000-2014 in coupled and atmosphere-only configurations.The individual components are well within observational uncertainty estimates (Wild et al., 2013) in all configurations.F sfc is in good agreement with observations in atmosphere-only simulations and ECMWF-IFS-HR, and slightly too high in ECMWF-IFS-LR.This offset in F sfc between experiments is a result of the time-evolving SST biases that develop in coupled integrations (see Sect. 3.3) combined with the short duration of the spin-up experiment (50 years) relative to the equilibration time of the ocean interior (hundreds of years).
Changes in global mean upper 700 m ocean temperature (T 0−700m ) from coupled experiments are shown in Fig. 1c against the range of estimates from the observation-based ORAS4 ensemble (Balmaseda et al., 2013).Consistent with the values of F sfc in Table 2, changes in T 0−700m are generally too large compared to observations, particularly in ECMWF-IFS-LR.The anomalously high rate of ocean heat uptake in the coupled experiments is associated with temperature "drift" in the ocean interior that occurs in response to the development of surface ocean biases.It is not practical to run high-resolution coupled experiments for the many hundreds of years required for the ocean interior to reach equilibrium with the imposed forcings.Instead, the impact of the time-evolving forcing on T 0−700m can be evaluated by expressing changes in hist-1950 with respect to changes in control-1950.Following this adjustment (Fig. 1c), estimates of T 0−700m in ECMWF-IFS-LR and ECMWF-IFS-HR are in good agreement with one another, despite the differences in long-term drift, and very close to the range of estimates from ORAS4.In addition, both adjusted and unadjusted estimates of T 0−700m contain realistic cooling signals associated with major volcanic eruptions that are also present in ORAS4.

Conservation in the atmosphere
Due to the limited heat capacity of the atmosphere, the net radiation flux at the top of the atmosphere (F toa ) should almost exactly balance F sfc on timescales of a year or more (Palmer and McNeall, 2014).Despite the favourable comparisons with various aspects of the global energy budget (Fig. 1a-c, Table 2), annual mean values of F sfc and F toa do not exactly balance one another in ECMWF-IFS, indicating a spurious source of energy in the atmosphere of approximately 1 W m −2 .The absence of closure for atmospheric energy budgets has been described previously for CMIP5 models (Hobbs et al., 2016) and an earlier version of the IFS (Hersbach et al., 2015).Provided such "energy leaks" are time-constant and independent of forcings, results can be interpreted in an anomaly framework without biasing results (Hobbs et al., 2016).
The magnitude of non-conservation in ECMWF-IFS is resolution dependent with values of about 1.2 and 1.0 W m −2 at Tco199 and Tco399, respectively.Further investigation revealed that the leading contribution to non-conservation was the non-conservation of moisture during the semi-Lagrangian advection.In additional test experiments, application of a mass-fixer algorithm that preserves the global integrated moisture mass before and after calls to the advection (Diamantakis and Flemming, 2014; Bermejo and Conde, 2002) significantly improved conservation of moisture and heat in the atmosphere and should be used in future climate experiments with ECMWF-IFS.Despite these issues, anomalies in F toa are sufficiently well correlated with anomalies in F sfc (correlations are 0.98 and 0.97 in ECMWF-IFS-LR and ECMWF-IFS-HR, respectively) that it is possible to interpret the resulting climate in a physically meaningful way.However, one important caveat is that it is necessary to consider F sfc rather than F toa when evaluating planetary energy balance.This is particularly important if tuning the energy balance in an atmosphere-only configuration for use in the coupled system as it is mostly through adjustments in SST, and hence changes in F sfc , that the climate system responds to an imposed forcing.

Near-surface temperature biases
The near-surface air temperature over land (T 2m ) in ECMWF-IFS is generally cooler than observed (Fig. 2), with all configurations exhibiting particularly severe cold biases in regions of high elevation and a localized warm bias in eastern Siberia.Increasing atmospheric resolution from 50 to 25 km in atmosphere-only experiments (Fig. 2a-b) has very little impact on T 2m biases.However, coupling to the NEMO ocean model increases the magnitude of the Northern Hemisphere cold bias (Fig. 2c-d) because of the introduction of a cold bias in North Atlantic SSTs that is particularly severe in ECMWF-IFS-LR (Fig. 3a, c).Despite performing better than ECMWF-IFS-LR in the Northern Hemisphere, ECMWF-IFS-HR suffers from an increased warm bias over the Australian continent due to a warm bias in the Southern Ocean that is intensified with the higher-resolution ocean (Fig. 3a, c).The origins of SST biases in coupled configurations are discussed in more detail in Sect.3.3.

Precipitation biases
Atmosphere-only configurations of ECMWF-IFS suffer from a characteristic pattern of excessive precipitation over the tropical oceans that is insensitive to changes in atmo-    spheric resolution (Fig. 4a-b).Although the zonal mean structure of tropical precipitation biases is similar in coupled and atmosphere-only configurations (Fig. 4e), the presence of spatially varying SST biases in the coupled system (Fig. 3a, c) results in several basin-specific differences in tropical precipitation relative to the atmosphere-only configurations.In particular, both coupled configurations suffer from an intensification of precipitation biases in the off-equatorial tropical Pacific and western tropical Indian Oceans and a reduction of precipitation over the equatorial Pacific.In the case of ECMWF-IFS-LR, an equatorial Pacific cold SST bias (Fig. 3a, c) results in a precipitation deficit.These differences are indicative of the impact of coupling on the large-scale circulation of the tropical atmosphere.In both ECMWF-IFS-LR and ECMWF-IFS-HR, the cold SST bias in the North Atlantic (Fig. 3a, c) drives a southward shift of the Atlantic Intertropical Convergence Zone (ITCZ) and reduced precipitation over the southern edge of west Africa.This bias is larger in ECMWF-IFS-LR reflecting the larger bias in North Atlantic SSTs.
The off-equatorial Pacific rain bands are a typical feature of coupled climate models and well known as the "double- ITCZ problem" (e.g.Li and Xie, 2014;Adam et al., 2017).The double-ITCZ bias in ECMWF-IFS-LR and ECMWF-IFS-HR is largely confined to the west Pacific and is generally an improvement on the previous generation of CMIP models where the southern rain band is much stronger (see Fig. 1a of Adam et al., 2016).Following Adam et al. (2016), we calculate the tropical precipitation asymmetry index A p , defined as where overbars denote the zonal average over the Pacific sector and subscripts indicate the area-weighted meridional average between the specified latitudes.The observation-based value of A p is 0.19 and can be compared to values of 0.11 in ECMWF-IFS-LR and 0.15 in ECMWF-IFS-HR, the latter of which is better than 26 of the 28 CMIP models compared in Adam et al. (2017).

Cloud radiative forcing biases
To further understand the biases in T 2m , we consider biases in the surface radiation budget that result from errors in cloud  radiative forcing (CRF; Figs.5-7).CRF is given by the following equation: where F is a downward radiative flux at the surface and "clear sky" indicates the radiative fluxes that would be received in the absence of clouds.Spatial variations in the sign of the total CRF bias largely reflect errors in short-wave CRF (Fig. 6) that are partially compensated by opposing errors in long-wave CRF (Fig. 7).Total CRF biases over the ocean are generally negative, except in the Southern Ocean and regions of upwelling.CRF biases over land have more small-scale structure due to variations in topography and land-surface characteristics.The large-scale spatial structure of CRF biases is set by the atmospheric model and is relatively in- Over land, there is no simple causal relationship between T 2m biases and total CRF biases (Fig. 5).For example, all model configurations exhibit a strong negative T 2m bias over the Tibetan Plateau that occurs despite positive biases in short-wave CRF and total CRF (Fig. 6).To better understand such discrepancies, we also consider the impact of biases in surface albedo on the net short-wave radiation at the surface (Figs. 8 and 9).Over the oceans, biases in surface albedo are a consequence of biases in sea-ice cover and/or its assumed albedo.For example, the loss of Antarctic sea ice in coupled configurations results in a strong reduction of surface albedo in the Southern Ocean (Fig. 8c-d) and an excess of absorbed short-wave radiation in this region (Fig. 9c-d).Over land, biases in surface albedo reflect biases in snow cover and/or inaccurate specification of the land-surface characteristics.In some regions over land (e.g.northern Africa, coastal Greenland, northern Siberia, and the Tibetan Plateau), the cold biases in T 2m can be explained by positive biases in surface albedo that result in insufficient net surface short-wave radiation.In other locations, near-surface cold biases are associated with positive biases in net surface short-wave radiation, suggesting a dynamic origin for such temperature errors.This effect is most pronounced for the Andes, which could be indicative of errors in the response of the atmospheric circulation to steep orography.

Upper-atmosphere biases
In ECMWF-IFS-LRA and ECMWF-IFS-HRA, lowertroposphere temperature biases (Fig. 10) are relatively small.In ECMWF-IFS-LR and ECMWF-IFS-HR, lowertroposphere temperature biases, and associated changes to the westerly winds, reflect SST biases in the North Atlantic and the Southern Ocean (Fig. 3a, c).All configurations suffer from a prominent cold bias in the lower stratosphere that is increased in magnitude at higher horizontal resolution.This bias is not yet fully understood but is thought to be a consequence of spurious mixing across the tropopause associated with small-scale variability that is intensified at horizontal higher resolutions with the cubic octahedral grid and improved by increased vertical resolution in the atmosphere (personal communication, Tim Stockdale).In the upper stratosphere, all configurations exhibit a warm bias that is maximal in the midlatitudes of the winter hemisphere (Fig. 10) and relatively insensitive to changes in atmospheric resolution.
Zonal mean westerly wind biases are dominated by errors in the position and/or intensity of jets in the upper troposphere and stratosphere (Fig. 11), and reflect the meridional structure of temperature biases (Fig. 10).The largest bias in westerly winds is common to all configurations and is located in the tropical stratosphere between 5 and 50 hPa.This bias has been identified in previous versions of the IFS and is known to be sensitive to the details of vertical diffusion in the atmosphere (Hersbach et al., 2015).In accordance with the thermal-wind relationship, the vertical shear and absolute magnitude of this bias is slightly higher in ECMWF-IFS-HR and ECMWF-IFS-HRA because of the larger-magnitude temperature bias in the lower stratosphere.In ECMWF-IFS-LR and ECMWF-IFS-HR, SST biases drive a northward shift of surface winds in the region 60-30 • S and an intensification of the Southern Hemisphere polar stratospheric vortex.This effect is most prominent in ECMWF-IFS-HR due to the larger-magnitude Southern Ocean warm bias.

Biases in the ocean and cryosphere
This section presents an overview of the climatological biases in the ocean and cryosphere in coupled configurations of ECMWF-IFS, including selected results from the "mixed resolution" configuration, ECMWF-IFS-MR, which combines the atmosphere from ECMWF-IFS-LR with the highresolution ocean from ECMWF-IFS-HR.A number of issues are common to all three coupled configurations, including negative SST biases in the North Atlantic, positive SST biases in the Southern Ocean, excessive Arctic sea ice, positive SSS biases in the Arctic, negative SSS biases in the tropical South Pacific associated with the double ITCZ, insufficient mixing around ∼ 50 • S in the Southern Ocean, and inadequate heat loss over the Gulf Stream extension (Fig. 3).It is also clear from Fig. 3 that differences in the mean state of the coupled system are dominated by the change in ocean resolution.Indeed, the ocean biases in ECMWF-IFS-MR and ECMWF-IFS-HR are nearly identical despite the doubling of atmospheric resolution from ∼ 50 to ∼ 25 km (Fig. 3).
With an obvious exception of the Southern Ocean, many biases show a clear improvement with increased horizontal resolution in the ocean.One of the most important contributors to these improvements is the better representation of meridional heat transport in the higher-resolution ocean model.Figure 12a-b show total meridional heat transport decomposed into contributions from the mean flow (i.e.V • T ) and transient "eddies" (i.e.V • T ), where overbars indicate a time mean and primes indicate deviations from a time mean.Note that the parameterized eddy-induced velocity is included in V for ECMWF-IFS-LR.From these plots, it is evident that increasing ocean model resolution can have a large impact on heat transport associated with both the mean flow and the transient eddies.For example, the equatorial Pacific and North Atlantic are two key locations where the timevarying "eddy" heat transport is better resolved in ECMWF-IFS-MR/-HR (Fig. 12a-b) leading to changes in the ocean heat budget and improved SSTs (Fig. 3a-c).
Some ocean biases are common to all three coupled configurations suggesting either a structural error that is present in both high-and low-resolution versions of NEMO or common errors in the forcing from the atmospheric model.For example, the negative mixed layer bias in the Southern Ocean is common to all three models despite rather different biases in SST and wind stress (not shown).This suggests a deficiency in the TKE scheme for vertical mixing in the ocean that is common to all three coupled models.In addition, all   coupled configurations exhibit positive SST biases along the western coastlines of the South American and African continents associated with a positive bias in short-wave CRF that originates in the atmospheric model (Fig. 6).The magnitude of these biases shows some sensitivity to ocean model resolution, which could be related to the representation of coastal upwelling that is known to be important for the development of SST biases in such regions (Small et al., 2015).
Given that the ECMWF coupled model is used principally for operational applications with lead times much shorter than 1 year, it is also interesting to consider the timescale dependence of ocean biases.In short (< 1 year) integrations, coupled model biases reflect the influence of "fast" processes combined with impact of the ocean initialization strategy.On decadal and longer timescales, coupled model biases include the impacts of "slower" coupled processes and are more representative of the asymptotic behaviour of the model.Some elements of the coupled system respond relatively quickly such that biases evident in the first year (Fig. 13) are remarkably similar to biases calculated using 50 years of data (Fig. 3).For instance, the main features of the mixed layer depth biases in the Southern Ocean are established within the first year of integration, highlighting the dominant role of "fast" dynamical processes.This rapid response can be contrasted with the more slowly developing SST and SSH biases in the same location, which take longer to develop because they represent a thermodynamic response with timescales governed by the heat capacity of the upper ocean.In the North Atlantic, despite the fundamental differences in the representation of the underlying ocean dynamics, it is difficult to discriminate between SST and surface heat flux biases in high-and low-resolution configurations using only a single year of data (Fig. 13).These comparisons emphasize that multi-decadal climate experiments can provide information for the coupled model development process that is complementary to that available from short integrations and (re)forecast data sets.

North Atlantic
The negative North Atlantic SST bias is present in all coupled configurations but is particularly severe in ECMWF-IFS-LR.This bias is a consequence of inadequate northward ocean heat transport (Fig. 12a) and a sluggish Atlantic meridional overturning circulation (AMOC; Fig. 12b).As noted above, Atlantic ocean heat transport is higher in ECMWF-IFS-MR and ECMWF-IFS-HR but still lower than estimates derived from hydrographic sections (Ganachaud and Wunsch, 2003).In contrast, the northward heat transport in the Indo-Pacific is similar in all coupled configurations and in reasonable agreement with observational estimates (Fig. 12b).
Comparison of simulated AMOC stream function profiles with observations from the RAPID-MOCHA array at 26 • N  (McCarthy et al., 2015) demonstrates that all configurations underestimate the maximum strength of the AMOC and the depth of the southward return flow, although ECMWF-IFS-MR/-HR performs substantially better than ECMWF-IFS-LR (Fig. 12c).Previous work has demonstrated that the agreement between observed and simulated stream functions at 26 • N can be improved by accounting for a sensitivity to the choice of reference level used in the geostrophic approximation employed in the RAPID-MOCHA observations (Roberts et al., 2013).However, even when this sensitivity is taken into account, all models exhibit an overly shallow AMOC (Fig. 12c).This common bias in the depth structure of the AMOC occurs in spite of contrasting biases in the depth of convection in the Labrador Sea (compare Fig. 3g-i) and is likely a consequence of issues in the representation of the density-driven Nordic Sea overflows that are known to affect z-coordinate ocean models (Danabasoglu et al., 2010).
Figure 12d shows variations in the strength of the AMOC at 26 • N against variations in overturning and horizontal gyre contributions to ocean heat transport (see Johns et al., 2011, for details of this decomposition).From this plot, it is clear that most of the differences in heat transport between coupled models and observations can be explained by differences in the strength of the overturning circulation.In addition, the model heat transport for the same AMOC strength is offset relative to the observations by about ∼ 0.2 PW.This is due to inaccuracies in the distribution of volume trans- port in temperature classes due to biases in the depth structure of the overturning and the near-surface temperature biases.ECMWF-IFS-LR also exhibits a spurious heat transport compensation such that increased heat transport by the overturning circulation is offset by a reduced heat transport by the gyre.This relationship is not evident in the RAPID-MOCHA observations or ECMWF-IFS-MR/-HR and is likely related to the absence of the Florida Straits, and its associated depthconfined flow, in the ORCA1 NEMO model.

Southern Ocean
The positive SST bias in the Southern Ocean (Fig. 3ac) is most evident in ECMWF-IFS-MR/-HR and is associ-ated with reduced Antarctic sea-ice cover (Fig. 14), a shallow bias in the depth of the ocean mixed layer (Fig. 3gi), and increased sea surface height (SSH) around Antarctica (Fig. 3m-o).In ECMWF-IFS-MR/-HR, these biases in the surface ocean are associated with a weak Antarctic Circumpolar Current (ACC).For example, mean volume transport through the Drake Passage is ∼ 60 Sverdrups (1 Sv = 10 6 m 3 s −1 ) in ECMWF-IFS-MR/-HR, compared to ∼ 115 Sv in ECMWF-IFS-LR and ∼ 130 Sv from observations (Cunningham et al., 2003;Whitworth III, 1983).This reduced transport through the Drake Passage in ECMWF-IFS-MR/-HR is not a simple geostrophic response to the reduced latitudinal gradients in SSH and ocean temperature.It  Ganachaud and Wunsch (2003).Total heat transport (solid lines) is decomposed into contributions from the mean flow (V • T ; dashed lines) and time-varying "eddies" (V • T ; dotted lines).(c) Stream functions of AMOC at 26.5 • N calculated using years 1-50 of spinup-1950 compared to observations from RAPID-MOCHA array for the period 2004-2015(McCarthy et al., 2015)).Dashed lines correspond to model profiles calculated using the RapidMoc Python tool (Roberts, 2017) with a geostrophic reference depth of 4750 m.(d) Strength of AMOC at 26.5 • N against total (circles), overturning (plus signs), and horizontal gyre (triangles) components of the meridional heat transport in spinup-1950.Observed heat transport components are from the RAPID-MOCHA array (Johns et al., 2011), and symbols correspond to annual means.
is instead related to the development of a westward flow of ∼ 50 Sv that occurs between 64 and 60 • S, which counters the expected eastward transport of ∼ 120 Sv that occurs between 60 and 56 • S.
One of the drivers of the Southern Ocean SST bias is a 10-20 W m −2 bias in short-wave cloud radiative forcing that is insensitive to resolution and present in both coupled and atmosphere-only configurations (Fig. 6).This bias in downward short-wave radiation at the surface is reinforced by a coupled sea-ice albedo feedback that is particularly strong in ECMWF-IFS-MR and ECMWF-IFS-HR (Figs. 8 and 9).This result is consistent with previous studies that have documented the tendency for atmospheric models to underestimate the albedo of clouds over the Southern Ocean (Bodas-Salcedo et al., 2012, 2014) and the subsequent impact on Southern Ocean SSTs when coupling to the NEMO ocean model (e.g.Williams et al., 2015).
The different responses in the Southern Ocean in ECMWF-IFS-LR and ECMWF-IFS-MR/-HR are likely a consequence of the different representations of the ocean mesoscale.Specifically, the balance between wind-driven and baroclinic-instability-driven circulations is parameterized in ECMWF-IFS-LR but not in ECMWF-IFS-MR/-HR.Although the ORCA025 version of the NEMO ocean model used in ECMWF-IFS-MR/-HR is considered "eddypermitting", it is far from fully resolving the energetic mesoscale eddy field that extends along the path of the ACC (Marshall and Speer, 2012).Recent work by Hewitt et al. (2016) found that comparable Southern Ocean biases in the HadGEM3 climate model were reduced when increasing the horizontal resolution of NEMO from 1/4 • to 1/12 • .

Sea-ice biases
Simulated and observation-based climatologies of sea-ice area and volume are shown in Fig. 14a-d.In the Northern Hemisphere, ECMWF-IFS-MR/-HR effectively captures the seasonal cycle of sea-ice area, though the extent is too high during the September minimum.In contrast, ECMWF-IFS-LR suffers from an excess of Northern Hemisphere sea ice in all seasons.During the maximum extent, the bias in ECMWF-IFS-LR extends over the Labrador Sea and parts of the western North Atlantic subpolar gyre leading to suppressed ocean convection and a positive feedback on the AMOC and North Atlantic SST biases.The increased transport of freshwater by sea ice also reduces the salinity of the North Atlantic subpolar gyre, which further contributes to the reduced AMOC in ECMWF-IFS-LR.
Northern Hemisphere sea ice is too thick in all three coupled configurations.The resulting biases in ice volume are most severe in ECMWF-IFS-LR (Fig. 14b) due to the combined effect of biases in extent and thickness.In part, these biases can be explained by excessive ice growth in response to negative biases in both long-wave and short-wave CRF over the Arctic (Figs. 6 and 7).
In the Southern Hemisphere, ECMWF-IFS-MR/-HR has too little sea ice during the September maximum and all three coupled configurations have too little sea ice during the February minimum.These deficiencies in Southern Hemisphere sea-ice extent are a consequence of the previously discussed biases in downward short-wave radiation and the resulting warm bias in Southern Ocean SSTs that is particularly extreme in ECMWF-IFS-MR/-HR.Fetterer et al., 2017).Arctic ice volumes are compared with estimates from the Pan-Arctic Ice Ocean Modeling and Assimilation System (PIOMAS; Schweiger et al., 2011;Zhang and Rothrock, 2003).

ENSO variability and teleconnections
ENSO is the leading mode of tropical ocean-atmosphere variability (McPhaden et al., 2006).Tropical SST anomalies associated with ENSO can also influence the extratropical atmosphere through their impact on tropical convection and patterns of upper-atmosphere convergence/divergence that act as a source of Rossby waves, which subsequently propagate and dissipate at higher latitudes (Hoskins and Karoly, 1981;Trenberth et al., 1998;Simmons et al., 1983;Held et al., 1989).Accordingly, models such as the IFS that are used to make seasonal forecasts need to accurately represent the ocean-atmosphere interactions that drive ENSO variability and the teleconnections that communicate its influence around the globe (Latif et al., 1998).The following section provides an assessment of ENSO variability in ECMWF-IFS and includes discussion of the important ocean-atmosphere feedbacks and global teleconnection patterns.

Variability of ENSO indices
Figure 15 shows times series of Niño3.4SST anomalies from HadISST2 compared with four members of hist-1950 from each of ECMWF-IFS-LR and ECMWF-IFS-HR.El Niño and La Niña events are highlighted in red and blue, respectively, along with the total number of events in the period 1950-2014.When all ensemble members and experiments are considered together, the variance of detrended monthly Niño3.4SST values is higher in ECMWF-IFS-HR (0.84 K 2 ) than in ECMWF-IFS-LR (0.68 K 2 ) and in better agreement with observed values (0.78 K 2 ).
Previous generations of coupled climate models have struggled to simulate the asymmetry in ENSO variability that is characterized by shorter, more intense El Niño events and longer, less intense La Niña events (Zhang and Sun, 2014).This asymmetry is particularly evident in the eastern Pacific where it manifests as a positive skewness in the Niño1+2 SST index (Burgers and Stephenson, 1999).ECMWF-IFS-HR and ECMWF-IFS-LR both correctly simulate the sign of ENSO asymmetry in the Niño1+2 SST index (Fig. 16a).The magnitude of the asymmetry is generally better in ECMWF-IFS-HR (skewness of 0.53) than in ECMWF-IFS-LR (skewness of 0.31) compared to observations (skewness of 1.46), though both models miss the most extreme positive SST events in the Niño1+2 region (Fig. 16a) To further characterize the periodicity and asymmetry of ENSO variability, we calculate the frequency of warm and cool events as a function of duration (Fig. 16b-c).Consistent with our conclusions from Fig. 16a, the distributions of warm and cool event durations are more symmetric in ECMWF-IFS-LR and ECMWF-IFS-HR than observed.For warm events, there is a bias towards too many short (< 6 months) events and not enough events lasting 18-24 months.For cool events, both coupled models underestimate the number of events lasting 36-42 months.We also see a small number of simulated events that last longer than any observed event (> 42 months), which could be an artefact of the limited length of the observational record.Given these biases, we might expect that seasonal forecasts performed with coupled versions of the IFS to have a tendency to underestimate the persistence of longer-lasting warm and cool events in the tropical Pacific.

ENSO ocean-atmosphere feedbacks
For models to accurately simulate the observed asymmetry of ENSO variability, they must adequately capture the non-linear aspects of ENSO dynamics and thermodynamics.However, previous work has demonstrated that many coupled climate models struggle to simulate non-linearities in SST-radiation feedbacks as diagnosed by the relationship between the Niño3.4SST index and top-of-atmosphere radiation fluxes over the eastern equatorial Pacific (Bellenger et al., 2014;Mayer et al., 2016).These non-linearities are evident from observations during warm events as compensating changes in outgoing long-wave radiation (OLR) and absorbed solar radiation (ASR) (Fig. 17a-b) associated with the eastward extension of positive SST anomalies and the onset of deep convection in the eastern Pacific.Previous studies have identified a wide range of model behaviours in this region, which can be related to the differences in the mean climate of the eastern equatorial Pacific and the associated SST and cloud biases (Bellenger et al., 2014;Mayer et al., 2016).
Atmosphere-only configurations of ECMWF-IFS (Fig. 17c-f) capture this non-linear behaviour, but the compensation between OLR and ASR components is only partial and the negative OLR response to warm events is larger than observed by satellite radiation data (black crosses in Fig. 17a) and more similar to the OLR response in the ERA-Interim reanalysis (grey crosses in Fig. 17a).The degree of non-linearity shows no sensitivity to the change in atmospheric horizontal resolution from 50 to 25 km.These differences indicate a deficiency in the atmospheric model response to observed SST variability, which is likely related to the radiative impact of convective clouds in the eastern equatorial Pacific.
In the coupled configurations (Fig. 17g-j), ECMWF-IFS-LR shows limited evidence for non-linear SSTradiation interactions, whereas the OLR and ASR responses in ECMWF-IFS-HR are more similar to those in the atmosphere-only configurations.Following the arguments of Bellenger et al. (2014) and Mayer et al. (2016), we conclude that the improved ASR and OLR response in ECMWF-IFS-HR is a consequence of the increased ocean horizontal resolution and improvements in the mean state of the equatorial Pacific (see Sect. 3.3).The better representation of nonlinear ocean-atmosphere feedbacks in this region may also partly explain the improved asymmetry of ENSO variability in ECMWF-IFS-HR compared to ECMWF-IFS-LR.

ENSO teleconnections
The local and remote SST response to variations in northern winter (DJF) SST anomalies in the Niño4 region are shown in Fig. 18.The DJF period is chosen because tropical forcing of the extratropics is generally strongest in the northern winter (Trenberth et al., 1998).Both ECMWF-IFS-LR and ECMWF-IFS-HR capture signals in the central and eastern tropical Pacific and in the Indian Ocean.However, neither model is able to capture the negative correlations north and south of the tropical Pacific that are present in the observations.In ECMWF-IFS-LR, positive correlations extend too far west into the west Pacific warm pool and there is a spuriously strong correlation with SSTs in the North Atlantic subpolar gyre, which is likely a consequence of the strong cold bias and excessive wintertime sea ice in this region in ECMWF-IFS-LR.
Figure 19 shows covariance maps for global rainfall anomalies associated with DJF rainfall over the Niño4 region for both coupled and atmosphere-only experiments.The atmosphere-only configurations successfully capture the pattern of rainfall anomalies over the tropical Pacific, tropical Atlantic, maritime continent, and equatorial South America.However, neither atmosphere-only configuration is able to capture the magnitude of the precipitation responses over the United States, subtropical North Atlantic, and southern half of South America.In addition, both configurations overestimate the magnitude of precipitation anomalies over the Indian Ocean, suggesting a common deficiency in the response of the Walker circulation to diabatic forcing in the Niño4 region.Precipitation responses in the coupled configurations behave similarly to the atmosphere-only counterparts.The exception is the clear degradation in the equatorial Pacific in ECMWF-IFS-LR that is related to the previously described issues with the double ITCZ and equatorial cold tongue bias.
To examine the Northern Hemisphere wintertime response to diabatic forcing in the tropical Pacific, we calculate covariances of DJF geopotential height anomaly at 500 hPa (Z500) with the normalized rainfall anomaly in the Niño4 region (Fig. 20).In the ERA-Interim reanalysis, increased precipitation in the central equatorial Pacific is generally associated with positive Z500 anomalies over Greenland and Canada, and negative Z500 anomalies over Siberia, the northeast Pacific, and the North Atlantic.This pattern projects onto the negative phase of the North Atlantic Oscillation (NAO) and the positive phase of the Pacific/North American Pattern (PNA), though these modes explain only part of the ENSO response (Straus and Shukla, 2002).
The different configurations of ECMWF-IFS show some similarities in the extratropical response to Niño4 precipitation variability, including successful simulation of the PNA component of the response and failure to capture the Siberian low.In contrast, there are clear differences between configurations in the North Atlantic/European sector.In this region, ECMWF-IFS-HR and ECMWF-IFS-HRA more accurately  simulate the structure of the Z500 response to Niño4 precipitation, although the magnitude of the response is still too weak.Only in the coupled version of ECMWF-IFS-HR do Z500 anomalies project onto the negative phase of the NAO.Taken at face value, these results suggest that increased atmospheric resolution and coupling with a dynamic ocean may improve the response of extratropical circulation to tropical forcing in the North Atlantic sector.However, recent work has emphasized the difficulty in evaluating ENSO teleconnection patterns due to the limited sample size in both ob-servational and model data sets (Deser et al., 2017).The atmospheric variability over the North Atlantic on subseasonal to decadal scales is also affected by teleconnections originating from the tropical Indian Ocean (Cassou, 2008;Molteni et al., 2015), and a preliminary assessment of these teleconnections in ECMWF-IFS-LR and ECMWF-IFS-HR suggests a less optimistic picture.A more robust attribution analysis is beyond the scope of this paper, and will require careful consideration of both observational uncertainties and the re-

Impacts of coupling
Coupling to the ocean allows the simulation of coupled ocean-atmosphere phenomena, such as ENSO, at the expense of biases in SST and sea-ice distribution, which are the interface for interactions between the ocean and atmosphere.The biases introduced by coupling ECMWF-IFS to the NEMO ocean model are resolution dependent (see Sect. 3.3) but share some general features including cooling of the Northern Hemisphere, warming of the Southern Ocean (Fig. 3), and development of a double-ITCZ bias in tropical precipitation (Fig. 4).In the atmosphere, the impacts of ocean coupling are generally constrained to the troposphere, with stratospheric biases nearly identical in coupled and atmosphere-only integrations (Figs.anomalies in the North Atlantic, but the significance of this change has not yet been assessed (Fig. 20).

Impact of atmospheric resolution
The climatological surface biases in ECMWF-IFS are relatively insensitive to an increase in atmospheric resolution from ∼ 50 to ∼ 25 km .The limited response to a change in atmospheric resolution is not unique to the ECMWF model.For example, Bacmeister et al. (2014) found that the mean climate of the Community Atmosphere Model did not dramatically improve when increasing atmospheric model resolution from ∼ 100 to ∼ 25 km.The most obvious impact of the increase in atmospheric resolution in ECMWF-IFS is an increase in the magnitude of a prominent cold bias in the lower stratosphere and an associated degradation of the equatorial jet (Figs. 10 and 11).Other processes that are affected by the change in atmospheric resolution include the conservation characteristics of the semi-Lagrangian advection scheme and the net planetary energy balance.Increased atmospheric resolution also seems to improve teleconnections between tropical Pacific rainfall and geopotential height anomalies in the North Atlantic, but further work is required to assess the robustness of this result (Fig. 20).A more thorough assessment of the impact of atmospheric resolution on variability and extremes, which we expect to be more sensitive than mean biases, will be undertaken as part of the PRI-MAVERA project.

Impact of ocean resolution
The asymptotic surface biases in coupled configurations of ECMWF-IFS are dominated by the choice of ocean model resolution (Fig. 3).However, the timescale required to detect the impact of ocean resolution depends on the process and region being considered (Fig. 13).The positive impacts of an increase in ocean resolution from ∼ 100 to ∼ 25 km are particularly evident in the North Atlantic and Arctic, and include an improved Atlantic meridional overturning circulation, increased meridional ocean heat transport, and more realistic sea-ice cover.In addition, increasing ocean resolution improves the mean state of the equatorial Pacific, leading to improvements to the magnitude and asymmetry of ENSO variability and better representation of non-linear SST-radiation feedbacks.The negative effects of increased ocean resolution include an amplification of a Southern Ocean warm bias, weakening of the Antarctic circumpolar current, and the dramatic decline of Antarctic sea ice.These effects are likely a consequence of the "eddy-permitting" rather than "eddy-resolving" nature of the ORCA025 ocean configuration in the Southern Ocean and the disabling of the Gent and Mcwilliams (1990) parameterization.
To give some context to the biases in ECMWF-IFS-LR, ECMWF-IFS-MR, and ECMWF-IFS-HR, we compare the range of zonal mean SST biases from 34 CMIP5 models (Fig. 21).It is clear from this comparison that ECMWF-IFS-LR, which has no operational analog, is compromised by the cold bias in the North Atlantic such that Northern Hemisphere SST biases are similar to those in the worst performing CMIP5 models.In contrast, zonal mean SST biases in ECMWF-IFS-MR and ECMWF-IFS-HR, which are close to the operational configuration of the ECMWF seasonal forecast model, are comparable to the best-performing CMIP5 models and very similar to the CMIP5 multi-model mean at latitudes north of 40 • S.

The value of multi-decadal integrations
The experiments performed for this study are atypical for ECMWF.However, multi-decadal climate runs are an important ingredient for a seamless approach to Earth system model development for weather and climate forecasting.For example, from the comparisons in Sect.3.3 and additional sensitivity experiments with the ECMWF seasonal forecast system (not shown), it is clear that the impact of ocean resolution is timescale dependent.This means that biases identified from initialized (re)forecasts are not always a reliable indicator of the asymptotic behaviour of the coupled model, and in fact may depend more on the quality of the initial conditions provided by reanalyses.Furthermore, the increased magnitude of biases in multi-decadal simulations enables the identification of systematic model errors in the presence of internal variability that would otherwise require large ensembles to detect in a short-range forecast.
The asymptotic behaviour of coupled models is also becoming more important to the numerical weather forecasting community with the development of coupled approaches to Earth system reanalysis.The climatological attractor of the coupled model is especially important in such reanalyses because of its influence as a background field for periods and/or regions with limited observational constraints.
The ECMWF-IFS configuration presented here will serve as a platform for further research and sensitivity experiments as part of the HighResMIP and PRIMAVERA projects.The scientific and technical developments required for this study will also have a legacy at ECMWF as multi-decadal coupled climate experiments become more important for the development of Earth system models used for weather forecasting and reanalysis.
Code and data availability.The model configurations described here are based on the ECMWF IFS and the NEMO/LIM oceansea ice model.The IFS source code is available subject to a license agreement with ECMWF.ECMWF member-state weather services and their approved partners will be granted access.The IFS code without modules for data assimilation is also available for educational and academic purposes as part of the OpenIFS project (https: //software.ecmwf.int/wiki/display/OIFS/OpenIFS+Home,last access: 30 August 2018 Observed long-term trends in global 2 m temperature over land(T 2m  ) and global sea surface temperature (SST) are well-reproduced in ECMWF-IFS (Fig.1a-b) with only minor differences between low-and high-resolution configurations.The global surface temperature response in ECMWF-IFS-LRA and ECMWF-IFS-HRA is tightly constrained by the prescription of observed SSTs.In contrast, ECMWF-IFS-LR and ECMWF-IFS-HR have no such constraint and no attempt was made to tune the model to reproduce historical changes in global surface temperature.The long-term trends in global surface temperature are thus an emergent property of the imposed radiative forcings and coupled model, and the agreement with observations gives us confidence in the utility of ECMWF-IFS for multi-decadal climate applications.

Figure 1 .
Figure 1.(a) Global mean 2 m temperature over land (T 2m ) from ensemble means of the hist-1950 and highresSST-present experiments and the CRUTEM4 observational data set (Jones et al., 2012; Osborn and Jones, 2014).(b) Global mean sea surface temperature (SST) from ensemble means of the hist-1950 experiment and the Hadley Centre Sea Ice and Sea Surface Temperature (HadISST) (Rayner et al., 2003) and HadISST2 observational data sets (Rayner et al., 2016; Titchner and Rayner, 2014).(c) Volume-averaged ocean temperature (0-700 m) change relative to the first year of data in individual ensemble members of the hist-1950 experiment and the ORAS4 ensemble of ocean reanalyses (Balmaseda et al., 2013).Dashed lines are estimates of temperature change in hist-1950 after adjusting for long-term drift in the ocean by calculating temperature change relative to the corresponding control-1950 experiments.In all plots, periods corresponding to the Mount Pinatubo (1991), El Chichón (1982), and Mount Agung (1963) volcanic eruptions are shaded in yellow.All data points represent annual mean values.

a
Includes a flux of approximately −0.9 W m −2 associated with the latent heat of fusion in snowfall.

Figure 2 .
Figure 2. Annual mean bias in 2 m temperature over land ( • C) in (a-b) highresSST-present and (c-d) hist-1950 relative to the Climate Research Unit time series 4.01 data set (CRU TS; Harris et al., 2014) for the period 1981-2010.

Figure 3 .
Figure 3. Climatological surface ocean biases calculated using years 1-50 of the spinup-1950 experiment.(a-c) Sea surface temperature (SST) and (d-f) sea surface salinity (SSS) biases relative to the EN4 climatology representative of the period 1950-1954 that was used to initialize the ocean (Good et al., 2013).(g-i) Annual mean mixed layer depth (MLD) biases relative to the Holte et al. (2017) Argo climatology.(j-l) Net surface heat flux biases (positive downwards) relative to the net heat flux derived from radiative fluxes at the top of the atmosphere combined with the mass-adjusted energy divergence in the ERA-Interim reanalysis(Liu et al., 2015(Liu et al., , 2017)).(m-o) Sea surface height (SSH) biases relative to satellite-derived estimates of absolute dynamic topography (AVISO, 2017).
sensitive to a change in atmospheric resolution from 50 to 25 km.This is consistent with the results ofMcClean et al. (2011), who found that increasing atmospheric resolution in the Community Climate System Model had little impact on the large-scale radiation biases.CRF biases in ECMWF-IFS are modified by ocean coupling in regions where cloud properties are sensitive to SST biases, either due to changes in the large-scale atmospheric circulation or because of cloud-SST-radiation feedbacks.For example, ocean coupling modifies the position and magnitude of the short-wave CRF bias in the equatorial oceans surrounding the maritime continent and increases the magnitude of the negative short-wave (and positive short-wave) CRF biases in the subtropical oceans.

Figure 5 .
Figure 5. Annual mean bias in total cloud radiative forcing (CRF) in (a-b) highresSST-present and (c-d) hist-1950 relative to data from Clouds and the Earth's Radiant Energy System Energy Balanced and Filled product (CERES-EBAF) surface fluxes edition 4.0 (Kato et al., 2013).

Figure 8 .Figure 9 .
Figure 8. Mean bias in short-wave surface albedo in (a-b) highresSST-present and (c-d) hist-1950 relative to estimates derived from CERES-EBAF surface fluxes edition 4.0(Kato et al., 2013).Surface albedo is defined to be α sfc = SW up /SW down , where SW down is the short-wave radiation at the surface and SW up is upward short-wave radiation at the surface.

Figure 10 .
Figure 10.Mean bias in zonal-mean temperature in (a-b) highresSST-present and (c-d) hist-1950 relative to ERA-Interim for the period 1981-2010 (shaded) and the climatological mean from ERA-Interim (contours).

Figure 11 .
Figure 11.Mean bias in zonal-mean wind in (a-b) highresSST-present and (c-d) hist-1950 relative to ERA-Interim for the period 1981-2010 (shaded) and the climatological mean from ERA-Interim (contours).

Figure 12 .
Figure 12. (a-b) Meridional ocean heat transport calculated using years 1-50 of the spinup-1950 experiment compared with the observational estimates fromGanachaud and Wunsch (2003).Total heat transport (solid lines) is decomposed into contributions from the mean flow (V • T ; dashed lines) and time-varying "eddies" (V • T ; dotted lines).(c) Stream functions of AMOC at 26.5 • N calculated using years 1-50 of spinup-1950 compared to observations from RAPID-MOCHA array for the period2004-2015(McCarthy et al., 2015)).Dashed lines correspond to model profiles calculated using the RapidMoc Python tool(Roberts, 2017) with a geostrophic reference depth of 4750 m.(d) Strength of AMOC at 26.5 • N against total (circles), overturning (plus signs), and horizontal gyre (triangles) components of the meridional heat transport in spinup-1950.Observed heat transport components are from the RAPID-MOCHA array(Johns et al., 2011), and symbols correspond to annual means.

Figure 13 .
Figure 13.As Fig. 3 but calculated for the first year of spinup-1950.

Figure 14 .
Figure 14.Monthly sea-ice area and volume climatologies for the period 1981-2010 from hist-1950 experiments.Sea-ice areas are compared to observational data from the National Snow and Ice Data Centre (NCIDC; Fetterer et al., 2017).Arctic ice volumes are compared with estimates from the Pan-Arctic Ice Ocean Modeling and Assimilation System (PIOMAS;Schweiger et al., 2011;Zhang and Rothrock, 2003).

Figure 16 .
Figure 16.(a) Histogram of 3-month running mean SST anomalies relative to 1981-2010 in the Niño1+2 region (90-80 • W and 10 • S-0).Frequency of (b) warm and (c) cool event durations determined from the time series plotted in Fig. 15 and additional ensemble members from hist-1950, control-1950, and spinup-1950.Error bars in panels (b) and (c) represent a bootstrap estimate of the standard error in frequency and are derived by resampling from the available events 10 000 times.

Figure 17 .
Figure 17.Scatter plots of SST anomalies from Niño3.4 region against anomalies of top-of-atmosphere outgoing long-wave radiation (OLR) and top-of-atmosphere absorbed solar radiation (ASR).Observation-based estimates are from a composite of satellite radiation products (black crosses; Mayer et al., 2016) and the ERA-Interim reanalysis (grey crosses; Dee et al., 2011).ECMWF-IFS data are for all available HighResMIP experiments and ensemble members.Data are monthly anomalies relative to the 1981-2010 seasonal cycle, and all values are smoothed using a 13-point binomial filter prior to plotting.Black crosses represent all monthly values and red crosses are December values, when ENSO-related anomalies are usually strongest.

Figure 18 .
Figure 18.Covariance of DJF SST anomalies with the normalized time series of SST anomaly in the Niño4 region for (a) HadISST2 and all members of hist-1950 in (b) ECMWF-IFS-LR and (c) ECMWF-IFS-HR.

Figure 19 .
Figure 19.Covariance of DJF rainfall anomalies with the normalized time series of rainfall anomaly in the Niño4 region for (a) GPCP observations (Niño4 precipitation) and ERA-Interim (global precipitation), all members of highresSST-present from (b) ECMWF-IFS-LRA and (c) ECMWF-IFS-HRA, and all members of hist-1950 from (d) ECMWF-IFS-LR and (e) ECMWF-IFS-HR.

Table 1 .
Differences in the NEMO ocean model configurations used in ECMWF-IFS-LR and ECMWF-IFS-MR/-HR.n/a: not applicable.

Table 2 .
Wild et al. (2013))lent energy fluxes at the lower boundary of the atmosphere in observations representative of the early 21st century(Wild et al., 2013)and model experiments for the period 2000-2014.Fluxes are positive downwards and expressed in W m −2 relative to Earth's surface area.Values in parentheses represent the range of uncertainties given byWild et al. (2013).