Benefits of sea ice thickness initialization for the Arctic decadal
climate prediction skill in EC-Earth3

Abstract. A substantial part of Arctic climate predictability at interannual time scales stems from the knowledge of the initial sea ice conditions. Among all the variables characterizing sea ice, sea ice volume, being a product of sea ice area/concentration (SIC) and thickness (SIT), is the most sensitive parameter for climate change. However, the majority of climate prediction systems are only assimilating the observed SIC due to lack of long-term reliable global observation of SIT. In this study the EC-Earth3 Climate Prediction System with anomaly initialization to ocean, SIC and SIT states is developed. In order to evaluate the benefits of specific initialized variables at regional scales, three sets of retrospective ensemble prediction experiments are performed with different initialization strategies: ocean-only; ocean plus SIC; and ocean plus SIC and SIT initialization. The increased skill from ocean plus SIC initialization is small in most regions, compared to ocean-only initialization. In the marginal ice zone covered by seasonal ice, skills regarding winter SIC are mainly gained from the initial ocean temperature anomalies. Consistent with previous studies, the Arctic sea ice volume anomalies are found to play a dominant role for the prediction skill of September Arctic sea ice extent. Winter preconditioning of SIT for the perennial ice in the central Arctic Ocean results in increased skill of SIC in the adjacent Arctic coastal waters (e.g. the Laptev/East Siberian/Chukchi Seas) for lead time up to a decade. This highlights the importance of initializing SIT for predictions of decadal time scale in regional Arctic sea ice. Our results suggest that as the climate warming continues and the central Arctic Ocean might become seasonal ice free in the future, the controlling mechanism for decadal predictability may thus shift from being the sea ice volume playing the major role to a more ocean-related processes.


2003; Serreze and Barry, 2011;Dai et al., 2019). Moreover, the enhanced sea ice melt and associated transports of freshwater to the south weakens the Atlantic meridional overturning circulation and related poleward heat transport on decadal timescales (Sévellec et al., 2017). Therefore, a realistic representation of Arctic sea ice is an essential element of a coupled climate prediction system. 25 Persistence has been recognized as a primary source of Arctic sea ice predictability in the last ten years (Blanchard-Wrigglesworth et al., 2011;Chevallier and Salas-Mélia, 2012;Chevallier et al., 2019): total SIV is the most persistent variable (∼4 yr) compared to local SIC (∼1 month) and total SIE (∼1 season); local SIT is the second persistent, ranging from seasons in the marginal ice zone (MIZ) to approximately a year in the Central Arctic Ocean (CAO). Moreover, the persistence timescale is found to be variable in a climate system due to advection of sea ice anomalies, heat exchange between ocean and atmosphere, 30 and changes in climate forcing, as reviewed by Guemas et al. (2016). For example, the memory of SIE can reemerge beyond its own persistence (e.g. 2 to 5 months) by storing memory in the upper-ocean heat content or SIT (Blanchard-Wrigglesworth et al., 2011), hence initialization of winter SIT anomalies provides some predictive capability for summer SIE with winter preconditioning in several studies (Holland et al., 2011;Chevallier and Salas-Mélia, 2012;Blanchard-Wrigglesworth and Bitz, 2014;Day et al., 2014). As another example, reemergence of sea ice anomalies follows modulation of the upper-ocean heat 35 anomalies: if sea ice retreats anomalously early in spring, more heat is stored in the oceanic upper mixed layer that causes later freeze up (Blanchard-Wrigglesworth et al., 2011).
Sea ice in the Atlantic sector or the Barents Sea can be predicted from a few years to a decade, because of the strong role exerted by the ocean heat advected from upstream and converged on the position of the sea ice edge (Yeager et al., 2015;Årthun et al., 2017;Dai et al., 2020). Some studies suggested that sea ice persistence in the central Arctic can be modulated 40 by oscillation of the Arctic atmospheric circulation between predominantly cyclonic and anticyclonic circulation regimes over timescales of 5-7 years (Proshutinsky and Johnson, 1997;Armitage et al., 2020). From observations, a remarkable oscillation in SIE and SIV is featuring a pause or enhanced ice loss at a period of 7 years, corresponding to some prominent modes of internal variability, such as the winter North Atlantic Oscillation (Bitz et al., 1996;Swart et al., 2015;Gascard et al., 2019).
To date most seasonal prediction systems start from a SIT reanalysis data set (Collow et al., 2015;Dirkson et al., 2017), scales (Guemas et al., 2016). To our knowledge, most studies on predicting regional Arctic sea ice conditions focused on timescales from a month to a few years (Bushuk et al., 2019;Cruz-García et al., 2019;Kimmritz et al., 2019). Few studies have comprehensively examined the impacts of ocean and sea ice initialization with observed anomalies in a climate prediction system with respect to decadal time scales. A recent study addressed this topic by only assimilating sea surface temperature (SST) and hence had a special focus on increased skills by SST in the Arctic marginal ice zone (Dai et al., 2020). 70 The objective of this study is to investigate the decadal prediction skill of Arctic sea ice in the EC-Earth3 Climate Prediction System with anomaly initialization (EC-Earth3-CPSAI) to ocean, SIC and SIT states. We developed a novel method to assimilate local SIV anomalies by initializing both SIC and SIT. As the method is developed as a prototype for our initialization strategy implemented for the Coupled Model Intercomparison Project phase 6 (CMIP6) decadal climate prediction project (DCPP) with EC-Earth3, the present study provides a documentation of the new climate prediction system with anomaly ini-75 tialization including SIT in a multi-category sea ice model framework. It characterizes the performance with focus on the predictions in the Arctic. Three sets of ensemble hindcast experiments are performed and analyzed to evaluate the benefits of respective initialized variables and to quantify the added skill from SIT initialization. This paper is structured as following: Section 2 introduces the climate prediction system EC-Earth3-CPSAI as well as ensemble-experiment design; Section 3 characterizes the initialized climate prediction system with model bias, forecast drift 80 as well as the imprint of initial conditions in the first year. Section 4 evaluates the benefits of specific initialized variables at inter-annual to decadal scales in terms of temporal smoothing and regional mean, respectively. Section 5 is the summary and discussion.
2 Model system and experiment design 2.1 The EC-Earth3 Climate Prediction System with Anomaly Initialization (EC-Earth3-CPSAI) 85 EC-Earth is a state-of-the-art Earth System Model developed by the EC-Earth consortium (Doescher and the EC-Earth Consortium, in preparation). The core of EC-Earth configures consists of component models for atmosphere, ocean and sea ice, so-called AOGCM. In this study we use the officially released AOGCM configuration of EC-Earth model for contributions to the CMIP6, EC-Earth3 (release v3.3.1.1). The atmospheric component is the Integrated Forecast System (IFS cycle 36r4) developed by the European Centre for Medium Range Weather Forecasts (ECMWF). It uses the TL255 horizontal grid (i.e. triangular truncation at wavenumber 255 in spectral space with a linear reduced Gaussian grid, corresponding to a spacing of about 80 km) and 91 vertical model levels with the top level at 0.01 hPa. The ocean component is the Nucleus for European Modelling of the Ocean, version 3.6 (NEMO3.6) embedded coupled to the Louvain-la-Neuve sea Ice Model, version 3 (LIM3, Rousset et al., 2015). The NEMO-LIM3 is configured with the ORCA1 tri-polar nominal 1 • grid and has 75 vertical levels. It is worth noting that the sea ice model LIM3 applies an ice thickness distribution framework to deal with meter-scale variations 95 in ice thickness (Rousset et al., 2015). Unlike its earlier version (e.g., LIM2), LIM3 allows for five ice thickness categories to account for the non-linear dependence of sea ice processes, in particular growth and melt, on ice thickness.
The EC-Earth3 has been used to generate a 25-member ensemble of the CMIP6 historical (1850-2014) and future (2015-2100) scenario simulations following the CMIP6 protocol (Eyring et al., 2016) by the EC-Earth consortium. The ensemble is generated by starting the CMIP6 historical experiment from 25 different dates of a control run with pre-industrial forcing 100 condition. We arbitrarily select one member (r5i1p1f1, hereafter referred to as FREE1) from the ensemble to obtain the model climatology for the ocean and sea ice used in the anomaly initialization. We combine FREE1 and other 4 members to compose a 5-member ensemble (hereafter referred to as FREE) for forecast skill assessment (see Table 1 and sections 2.2 ). The AI method in decadal climate prediction was formulated by Pierce et al. (2004). This approach has already been applied for the ocean component of the earlier EC-Earth decadal prediction experiments in CMIP5 based on EC-Earth2.3 and using 105 the ORAS4 ocean reanalysis as observational basis (Hazeleger et al., 2013). Since ORAS4 does not provide sea ice data, sea ice anomalies for this previous exercise were obtained from a stand-alone simulation with the ocean-sea ice component of the due to updates in the assimilated observation data sets, such as sea ice satellite data since 1979. A brief comparison of the two generations of EC-Earth decadal prediction system with AI is provided in Table S1.
In anomaly initialization, the initial state is generated using the observed state but replacing the observed initial climatology with the modelled initial climatology. Here the observed (modelled) initial climatology at a starting date (i.e., climatology of 115 November 1) is calculated as an average of the climatological monthly means between the two nearest months (i.e, October and November), over the period 1979-2014 for ORAS5 (FREE1), as can be seen in Table 2. Horizontally ORAS5 data is bilinearly interpolated from 0.25 • to 1 • ORCA grid. The initialized variables in the ocean model are three dimensional temperature and salinity. To avoid initial inconsistency in the large-scale dynamics of the system, we do not initialize horizontal velocities. This is a common approach for initialized hindcasts/predictions (see Table 1 in Polkova et al., 2019). The sea ice variables initialized in EC-Earth3 are ice concentration, ice and snow volume (denoted as A ice , V ice and V sn ) in five thickness categories at a grid-cell level. However, SIC and SIT from the ORAS5 reanalysis are single values for the grid-box mean as they are assimilation products using the sea ice model LIM2. Therefore, the volume from ORAS5 is first calculated by multiplying thickness for ice and snow (SIT and SNT) with SIC. A comparison to climatology yields its anomaly.
If added to the model climatology, A ice can in some cases violate the valid range (0-100 %). Therefore, a few adjustments are some assimilate satellite SIC in a multivariate data assimilation scheme so as to update SIT instead of directly assimilating it (Massonnet et al., 2015;Kimmritz et al., 2018), while others use forecast tendencies, namely maintaining the distribution of volume between the categories, by multiplying each category volume with the ratio of observed over modelled mean SIT or similarly by nudging towards observations across each category (Allard et al., 2018;Blockley and Peterson, 2018). However, there is no unique solution, since the multiple sub-grid scale configurations can be compatible with one total SIV. For decadal 135 prediction with EC-Earth3-CPSAI, we develop a novel method with 1) a weighting function mapping single-category SIC onto multiple categories; 2) a multi-category thickness distribution depending on concentration levels; 3) both are used when converting the initial volume (i.e. V ice,sn ) at the grid-cell level to its sub-grid, while thickness in the last category is determined with a constraint of V ice,sn being conservative.
To split A ice into different categories g ice l (where l = 1 − L, denotes the ice category with a total number of categories L=5 140 in LIM3), the weight-likelihood function is derived from the modelled SIC in a 300-year pre-industrial control run with EC-Earth3 (denoted with superscript "ctrl" hereafter). Data for November 1 was calculated by averaging October and November monthly means. For SIC at a specific grid point (i.e. A ctrl has only time dimension), tn is the time index over 300 years, at which A ctrl (tn) has the minimal difference from A ice . We assume that based on the same EC-Earth3 model version, g ice l is likely regulated by the weighting function weight ctrl l . Thus for each grid point, We aim at imposing local SIV anomalies to LIM3 while keeping the sum of volume over all thickness categories unchanged as in Eq.(4). We use the 300-year control run to establish the relationship between ice thickness in the five ice categories (h ctrl l ) and the total ice concentration A ctrl . Different from weight ctrl l in Eq.
(2), Fig. 1 plots the h ctrl l -A ctrl histogram for all grid points in the control run, with A ctrl ranging from 10 to 100 % at intervals of 10 %. It can be seen that the thickness distributions for h ctrl l (l = 1 − 4) are quite robust within each bin, suggesting that the more ice-covered (e.g. A ice >70 %), the lower variance 155 in thickness, in other words, lower probability to melt and shift to neighboring bins. We neglect SIT initialization when A ctrl is below 10 %, both because statistically these grid points only account for 8 % of total ice-covered ones and because an observation error of 10 % is often assumed while assimilating SIC (Mathiot et al., 2012).
Following the h ctrl l -A ctrl histogram, except for the last category, h ice l (l ≤4) are determined by the corresponding bin of A ice , while the volume in each category is determined by v ice l = g ice l h ice l . Then the residual of V ice will be accommodated in the last category (l = L = 5) as v ice L in Eq.(5) and h ice L is contingently resolved. This method imposes the combined anomalous signals of SIC and SIT to sea ice initialization. The same method is applied to discretize snow volume with a multi-category snow thickness-A ctrl histogram (not shown). and standard deviation (error bar), based on the knowledge from a 300-year climate simulation with pre-industrial forcing. It is used to discretize h ice l , if the total ice concentration and volume (A ice and V ice ) at a grid cell is know from anomaly initialization. F. ex. for grid cells with A ice between 10-20 %, h ice l (l =1-5) is 0.08, 0.27, 0.62,1.15 and 3.20, respectively. The ice volume at respective category (l =1-4) is splitted from A ice to each category by using a weight-likelihood function (see Eq.1-3). For the 5th category, h ice . Note that the grid-box mean thickness Therefore, Hm is regulated by both V ice and g ice l at the local grid.
A consistency check, also called "sanity check", is carried out for the ICs of the ocean and sea ice model, in order to adjust water and heat flux-relevant variables in the surface boundary fields to the initialized g ice l ,v ice l and v sn l in a physically consistent way. The method has been used in different sea ice prediction systems (Massonnet et al., 2015;Kimmritz et al., 2018). To complete the initialization, an one-day spin up integration is performed with a reduced time-step of 100s and introducing a mask of coastal water (< 100 m deep) to neglect SIT initialization there (indicated in Fig. 2b). This is to ensure dynamical consistency 170 in the initial states and avoid numerical instabilities. The practice of masking some locations out in SIT initialization has been used in many studies, for example by introducing a threshold of SIC>40 % to implement full-field initialization (Blockley and Peterson, 2018) or confining modification of SIT to the Arctic basin with a geographic weighting mask to discard initialization changes in the marginal ice zone (Blanchard-Wrigglesworth et al., 2017).

Sensitivity experiments with sea ice initialization
The above described AI approach for both ocean and sea ice ICs is hereafter referred to as AI2. A five-member ensemble of the AI2 experiment is performed in this study, with 5 sets of ocean and sea ice ICs generated using the anomalies of five individual members of ORAS5, respectively. For the atmosphere a full-field initialization based on ERA-Interim (hereafter as ERAI, Dee et al., 2011) is implemented. Initialized ensembles of predictions (re-forecasts) start yearly on November 1 for the period from 1979 to 2018 (a total of 40 start dates) and run 2 months plus 10 years long with the external forcing following the CMIP6 180 DCPP protocol for dcppA-hindcast and dcppB-forecast experiments (Boer et al., 2016). We generated 10 additional members by means of perturbed atmospheric ICs. This whole ensemble with a total of 15 members states a contribution to CMIP6 DCPP with EC-Earth3-CPSAI (see Table S1). However, for this study only the first 5 ensemble members (with unperturbed atmospheric ICs) are used and compared to complementary 5-member ensemble predictions described in the following.
Our primary interest is the impact of Arctic sea ice on decadal prediction skill in the last two decades, which is expected to 185 be more representative for the coming decades than an Arctic with large amounts of thick MYI in the last century. In order to investigate the sensitivity of decadal prediction skill to SIT initialization in EC-Earth3-CPSAI, two more initialized ensemble experiments are performed with ocean-only initialization (AI0) and ocean plus SIC initialization (AI1). We only performed 5 members with AI0 and AI1 for sensitivity analysis, and thus for a fair comparison this study focuses only on 5 members for AI2 and FREE, too. For AI2, our assessment here uses the five members initialized with the five members of OARS5. FREE 190 consists of FREE1 and other four members from the 25 member ensemble of the EC-Earth CMIP6 historical  and the corresponding SSP2-4.5 (2015-2017). These four members are deliberately selected to represent the wide range of natural variability in the EC-Earth3 CMIP6 control experiments from which the ensemble of EC-Earth historical simulations starts ( Fig. S1a). An assessment of the overall feature of FREE shows no significant difference between FREE and the full ensemble of 25 members (e.g. Fig.S1b), even though the regional differences could be large. A summary of all experiments 195 is given in Table 1. The ensemble-mean of the AI experiments versus that of FREE are used to evaluate the impact of the respective initialization approach. The difference between AI2 and AI0 is used to assess the added prediction skill with sea ice initialization, while the comparison between AI2 and AI1 is used to evaluate the skill gained by SIT initialization.

Skill assessment
As reference fields (REF), the climatological annual and seasonal means are calculated as 20-year averages for the period from 200 1997 to 2016 based on monthly means of ORAS5 for sea ice and ERAI for the near surface air temperature (TAS), respectively.
We note that the reference data is not from another independent observation product, but assimilation of observations using models which are close to those in EC-Earth3. There are several reasons: 1) ORAS5 represents Arctic sea ice reasonably well compared to other ocean reanalysis, thanks to direct assimilation of SST and SIC observations with 5-d assimilation window (Tietsche et al., 2017;Uotila et al., 2019;Zuo et al., 2019); 2) the aggregated quantities (e.g. SIE and SIV) are not sensitive 205 to models, and the errors of ORAS5 fall within the range of observation uncertainties (up to 10% in SIC and SIV, Schweiger et al., 2011;Chevallier et al., 2017); 3) ORAS5 and FREE are both in ORCA grids, which avoid spatial errors potentially being either masked or enhanced by remapping from observation-to model-grid; 4) Similar to Volpi et al. (2017), our main focus is to evaluate the relative skills between different initialization methods with EC-Earth3-CPSAI, compared to the skill of FREE.
Our assessment is based on forecast anomalies (Table 2) for specific lead time, following the method by Volpi et al. (2017). The confidence interval is calculated with a t-distribution for the ACC, and with a χ 2 distribution for the RMSE. The assessment of temporal development is provided at different lead times: year 1, years 2-5, years 6-9, and years 2-9. The latter three averaging periods are selected as basic choices to verify skill 220 dependence on averaging and lead time (Goddard et al., 2013).
Our analysis focuses on three key variables, SIC, SIT and TAS in the Arctic. Our assessment for total SIE and SIV is applied over the Northern Hemisphere region (NH) with a typical threshold 15 % SIC and 0.15 m SIT, respectively, to exclude the extensive areas of open water (Schweiger et al., 2011). TAS index is computed as field average over the polar cap domain, namely the region north of the 70 • N circle (see Fig. 2). The data analyzed for this study are briefly summarized in Table 1.

225
Regional skills are assessed for ten sub-regions (shown in Fig. 2b), representing the sector for the CAO and its adjacent waters, namely the central Arctic (80 • N north) and the Laptev/E. Siberian/Chukchi/Beaufort Seas, and the sector for the Arctic MIZ and the transition waters, namely the Kara/Barents/GIN (Greenland/Iceland/Norwegian) Seas as well as the Baffin/Hudson Bays. The CAO sector is coincidently confined by the climatological September ice edge (see Fig. 2a and 2b), implying the dominant influence of thick MYI due to geometric constraints of Arctic coastlines. By contrast, the MIZ sector adjoins 230 the North Atlantic Ocean, climatologically covered by thinner seasonal ice in winter (except for the east coast of Greenland receiving thick ice transported from CAO through Fram Strait). Canadian Archipelago and the Pacific side of MIZ are not presented in regional assessment, however, they are counted in the assessment for the total Arctic SIE and SIV.
3 Characterization of the initialized climate predictions with EC-Earth3-CPSAI

Components of sea ice initialization 235
The model bias for the initial dates (i.e. November 1) is calculated over the period 1979-2014 between the modelled and reference mean state that are used for obtaining the sea ice anomaly initialization (formulated in Table 2). SIC is generally overestimated (i.e. blue areas dominate over the red ones in Fig. 2a biases are mostly located outside of the observed summer sea ice cover, illustrated by the climatological September sea ice edge of ORAS5 (i.e. REF). As a result, the modelled climatology of September sea ice is also overestimated over the GIN/Barents Seas, consistent with 1-2 m thicker ice as seen in Fig. 2b. This indicates that the sea ice biases over rather large areas in the GIN Seas originate from the modelled MYI. Compared to FREE1, FREE (i.e. ensemble mean) shows similar patterns of model bias but with increased magnitudes (due to a large spread in the FREE ensemble, not shown).   In the next section, we focus on the impact of Arctic sea ice initialization with negative anomalies on decadal prediction skill during recent two decades. As the climate warming continues and the CAO might become seasonally ice free in the future. Our goal is to shed light on how the key mechanism governs the decadal predictability of the Arctic sea ice in the coming decades. An backwards extension to 1987 are shown in dashed lines. Note that the skill assessment period is from 1997 to 2016 (see Table 2), where Y1 denotes forecast initialized from 1996, ..., 2015 and Y10 denotes the 10th year forecast initialized from 1987, ..., 2006. Thus the initial anomalies are more positive for predictions with longer lead years assessed, as can be seen when compared to those in Y1.  Table 2. To compare, the monthly biases of FREE (FREE1) for the respective variables were shown as gray (black dashed) lines and filled areas in Fig. 4a-c, which are simply repeated annual cycles averaging for the same period (1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016) with annual mean bias of 0.4 (-0.2) million km 2 , 5.6 (2.3) thousand km 3 and -0.7 (0.1) K for SIE, SIV and the Arctic cap TAS, respectively. It is by chance 260 that FREE1 (black dashed line) has smaller bias in SIE, SIV and TAS over the Arctic than the ensemble mean of FREE (thick gray line in Fig. 4, also see FREE bias in annual min/max sea ice and mean TAS in Fig. S2 and S3).

Forecast drift
A general feature of all three variables is that biases in all initialized experiments (AIs) vary in the range between FREE1 and the ensemble mean of FREE, so that initialization results in a positive annual mean bias to the SIE and SIV (i.e. larger SIE and SIV) and negative annual mean bias to TAS with respect to FREE1, taking the annual mean bias of AI2 (blue dots) as an 265 example. There is a slight tendency that the forecast error increases for longer lead time. The larger biases in AIs experiments than FREE1 result from anomaly initialization where the model bias of FREE1 is not removed from the initial state, but surplus with more positive anomalies of sea ice for prediction with longer lead years assessed, compared to those in the first year forecast over the period 1997-2016 (i.e. Y1 in Fig. 3). The differences between AIs are generally small after Y1, indicating the forecasts are drifted toward the model climate (as represented by the ensemble mean). And the forecast error for SIE is 270 relatively less sensitive to different initialization than that of SIV and TAS. On average the long-term forecast drift is small as indicated by the annual mean errors in AI2 (blue dots in Fig. 4). There is a slight decline in both SIE and SIV from Y1 to Y3 followed by a return between Y5-Y7, and a tendency of larger SIE and SIV for longer lead time. Correspondingly, TAS in the initialized hindcasts is gradually pulled to the colder climate over the Arctic cap domain, showing a tendency of negative bias in all AIs in both winter and summer for longer lead time, i.e. Y8-Y10.

275
The model biases show rather strong seasonal cycles, with generally smaller biases in winter but much larger in summer for all three variables in AIs than FREE. These results emphasize the importance of drift-correction via correcting the lead-time dependent bias for multi-year prediction skill assessment. Among all AIs, AI1 (pink line) shows the least magnitude of positive seasonal bias of SIE and SIV from Y6 afterwards, and correspondingly the warmest TAS. It seems that, for longer lead times (>Y5), AI1 performs closest to FREE1. It suggests that AI1 with ocean plus SIC initialization imposes less strong constraints 280 on the development of the sea ice forecast than ocean-only (AI0) and all (AI2) initialization at decadal timescales.

Imprint of initial conditions in the first year
In this section we examine the immediate benefit (degradation) from initial anomalies (ICs inconsistency) due to different initialization strategies.  For SIC (Fig. 6, left panels), the skill score (∼0.05) appears to be unaffected by any initialization in the central Arctic,

AI2
All init.  Comparing skill scores for SIT (Fig. 6, middle panels), RMSESS>0.2 is commonly found poleward of the September sea ice edge determined by FREE (Fig. 2a), suggesting that the higher skill of AI2 compared to FREE, AI0 and AI1 in the region is 315 related to higher persistence of thick MYI (Kwok, 2018). Low skill in the E. Siberian/Laptev Seas is because SIT initialization is not applied in AI2 to these regions due to the shallowness (stippled in Fig. 6 In summary, the direct improvement in RMSESS for the first winter prediction from AI0 is SIC in the MIZ sector, from the Hudson Bay eastwards to the Barents Sea. The added value from the ocean plus SIC initialization (i.e. AI1) is small and the only way to improve the SIT skill is to impose local SIV anomalies to the initial sea ice state (i.e. AI2 methods). Taking the sea ice model bias from FREE for March (in Fig. S2a and b) into account, the SIT initialization seems more effective on improving 335 the first winter prediction skills for regions where the model tends to have too much ice (which is mainly the North Atlantic) while it works less well in regions where the model has too little ice (the Pacific Arctic sector).

The first 12 months forecast
As the persistence timescales of upper-ocean heat content and SIT can be longer than one season, we continue to identify the relative contribution from ocean and sea ice initial constraints as the dominant source of predictability in the first forecast 340 year. The prediction skill in the first 12 months, counted from the start month (i.e. November), is evaluated by analyzing the temporal development of ACC and RMSE relative to REF (as Eq.1-2 in Volpi et al., 2017). The definition of forecast and reference anomalies is given in Table 2. The monthly mean of total SIE and SIV are calculated and extracted from the first 12 months forecast initialized on November 1 from 1996-2015 as indicated in Fig. 3. In the same way, the first 12 months forecasts of TAS is averaged over the polar cap domain (see Fig. 2a) in order to investigate the direct impact from MYI in the d). When the correlations of AIs meet, or even fall below FREE in Fig. 7, it means that there is no skill from the initialization any more. On the contrary, when the RMSE of AIs exceeds the FREE one, it indicates no benefit from initialization afterwards.
For SIE, all AIs show significantly smaller RMSE than FREE until June, but with similar ACC throughout of the year (Fig. 7a, d). For SIV, AI2 shows significant reduction of RMSE from the first months to March-April, and remains low until 350 August, even though not much difference in ACC except the first month (Fig. 7b, e). For TAS, there are hardly any differences among the initialized or uninitialized runs for both RMSE and ACC (Fig. 7c, f). In summary, with respect to SIV, only AI2 (all init.) outperforms FREE for the first 8-10 forecast months. This suggests the importance of SIT initialization with constraint on SIV anomaly in EC-Earth3-CPSAI. The increased skill for SIE by initializing SIT is only prominent in the first winter and very little for longer lead time into the year. The similarity between AI0, AI1 and AI2 suggests the major contribution of ocean 355 initialization to reduce RMSE until lead month 11. Similar results are also seen in the study by Dai et al. (2020) in which the coupled climate prediction system NorCPM only assimilating SST can skillfully predict pan-Arctic SIE up to 12 months. In this section we focus on understanding the origin of decadal predictability and the relative contribution from the upperocean heat content and local SIT. For simplicity, the results of AI1 are not discussed in the following as the improvement 360 by SIC initialization is seen limited in comparison with ocean-only initialization (Fig. S4-S7). The initial year is excluded in this section because the imprint of initial conditions is strong and discussed above. Following the verification framework for interannual-to-decadal prediction experiment (Goddard et al., 2013), the temporal information is smoothed at different scales: years 2-5, years 6-9, and years 2-9.

365
Prediction skill is assessed based on the average over forecast years 2-9 for AI0 and AI2 and compared to the respective FREE projection. Figure 8 shows that the areas with significantly positive correlation (>0.5) are larger in the year 2-9 than that for the first winter (Fig. 5) for all three variables and all experiments. It is because temporal smoothing will typically reduce higher frequency noise and increase signals resulting from external forcings such as increase of greenhouse gas concentrations.
Therefore, FREE in the top panels of Fig. 8 shows high anomaly correlation in both SIT and TAS in the CAO region, likely With respect to RMSE in FREE (Fig. 9, first row), the spatial patterns derived from the year 2-9 average are very similar to 380 those of first winter means shown in Fig. 6 but the magnitude is smaller. There is a strong contrast of forecast errors between the CAO and the GIN/Barents Seas in Fig. 9, which is coherent with the positive (negative) bias in the annual sea ice maximum (minimum) in the GIN (Barents) Seas over 1997-2016 (Fig. S2). Both observations and climate models have suggested that in these regions the variability of sea ice is largely influenced by the oceanic heat flux convergence, therefore, the prediction errors can be greatly reduced by advection of ocean temperature anomalies whereas little benefit from SIT initialization is expected 385 (Koenigk and Mikolajewicz, 2009;Årthun et al., 2017;Onarheim et al., 2018;Bushuk et al., 2019;Dai et al., 2020).
The improvements in AI2 (all init.) compared to FREE for SIC decadal prediction are presented in the skill map of AI2/FREE ( Fig. 9, left). The high RMSESS (>0.2) seen in the MIZ from the first winter mean (Fig. 6) is not only maintained here in years 2-9 but also becomes prominent in the central Arctic and the Pacific Arctic sector. However, AI2 shows limited improvement for regions where RMSE is highest in FREE (as mentioned in the paragraph above). With respect to SIT, the regions with 390 improved skill (RMSESS>0) in AI2 compared to FREE are shifted from the western Arctic in the first winter (Fig. 6) to the eastern Arctic in years 2-9. Comparing the lower panels of Fig. 9, the RMSESS of AI2/FREE for both SIC and SIT in the North Atlantic sector are much higher than those in AI2/AI0. This suggests that ocean initialization is the most important source of predictive skill at decadal time-scale, however, sea ice initialization holds some additional value in this respect. The positive effect of initialization (seen in both AI2/FREE and AI2/AI0) on sea ice skill between Chukchi Sea and Laptev Sea is 395 only prominent in the year 2-9 average, but absent in the first-winter mean. This manifests as the regional benefit on longer time-scales, rather than a direct impact from sea ice initialization in the first winter prediction.

SIC FREE
ACC: years 2-9 SIT TAS No init.

AI2
All init. Basin) is seen to benefit more from the initialization than the east side of SPG. Such a zonally asymmetric pattern is consistent with the leading mode of observed sea surface height (SSH) variability in the North Atlantic (Koul et al., 2020), suggesting large scale atmospheric response to ocean initialization. The added skill by sea ice initialization represented by RMSESS of AI2/AI0 appears also prominent in the landward vicinity of the Barents Sea and the E. Siberian Sea. This is consistent with the enhanced correlation of AI2 relative to AI0 (Fig. 8).  The skill of years 2-5 (see Fig. S4-S5) are very similar to those of years 2-9 ( Fig. 8-9). After forecast year 1, the skills gained by AI2 initialization relative to FREE seem not to be sensitive to the averaging time-scales from multi-year to a decade. In examining the RMSESS (AI2/AI0) dependence on lead time via comparison of skills for years 2-5 versus 6-9 ( Fig. S6-S7), it is found that the added skill for SIC and TAS is more prominent in years 6-9 than in years 2-5 in the North Atlantic sector (e.g. 20 https://doi.org/10.5194/gmd-2020-331 Preprint. Discussion started: 14 December 2020 c Author(s) 2020. CC BY 4.0 License.
the GIN/Barents Seas), whereas it is the opposite in the Beaufort Sea. However, these small differences of regional skill are averaged out at the decadal time-scale.

Regional-mean skill matrix
The results from sections 3.3.1 and 4.1 provide important insights into the regional variability of sea ice prediction skill at different time-scales. It is clear that the benefit from SIT initialization in shaping the local sea ice development do not last beyond the first summer, but its impact can on remote regions which may last for several forecast years resulting from accurate 415 representation of ocean dynamics. In the final analysis, we aim at providing details on the temporal evolution of the skills of initialized hindcasts for the ten sub-regions of the Arctic. The RMSESS are calculated on monthly basis for each grid points in years 1-9 and then the area means are used to represent its regional skill for the ten regions in the Arctic as defined in Fig. 2b.
In Fig. 10 Among the ten regions, the Barents Sea is the most beneficial region due to AI2, with the highest score (> 0.5) from July to next February in several forecast years ahead. The positive difference in the right column indicates enhanced skills from AI0 to AI2 between August and October in particular for longer lead time of 5-6 years. The second most beneficial region is the the sea ice initialization for the skill is found in the CAO and its adjacent waters.
In contrast to SIC, the skill of SIT is high (RMSESS∼0.5) in both CAO and MIZ regions for several years (Fig. 11). This suggests the direct improvement from SIT initialization. Overall, sea ice initialization (AI2) adds more skill than ocean-only initialization (AI0) in predicting SIT in all years in the GIN Seas. In the CAO region, the added skill (AI2/FREE-AI0/FREE, right panels in Fig. 11) emerges in years 1-3 in the central Arctic and years 1-4 in the Beaufort and Chukchi Seas, and then 440 reemerges in years 5 and 7, respectively. Interestingly AI2 exhibits 12-month lag of added skill in the Laptev/E. Siberian Seas.
It corroborates with the finding in the RMSESS maps with no skill in first winter but high skill in years 2-9 (see AI2/AI0 in 21 https://doi.org/10.5194/gmd-2020-331 Preprint. Discussion started: 14 December 2020 c Author(s) 2020. CC BY 4.0 License. Fig. 6 versus Fig. 9). Their regional SIT predictability may be attributed to advective processes or winds (Guemas et al., 2016), because SIT initialization is neglected due to the shallowness (see Fig. 2b).
Figure 10. Regional Arctic SIC prediction on decadal time-scales: RMSE skill score of AI2 relative to FREE (AI2/FREE, two columns on the left) and the difference of skill score between AI2 and AI0 (two columns on the right). The ratio of RMSE in AI2 to FREE is averaged over regions, and then AI2/FREE is calculated as 1-(RMSEAI2/RMSEF REE ). White colors denote 0 score, which means RMSEs in AI2 and FREE are equal. The difference is calculated as AI2/FREE -AI0/FREE. Red (blue) colors denote higher (lower) skill score than the reference.
In summary the difference of RMSESS between AI2 and AI0 for both SIC and SIT indicate that the impact of sea ice 445 initialization on reducing sea ice forecast errors is not just limited to the first few years locally, but can also reemerge after 5-7 forecast years in the CAO regions extending to the transition zone of the Kara Sea, suggesting a combined effect of ocean advection and sea ice initialization. The skill gained by initializing ocean can be enhanced by SIT initialization on timescales ranging from few years to a decade. Additionally, the RMSESS matrix for the regional SIE and SIV are provided for each lead month in Fig. S8-S9 and the results from RMSESS are generally consistent with the skill assessment in Fig. 10-11. There is 450 23 https://doi.org/10.5194/gmd-2020-331 Preprint. Discussion started: 14 December 2020 c Author(s) 2020. CC BY 4.0 License. generally significant skill in regional SIV in the CAO regions in the ice-growth months during the forecast years 1-9. Only the Barents Seas shows significant skill in both SIE and SIV from January to June in years 1-9. Otherwise there is almost no significant skill in the MIZ regions and in regional SIE.

Discussion and conclusions
This study addresses the following questions using the global climate model EC-Earth3: can SIT initialization improve the 455 Arctic decadal prediction skill? Where and when may the prediction of regional seas benefit from SIT initialization? Three predictability regimes are classified according to added skill by ocean (AI0-FREE) and SIV (AI2-AI0) anomalies in the initialization: -  Fig. 6 versus Fig. 9), namely the CAO sector inside the September climatological ice edge (Fig. 2a). This highlights the persistence of local thick ice.

465
Theoretically, the variability of thick ice has little connection with the upper ocean, due to the insulating role played by the sea ice cover during most of the year (Flato, 1995). From observations, the variability and trend of perennial SIE are found to be the largest in the September sea ice minimum, in contrast to seasonal SIE which characterizes as ice-free in summer and by the largest variability and trend in winter (Onarheim et al., 2018). Here, we found a clear evidence of a winter-preconditioning mechanism in the central Arctic with increased skill for SIC in the following summer, which 470 drops in years 3-4 and reemerges in years 5-7. Furthermore, the emergence of skill in SIC and SIT in the Laptev/E.
Siberian Seas from the second winter onward may demonstrate the role of the SIT anomalies in the central Arctic over the CAO and a combined effect of advection processes on reshaping the spatial pattern of sea ice in these two regions.
-In the Atlantic side of MIZ with predominant seasonal ice, variability and trend of SIE is largely influenced by oceanic heat flux in winter. Therefore, the errors in predictions can be greatly reduced by advection of ocean temperature anoma-475 lies. Many global climate models have shown that the upper ocean heat content significantly contributes to the prediction skill of sea ice in the MIZ (Bushuk et al., 2019;Cruz-García et al., 2019;Dai et al., 2020). Therefore, little improvement from local SIT can be expected in these regions. Consistent with their results, we found that there is significant degradation of SIC skill in the Hudson Bay in the first winter prediction in AI2 compared to AI0. Similarly SIT initialization seems not improve the local SIC prediction in the Baffin Bay. Among these regions, the prediction skill of SIC in the 480 Barents Sea is the highest and up to 7 years during the melt-to-growth seasons (July to next February in Fig. 10). This is likely attributable to the persistence of SST anomalies and advection of ocean temperature anomalies from the North Atlantic Ocean, as there is no direct increase in skill from SIT (AI2-AI0) in the first winter months (Fig. 6). In terms of ACC for years 2-9, significant skill in predicting SIC is found mostly in the eastern Arctic Basin from E. Siberian Sea to the Barents Sea at decadal-scales (Fig. 8). All hindcasts show generally higher skill of SIC in the Arctic MIZ in the 485 Atlantic Arctic sector than in the Pacific Arctic sector, consistent with the results in Dai et al. (2020).
-In the GIN Seas, best skill in September SIT (∼ 9 years) and winter SIC (∼ 7 years) are associated with advection of SIT anomalies from the central Arctic and SST anomalies from the Atlantic Ocean, respectively. In summer, there is only ice in the EGC coming from the north through Fram Strait. In winter, the ice thickness in the EGC is still determined by the ice transport through Fram Strait but the extension of the ice cover might be dominated by upper surface temperature and 490 ocean circulation. Regarding the temporal evolution of regional means, AI2 improves the accuracy of SIT predictions in all years, particularly for the September minimum, suggesting a remote origin of MYI from the central Arctic in Fig. 11.
In contrast, the best skills of SIC in the GIN Seas are gained in the winter months from October to February (Fig. 10), appearing in the marginal ice zone adjacent to CAO. The GIN Seas may actually be divided into a region in the west at the Greenland coast, which is dominated by MYI coming from CAO and a region in the east, which behaves rather 495 similar to the Barents Sea: ice melts in summer and winter ice depends on winds and ocean currents.
The added value of SIC initialization is limited in comparison with ocean-only initialization (i.e. AI1-AI0). As SIV is calculated as a product of SIC and SIT, only assimilating SIC will result in a synchronous correlation between the total sea ice area and SIV (of the same month) in the initial states, which is found to be weak both from a multi-centennial preindustrial simulation (Chevallier and Salas-Mélia, 2012) and from the observed anomaly (i.e. ORAS5 reanalysis in Fig. 3). In turn, the 500 incorrect initial SIV fields can affect the forecast errors of sea ice through the melting season and thereby degrade prediction skill of SIE (Blockley and Peterson, 2018;Kimmritz et al., 2018). Therefore, assimilating SIC alone in EC-Earth3-CPSAI may not be sufficient to constrain winter SIT variability to predict decadal variations in SIE and SIV under Arctic warming.
To conclude, our sensitivity experiments with EC-Earth3-CPSAI by imposing different initialized model components demonstrate that AI2 (all init.) yields an improved performance for decadal prediction for the Arctic regions, as it provides an im-505 provement in predicting SIE and SIV anomalies and reducing errors in regional sea ice states. As climate warming continues, the central Arctic that is covered mostly by MYI will likely become seasonal ice free in the future. The controlling mechanism for decadal predictability in the region may thus shift from the current SIV persistence dominated regime to a more ocean-related processes. These findings state the foundation for the AI2-approach being the choice for a full contribution to CMIP6-DCPP covering 60 initializations (1960-2019) with 15 ensemble members each. A more general assessment of this 510 system's predictive skill beyond the Arctic is currently in preparation.
Code and data availability. The EC-Earth model (version 3.3.1.1) with its standard coupled model configuration (T255L91-ORCA1L75) is used for the experiments here. The entire code of EC-Earth is not available due to restrictions in the distribution of the atmosphere component IFS. Confidential access to the entire code can be granted for editors and reviewers; please use the contact form at http://www.ecearth.org/about/contact. For the methods of anomaly initialization to ocean and sea ice, we followed the approach described by Hazeleger et al. (2013), namely by adding observation anomaly to model climatology; this can be implemented at one command line with the utility of Climate Data Operator (CDO). The programmes used to convert one-category sea ice initial states (i.e. SIC, SIT and SNT) to 5 categories and the scripts used to produce the figures are available at https://doi.org/10.5281/zenodo.4297603 (Tian et al., 2020). Data used in this paper are available at https://doi.org/10.5281/zenodo.4297926 (Tian, 2020). Links to model output of sensitivity experiments can be found from the aforementioned URL. The CMIP6 data (e.g. FREE and AI2) can also be downloaded from any Earth System Grid Federation (ESGF)