Articles | Volume 16, issue 13
Model evaluation paper
12 Jul 2023
Model evaluation paper |  | 12 Jul 2023

The mixed-layer depth in the Ocean Model Intercomparison Project (OMIP): impact of resolving mesoscale eddies

Anne Marie Treguier, Clement de Boyer Montégut, Alexandra Bozec, Eric P. Chassignet, Baylor Fox-Kemper, Andy McC. Hogg, Doroteaciro Iovino, Andrew E. Kiss, Julien Le Sommer, Yiwen Li, Pengfei Lin, Camille Lique, Hailong Liu, Guillaume Serazin, Dmitry Sidorenko, Qiang Wang, Xiaobio Xu, and Steve Yeager

The ocean mixed layer is the interface between the ocean interior and the atmosphere or sea ice and plays a key role in climate variability. It is thus critical that numerical models used in climate studies are capable of a good representation of the mixed layer, especially its depth. Here we evaluate the mixed-layer depth (MLD) in six pairs of non-eddying (1 grid spacing) and eddy-rich (up to 1/16) models from the Ocean Model Intercomparison Project (OMIP), forced by a common atmospheric state. For model evaluation, we use an updated MLD dataset computed from observations using the OMIP protocol (a constant density threshold). In winter, low-resolution models exhibit large biases in the deep-water formation regions. These biases are reduced in eddy-rich models but not uniformly across models and regions. The improvement is most noticeable in the mode-water formation regions of the Northern Hemisphere. Results in the Southern Ocean are more contrasted, with biases of either sign remaining at high resolution. In eddy-rich models, mesoscale eddies control the spatial variability in MLD in winter. Contrary to a hypothesis that the deepening of the mixed layer in anticyclones would make the MLD larger globally, eddy-rich models tend to have a shallower mixed layer at most latitudes than coarser models do. In addition, our study highlights the sensitivity of the MLD computation to the choice of a reference level and the spatio-temporal sampling, which motivates new recommendations for MLD computation in future model intercomparison projects.

1 Introduction

The ocean mixed layer is the interface between the ocean interior and the atmosphere or sea ice. It is a layer of thickness ranging from a few meters to hundreds of meters, homogenized in the vertical by turbulence driven by wind, buoyancy, and/or waves (D'Asaro, 2014). Because of the existence of this turbulent layer, fluxes from the atmosphere or sea ice modify the ocean properties not only at the surface but also over the thickness of the mixed layer. This makes the mixed-layer depth (MLD) a key variable for Earth's climate, as it controls the relationship between air–sea fluxes and sea surface temperature and thus influences climate feedback mechanisms. In the mixed layer, potential density is relatively homogeneous, compared to the stratification below.

The density stratification (pycnocline) that begins at the base of the mixed layer can be due to vertical gradients of temperature or salinity (Helber et al., 2012). Most often, the mixed-layer base is formed where the cooler pycnocline meets the warmer waters directly heated by air–sea fluxes and where deep-ocean heat uptake on longer-than-seasonal timescales is affected by mixed-layer detrainment. In the tropics and polar seas, “barrier layers” occur when the mixed layer is fresh and the pycnocline just below is due to the salinity gradient, while the temperature profile remains relatively homogeneous vertically (in the tropics: de Boyer Montégut et al., 2007; MacKinnon et al., 2016; Mignot et al., 2007) or when temperature does not impact density much in comparison to salinity (in the polar regions: MacKinnon et al., 2016; Pellichero et al., 2017; Peralta-Ferriz and Woodgate, 2015). Barrier layers are shown to influence the development of tropical cyclones (Rudzin et al., 2018). Most of the ocean primary production takes place in the surface mixed layer, which often coincides with the euphotic zone, and thus the MLD is also important for ecosystem functioning and the ocean uptake of carbon (Llort et al., 2019; Uchida et al., 2020).

Figure 1Observed MLD (m) in (a) winter and (b) summer (de Boyer Montégut, 2022). The winter (summer) MLD is the average over the months of January to March in the Northern (Southern) Hemisphere and July to September in the Southern (Northern) Hemisphere, respectively. Note the differing color scales for the two seasons.

The MLD is highly variable in space and time (de Boyer Montégut et al., 2004; Holte et al., 2017) and presents a strong seasonal cycle. Figure 1 shows a climatology of the mixed layer in winter and summer from observations (de Boyer Montégut, 2022). The winter deepening of the mixed layer at midlatitudes to high latitudes is highly heterogeneous in space, with well-known regions of large MLD related to deep-water formation (Labrador Sea and Greenland Sea for example; Schulze et al., 2016). At smaller spatial scales, mesoscale eddies and fronts have an impact on the MLD, as shown recently from observations (Gaube et al., 2019; Hausmann et al., 2017; Shroyer et al., 2020). The mixed layer is also highly variable on diurnal and synoptic timescales, as nighttime cooling or wind and wave events can drive significant deepening, while the mixed layer may quickly restratify during calm, warm periods through solar or dynamical mechanisms (Haney et al., 2012; Li et al., 2017; Li and Fox-Kemper, 2020). This lends a profoundly irreversible aspect to mixed-layer variability and connects high-frequency forcing to seasonal to decadal evolution of the layer by rectification.

For reliable projections of the future climate, the mixed-layer depth and its variability need to be well represented in numerical models (Semmler et al., 2021), but it is not the case presently (Belcher et al., 2012; Fox-Kemper et al., 2021; Li et al., 2019; Pan et al., 2023; Sallée et al., 2013). Analysis of the Coupled Model Intercomparison Project (CMIP) have revealed systematic biases: mixed layers were found to be too shallow in summer in CMIP5 (Huang et al., 2014; Sallée et al., 2013) and too deep in winter in CMIP5 and CMIP6 in some regions of deep-water formation (Fox-Kemper et al., 2021; Heuzé, 2021). In future projections using CMIP5 models, Alexander et al. (2018) find that the summer MLD decreases with strong anthropogenic forcing, a tendency that does not fit the trends observed during the historical period (Sallée et al., 2021) but is evident in the CMIP6 ensemble as well (Fox-Kemper et al., 2021).

A vast effort is currently underway to increase the spatial resolution of ocean climate models, in order to resolve mesoscale fronts and eddies, notably in the context of the High Resolution Model Intercomparison Project (HighResMIP; Haarsma et al., 2016). Positive impacts of resolving ocean eddies on the dynamics of western boundary currents, equatorial currents, and the Antarctic Circumpolar Current have been demonstrated in forced ocean models (Chassignet et al., 2020; Hecht and Hasumi, 2008) as well as in the recent HighResMIP coupled models (Beech et al., 2022; Grist et al., 2018; Roberts et al., 2018, 2020) for the representation of these currents and heat transports. However, Chassignet et al. (2020) did not find systematic improvements in salinity and temperature metrics with resolution. No assessment has been made yet regarding the representation of the MLD at the global scale. Evaluating mixed-layer characteristics in fully coupled models is a difficult task because the MLD depends not only on ocean characteristics (ocean circulation, parameterization of waves, turbulence, and vertical convection in each model) but also on the atmosphere and its variability (winds, air temperature, clouds, etc.); these differ considerably across CMIP models.

The Ocean Model Intercomparison Project (OMIP; Griffies et al., 2016) provides an ideal framework to evaluate the impact of ocean model resolution on the MLD, which is the purpose of our study. OMIP makes use of two common atmospheric and river runoff datasets to drive global ocean–sea-ice models. OMIP phase 1 (Griffies et al., 2009) was forced by the Coordinated Ocean-ice Reference Experiments phase-II (CORE-II) dataset (Large and Yeager, 2009), which is mainly derived from the National Centers for Environmental Prediction (NCEP) atmospheric reanalysis and covers a period of 62 years (1948–2009). The second phase (OMIP2) is based on the more recent JRA55-do forcing derived from the Japanese 55-year Reanalysis (Griffies et al., 2016; Tsujino et al., 2018) which covers the period 1958–2018. The increased temporal frequency and refined horizontal resolution makes JRA55-do more appropriate to force eddy-resolving ocean models. Chassignet et al. (2020) have used four pairs of OMIP2 simulations, integrated for one forcing cycle, to evaluate the impact of horizontal resolution on ocean kinetic energy, temperature, salinity, sea level, sea ice, meridional overturning circulation, and Drake Passage transport. The model pairs included a low-resolution (typically 1 grid spacing) and a high-resolution member (typically 1/10) with mostly comparable parameterization settings in each pair. A wide range of model variables have been assessed and compared with observations (temperature, salinity, sea surface height, eddy kinetic energy, sea ice), showing improvements at high resolution in some (but not all) of them.

We use the same experimental protocol as Chassignet et al. (2020), but additional models are included (up to 1/16 grid spacing, based on Iovino et al., 2016). The intercomparison of low-resolution models has shown that the MLD is very model-dependent (Tsujino et al., 2020). This provokes several questions, which we address here: is the MLD model-dependent in eddy-rich models? Are there improvements in the simulated MLD in regions of high mesoscale variability? The MLD is observed to be deeper in anticyclones (Gaube et al., 2019), although Hausmann et al. (2017) suggest that the net effect at large scales may be small. Is the MLD systematically deeper in eddy-rich models compared to non-eddying models?

The MLD is a nonlinear function of the density profile, and its statistics are not Gaussian (Johnson and Lyman, 2022), both of which create methodological difficulties for the evaluation of the models against observations as well as for model intercomparisons. This motivates an investigation of the influence of spatio-temporal sampling and MLD computation algorithms in Sect. 3, following the presentation of the models in Sect. 2. Section 4 presents the influence of resolution on the MLD biases at the global scale, and Sect. 5 focuses on water mass formation regions. Conclusions are presented in Sect. 6, where we also discuss an update to the OMIP–CMIP protocol regarding the diagnostic of the MLD that was proposed by Griffies et al. (2016).

2 Description of the model pairs

The spinup of the deep ocean occurs on centennial timescales (Griffies et al., 2009), which is why the OMIP protocol requires repeating the JRA55-do (years 1958 to 2018) forcing for six consecutive cycles (Tsujino et al., 2020). However such long simulations are usually too costly in computing time for the high-resolution models. Here, as in Chassignet et al. (2020), we consider only the first OMIP2 cycle, which is adequate for processes near the ocean surface. Only the last 30 years are analyzed to reduce spinup effects.

Table 1 summarizes some features of the model pairs relevant for our study. We use six model pairs, including the four model pairs described in Chassignet et al. (2020), whose naming convention (institution-ocean model name) we follow. Note that this naming convention differs from CMIP where a single “source name” is used for each model because some datasets used here are not published with the Earth System Grid Federation (ESGF). When relevant, the ESGF source name is also indicated in Table 1. FSU-HYCOM is a global configuration of the HYbrid Coordinate Ocean Model (HYCOM; Florida State University, FSU; Chassignet et al., 2003). Vertical mixing is parameterized by the K-profile parameterization (KPP; Large et al., 1994). The NCAR-POP model is based on the Parallel Ocean Program (POP; National Center for Atmospheric Research, NCAR; Smith et al., 2010). It is the global ocean component of the Community Earth System Model version 2 (CESM2; Danabasoglu et al., 2020), and the high-resolution version is documented by Chang et al. (2020). Vertical mixing is parameterized by KPP. There are two other parameterizations, targeted at mixed-layer dynamics, used in the low-resolution version of POP but not the high-resolution version: a parameterization of submesoscale eddy effects (Fox-Kemper et al., 2008, hereafter FFH; Fox-Kemper et al., 2011) and a parameterization of Langmuir turbulence (Li et al., 2016). AWI-FESOM is the Finite-Element/volumE Sea ice-Ocean Model (FESOM; Alfred Wegener Institute, AWI) version 1.4 (Wang et al., 2014). It differs from the other models considered here because of its unstructured grid. The low-resolution version has a resolution of 1 in most regions, up to 25 km in the polar seas, and 30 km at the Equator (0.127 million grid nodes on the horizontal); the high-resolution version has a grid scaled by the observed sea surface height variance, from 10 km in areas of high eddy activity to about 50 km elsewhere (1.3 million surface grid nodes; Sein et al., 2017). Maps of the grid resolution are shown in Fig. 1 of Chassignet et al. (2020). Vertical mixing is also represented by the KPP parameterization in this model. IAP-LICOM is a global configuration of the LASG/IAP Climate system Ocean Model (LICOM) version 3, developed by the State Key Laboratory of Numerical Modeling for Laboratory of Atmospheric Sciences and Geophysical Fluid Dynamics (LASG) of the Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (Li et al., 2020; Lin et al., 2020). Vertical mixing in the mixed layer for both momentum and tracers is computed by the scheme of Canuto et al. (2001, 2002) with an upper limit of 2 × 10−2m2 s−1. We do not include more details about these four model pairs because they are documented extensively by Chassignet et al. (2020).

ACCESS-MOM is the ocean component of the Australian Community Climate and Earth System Simulator (ACCESS) developed by a consortium of Australian universities and research institutes. It consists of the MOM5.1 ocean model (Modular Ocean Model; Griffies, 2012) coupled to the CICE5.1.2 sea-ice model (Community Ice Code; Hunke et al., 2015) at 1 and 0.1 nominal horizontal grid spacing. These are updates (see supporting information in Solodoch et al., 2022) of the configurations described by Kiss et al. (2020), and the 1 configuration is the ocean component of the ACCESS-CM2 climate model (Bi et al., 2020). The 0.1 grid is Mercator within 65 of the Equator, is tripolar north of 65 N, and has uniform meridional spacing south of 65 S. The 1 grid is similar but with a refinement to 1/3 meridional spacing within 10 of the Equator. The vertical coordinate is z, with 75 levels (1.1 m spacing at the surface; 198 m at depth) for the 0.1 configuration and 50 levels (2.3 m spacing at the surface; 220 m at depth) for the 1 configuration, with spacing increasing smoothly with depth to optimize resolution of baroclinic modes (Stewart et al., 2017). Both configurations use FFH and parameterize vertical mixing by KPP, bottom-enhanced internal tidal mixing (Simmons et al., 2004), and barotropic tidal mixing (Lee et al., 2006). Both configurations include background vertical diffusivity, a constant 10−6m2 s−1 at 0.1 grid spacing but increasing from 10−6m2 s−1 at the Equator to a constant 5×10-6m2 s−1 poleward of 20 (Jochum, 2009) at 1 grid spacing. The 1 configuration also has downslope mixing (Beckmann and Döscher, 1997; Campin and Goosse, 1999; Döscher and Beckmann, 2000).

CMCC-NEMO at low resolution is the ocean component of the CMCC Climate Model (CMCC-CM2; Cherchi et al., 2019), which is based on the Community Earth System Model (CESMv1.2), in which the original ocean component is replaced by the Nucleus for European Modelling of the Ocean (NEMO) version 3.6 (Madec and the NEMO team, 2016) that is coupled to the Community Ice Code (CICEv4.1; Hunke and Lipscomb, 2008) via the cpl7 coupling architecture. Discretization is performed on a tripolar quasi-isotropic grid with nominal 1 grid spacing, refined poleward following a Mercator projection and refined in latitude near the Equator (1/3). The vertical grid has 50 levels, with a thickness of 1 m at the surface, increasing to 400 m at depth. An iso-neutral lateral diffusivity, with a coefficient varying as the grid spacing, is used together with an eddy-induced velocity with a variable coefficient. Vertical mixing of momentum and tracers is performed by the TKE (turbulent kinetic energy) parameterization introduced by Blanke and Delecluse (1993), modified since then in NEMO to include Langmuir cells and the influence of surface wave breaking (Madec and the NEMO team, 2016). The high-resolution configuration version is largely based on its first implementation described in Iovino et al. (2016). It is based on NEMO version 3.6 coupled with the Louvain-la-Neuve sea-Ice Model (LIM) version 2 (Bouillon et al., 2009). It makes use of a non-uniform tripolar grid with a 1/16 horizontal grid spacing, which is 6.9 km at the Equator and increases poleward as the cosine of latitude (minimum grid spacing is  2 km around Victoria Island in the Arctic Ocean and a constant 3 km south of 60 S). There are 98 vertical levels, with a grid spacing of 1 m near the surface and 160 m at depth. Lateral mixing is parameterized by biharmonic viscosity and diffusion with a coefficient varying as the cube of the grid size. Vertical mixing uses the same TKE parameterization as in the low-resolution configuration.

Table 1Model characteristics (see text for more details). Consortia or institution names are as follows: Australian Community Climate and Earth System Simulator (ACCESS), Alfred Wegener Institute (AWI), Florida State University (FSU), Institute of Atmospheric Physics (IAP), National Center for Atmospheric Research (NCAR), and Centro Euro-Mediterraneo sui Cambiamenti Climatici (CMCC). Note that the MLD based on the online methods indicated in the last column is not used in this study. This column is intended to show the variety of methods used in model online MLD calculations which motivated us to recalculate MLD offline (see text). FGOALS: Flexible Global Ocean–Atmosphere–Land System.

Download Print Version | Download XLSX

As noted by Chassignet et al. (2020), the high-resolution models differ from their low-resolution counterparts by more than the horizontal grid spacing (and the parameterizations of lateral processes that depend on it). For CMCC-NEMO, the sea-ice models are not the same and they employ different bulk salinity, affecting the salt release from the sea ice into the ocean. For ACCESS-MOM, IAP-LICOM, and CMCC-NEMO, the vertical grid is finer in the high-resolution model. However, the parameterizations of vertical mixing are the same at low and high resolution, apart from the downslope mixing and latitudinal variation in background vertical diffusivity in ACCESS-MOM at low resolution. The MLD has been computed online in the models every time step, and monthly means have been saved. Unfortunately, as shown in Table 1, the computation has not been done following the OMIP protocol of Griffies et al. (2016); rather, each modeling group used a different method, and, in the case of NCAR, the computation is different in the low-resolution and high-resolution model. These different computation methods have an impact on the MLD, which can be avoided by recomputing the MLD from the monthly averages of temperature and salinity for the intercomparison of the models, but this raises non-trivial issues of sampling. These methodological questions are addressed in the following section.

3 Computing the mixed-layer depth from observations and models

3.1 Defining the depth of the mixed layer

Potential density is expected to be quasi-homogeneous in the mixed layer and increase downward in the stratified layers below. Beyond this rather vague statement, the concept of mixed layer is arbitrary (de Boyer Montégut et al., 2004, hereafter BM04), and, as a result, many definitions of the MLD have been proposed in the literature. The MLD may be computed using a threshold change in density or temperature (BM04), a threshold in density gradient (Dong et al., 2008), a maximum density gradient (Large et al., 1997), a maximum in the curvature of the density profile (Lorbacher et al., 2006), or a minimum of the relative variance (Huang et al., 2018) or based on energetic principles (Reichl et al., 2022). Before the Argo network started in the early 2000s, temperature thresholds were used extensively instead of density thresholds because temperature profile measurements were more numerous than salinity measurements. A threshold on density is more physically relevant because it is the density gradient that hinders the development of turbulence. A density threshold captures both temperature- and salinity-driven stratifications at the mixed-layer base and includes barrier layers.

However, in some regions there are density-compensated (vertical) gradients of temperature and salinity at the base of the mixed layer, where a density threshold may overestimate the MLD (BM04). Despite these “vertically compensated layers”, the potential density threshold method has been recommended by Griffies et al. (2016) to compute the MLD in OMIP and CMIP models, with a threshold value of 0.03 kg m−3. These authors advocate a uniform constant threshold, even though BM04 have pointed out that the density threshold should vary according to the sea surface temperature (SST). This is because the thermal expansion coefficient of water is smaller when the water is cold. For example, at a temperature of 0 C, a density change of 0.03 kg m−3 requires a temperature change of 0.6 C, which is a temperature variation greater than one would generally accept in a “well-mixed” layer. A spatially variable threshold may be appealing for observations but less so in the case of numerical models: a threshold dependent on each model's SST would make model intercomparisons more difficult. Also, the complexity of ensuring a spatially variable threshold was consistently applied in all models and observations is daunting. These considerations led to the choice of a fixed density threshold by Griffies et al. (2016).

MLD definitions other than the threshold method, such as gradient, curvature, or combinations of criteria (Holte and Talley, 2009), have not been proposed for OMIP because of their complexity or because of their possibly strong dependency on the vertical resolution in models with coarse vertical grids. Note however that the criterion of Large et al. (1997) has been used in some high-resolution models (Whitt et al., 2019) and that a simplified version of the Holte and Talley algorithm has recently been adapted to numerical models (Courtois et al., 2017). The potential-energy-based method proposed by Reichl et al. (2022) may have the advantage of being less sensitive to the model's vertical resolution than other methods, but more research is needed to understand how to choose a potential-energy threshold and whether it is possible to define one that would be relevant globally and for all seasons.

Figure 2Density profiles to illustrate the dependency of the MLD on the density threshold. Blue (red) dots are MLDs computed using a threshold of 0.01 (0.03) kg m−3, respectively. (a) An exponential stratified profile (continuous line) is mixed by wind- and wave-generated turbulence down to 50 m depth (dashed lines), in the absence of surface buoyancy input, generating a sharp density gradient. (b) Typical summer profile resulting from wind and wave mixing. (c) Typical winter profile resulting from buoyancy loss through surface fluxes, down to 200 m depth. (d) Spring profile with near-surface restratification.


Near the ocean surface, the density profile varies nonlinearly with depth, and the MLD is also a very nonlinear function of the density profile; these are two important reasons why the computation of the MLD can be strongly method-dependent. Figure 2 illustrates the sensitivity of MLD to the choice of the density threshold and how this sensitivity varies seasonally. When a stratified density profile is mixed by winds and waves, without any buoyancy input, the new density of the mixed layer is the depth average over that layer and a sharp gradient is created below (Fig. 2a). This is the mechanism that generates the so-called “transition layer”. In observations the thickness of this layer is controlled by mixing mechanisms such as shear instabilities and internal wave breaking (Johnston and Rudnick, 2009) and has been estimated to be 23 m on average at the global scale by Serazin et al. (2023). In models, the thickness of this transition layer needs interpretation including aspects of horizontal averaging of eddy features over grid cells as well (Danabasoglu et al., 2008). Because of this sharp density gradient just below the mixed layer in midlatitude summertime profiles, the MLD is not strongly dependent on the choice of a density threshold. Figure 2b shows that the MLDs computed with threshold densities of 0.01 and 0.03 kg m−3 are very similar (blue and red dots). Density profiles are different in winter: buoyancy is removed from the surface by atmospheric cooling, and deep mixed layers overlay a weaker stratification (note the different density scale in Fig. 2c and d compared to panels a and b). In Fig. 2c, the two density thresholds give distinct MLDs. In spring, warming at the surface creates thin stratified layers. Figure 2d illustrates a case where the restratified layer is captured by one density criterion but not the other, leading to very different MLDs (a difference on the order of 200 m).

3.2 Influence of the reference depth

Although the influence of the choice of density or temperature threshold has been extensively examined (e.g., BM04), another methodological choice has been overlooked, namely, the choice of the depth relative to which the threshold is computed. In most observation-based studies, a reference depth of 10 m is chosen to avoid capturing short events of shallow mixed layers that may occur during the day but will be mixed down again at night due to surface heat loss and convection (Brainerd and Gregg, 1995). Quoting BM04: “The MLD we want to estimate is the depth through which surface fluxes have been recently mixed and so integrated, recently meaning a timescale of at minimum a daily cycle, and no more than a few daily cycles.” The 10 m reference depth has been used by Holte and Talley (2009) and in general in all MLD estimates based on observations, except in the Arctic Ocean where mixed layers thinner than 10 m are found in summer under sea ice (Peralta-Ferriz and Woodgate, 2015). In models on the other hand, Griffies et al. (2016) recommend using the “surface” (that is, the top model level) as a reference depth. From Fig. 2d, it is easy to see that, in spring, changing the reference depth for the density threshold between 10 m and the surface may cause a large difference in MLD in models where the vertical resolution is sufficiently fine: if a shallow restratification that exceeds the density threshold is present above 10 m, the MLD computed with reference to the surface can be smaller than 10 m, while the MLD computed with reference to 10 m can reflect the deep winter mixed layer and be more than 100 m deeper. Furthermore, some models may include parameterizations of diurnal cycling (e.g., Large and Caron, 2015) that complicate the clear interpretation of the instantaneous sea surface properties but not those at 10 m. The question is whether such differences occur only locally and intermittently or whether they can affect the time-mean MLD computed in a climate model.

Figure 3Difference (m) between the MLD computed using two reference depths: 10 m and the top model level. The difference is shown for 2 months and two low-resolution OMIP2 models (climatology averaged over 1989–2018).

To assess the influence of the reference depth, we have recomputed the MLD using monthly datasets of temperature and salinity from two low-resolution OMIP2 models, NCAR-POP and CMCC-NEMO. A potential density threshold of 0.03 kg m−3 and two reference depths (the top model level and 10 m) are used. A monthly climatology of MLDs is then constructed by averaging the years 1989 to 2018. The MLD difference δref between the two reference depths is mapped in Fig. 3 for the months of May (when δref is the largest in the Northern Hemisphere) and September (when δref is the largest in the Southern Hemisphere). δref is not a noisy field but instead shows consistent patterns of differences in both models, which are probably related to the shape of the density profiles in different areas of the world ocean. The amplitude of δref is very dependent on the model vertical resolution near the surface (note the different color scale between the models in Fig. 3). The NCAR-POP model has its top level at 5 m depth, close enough to the 10 m reference: δref rarely exceeds 20 m for that model. In contrast, CMCC-NEMO has its top model level at 0.5 m, and δref is larger than 40 m over large areas of the ocean in the climatological average. Other models confirm this relationship between δref and the near-surface model resolution (not shown): the ACCESS-MOM model with a top level at 1.15 m has a δref similar to the CMCC model, while the IAP-LICOM model with a top model level at 5 m has a δref similar to the NCAR-POP model.

In conclusion, using the top model level as a reference depth instead of a reference at 10 m can make a significant and resolution-dependent difference (on the order of 40 m or more) in the monthly climatology of a model MLD when the top model level is close to the surface (about 1 m or less). Such differences cannot be ignored, being on the same order of magnitude as inter-model MLD differences found by Tsujino et al. (2020) in some regions and seasons. In this paper, we compute the MLD with a reference depth of 10 m, as advocated by BM04.

3.3 Nonlinearity of the MLD revealed by observations

Based on the above discussion, it is clear that in order to compare model MLDs with observational datasets it is necessary to use the same reference level (usually 10 m in profile-based MLD climatologies). However validating MLDs in models is further complicated by the fact that the MLD is a nonlinear function of the density profiles.

This nonlinearity manifests itself in observations when comparing estimates of MLD based on individual profiles with an MLD computed from climatological profiles; this effect has been documented by BM04. Let us denote a spatio-temporal average by 〈 〉, for example, the average over all profiles observed in a 2 box. BM04 compare the spatio-temporal average of MLDs computed from individual profiles 〈MLD(p)〉 with the MLD computed from the corresponding averaged density profile MLD(〈p〉) and show a global map of the relative difference (their Fig. 6). The profile-based average 〈MLD(p)〉 is generally deeper by about 25 %. BM04 provide a graphic explanation of this result by plotting observed profiles and the two MLD computations in a typical 2 box (their Fig. 7). The fact that 〈MLD(p)〉 tends to be systematically deeper than MLD(〈p〉) is due to the fact that near-surface stratifications, corresponding to shallow MLDs, are generally stronger than the deeper stratifications corresponding to deep MLDs. Although 〈MLD(p)〉 is the average of both shallow and deep MLDs computed on individual profiles, the averaged profile p has the imprint of the relatively strong near-surface stratification, and thus the MLD computed on this averaged profile is shallower than 〈MLD(p)〉). Opposite situations are found when the near-surface stratification is intermittent or weak enough so that its imprint on the averaged profile is smaller than the density threshold used to compute the MLD. Such situations are found for example in the subpolar gyres of the Northern Hemisphere in spring (BM04; their Fig. 6).

Figure 4Zonal-mean MLD (m) as a function of latitude in three different observational datasets (see text for details). The winter (summer) MLD is the average over the months of January to March in the Northern (Southern) Hemisphere and July to September in the Southern (Northern) Hemisphere, respectively. Holte dt is the MLD computed by the threshold method in the dataset of Holte et al. (2017).


We expect this consequence of MLD nonlinearity to show up in the comparison of a profile-based climatology of MLD (such as BM04) with the MLD computed from a gridded climatology of potential density from the In Situ Analysis System (ISAS; Gaillard et al., 2016). Zonal averages are plotted in Fig. 4 for two seasons. The MLD labeled “de Boyer” is updated from BM04 (de Boyer Montégut, 2022) and has been computed from individual profiles using a fixed density threshold of 0.03 kg m−3 relative to a depth of 10 m, for comparison with OMIP models. The same method has been used to compute the MLD from the monthly ISAS climatology of temperature and salinity. Despite the two products being based on a very similar set of hydrographic observations, both relying heavily on the Argo network, there is a systematic difference between the MLDs, with the zonal mean of the ISAS MLD (red curve in Fig. 4) being systematically smaller than de Boyer (black curve). This difference is in agreement with the findings of BM04 regarding the average of profile-based MLDs being very often deeper than the MLD of the average profile. It is also in agreement with the findings of Toyoda et al. (2017), who compared the MLD from reanalyses with both profile-based MLDs and MLDs computed from the World Ocean Atlas gridded climatology (their Fig. 1).

The choice of an observational dataset can influence model evaluations, especially for summer MLDs where differences between datasets are larger relative to the mean MLD. Huang et al. (2014) have evaluated the summer MLDs in 45 CMIP5 low-resolution models against the profile-based MLD climatology of BM04. For consistency all of the model MLDs were recomputed from monthly archives with a density threshold of 0.03 kg m−3 relative to a depth of 10 m. They find that the summer MLD is underestimated in the models. In the North Pacific and the North Atlantic, 80 % and 82 % (respectively) of the CMIP5 models have too shallow mixed layers compared with BM04 (their Figs. 6 and 7). Because the mixed layer computed from a gridded climatology is shallower than a profile-based one, using the ISAS climatology MLD would dramatically reverse their conclusion: only 24 % and 27 % of the CMIP5 models would underestimate the MLD relative to ISAS in the North Pacific and North Atlantic, respectively. One could argue that the latter result (an overestimation of the summer MLD by the models) is more relevant because the gridded climatology has a spatial resolution similar to the CMIP5 models and eddies are averaged out before the MLD is computed. However, one should note that the difference between the de Boyer and ISAS MLDs is smaller in the Southern Ocean relative to the mean values (averaged between 30 and 65 S, the summer MLD is 36 m for ISAS and 47 m for de Boyer) so that the conclusion of Huang et al. (2014) about CMIP5 models underestimating the Southern Ocean summer mixed layers is valid when using both observational datasets, contrary to the case of the North Pacific and the North Atlantic.

To illustrate how the choice of the MLD reference dataset impacts the interpretation of the model results, we have added in Fig. 4 the “threshold MLD” provided by Holte et al. (2017); it has been computed from individual profiles using the variable density threshold proposed by BM04 (a density jump equivalent to a temperature difference of 0.2 C at the profile location). This dataset has been used for the evaluation of MLDs in the CMIP models (Fox-Kemper et al., 2021). The threshold in the Holte MLD is larger than 0.03 kg m−3 where the SST is larger than 8–9 C, equatorward of 50 S and 50 N; thus the Holte MLD is larger than the de Boyer one (compare the dashed and the solid black curves). The influence of using a variable threshold there is to deepen the mixed layer by about 10 m, which is a relatively large difference in the tropics where the mixed layer is shallow. Southward of 50 S, where the SST is below 8–9 C, the Holte MLD is shallower than de Boyer because the density threshold is smaller than 0.03 kg m−3. Note that in Fig. 4, the Holte dataset is not comparable with the others north of 50 N because its zonal mean is not computed over the whole ocean area. Holte et al. (2017) provide MLDs binned into 1× 1 grid cells, with neither objective analysis nor addition of a climatology to fill up the cells where the number of profiles is not sufficient. The Holte dataset includes neither the Arctic, the Greenland Sea, the Baltic, nor any ice-covered region. This is also the case for the more recent GOSML dataset (Global Ocean Surface Mixed Layer; Johnson and Lyman, 2022). The non-global character of these two datasets makes them less suitable for comparison in zonal or large-scale mean with global models, compared with objectively analyzed datasets such as de Boyer Montégut (2022). Note that MLDs computed from climatologies have sometimes been preferred to profile-based MLDs for the evaluation of models; for example, Danabasoglu et al. (2014) and DuVivier et al. (2018) have compared the MLD in forced global models with an MLD computed from the World Ocean Atlas climatology.

3.4 Influence of the submonthly variability

Figures 3 and 4 show that the MLD depends on the method used to compute it. Although all the OMIP models used in this paper have computed the MLD online at every time step, the different methods and reference depths used (listed in Table 1) make it difficult to use these online MLDs for intercomparison purposes.

Recognizing this fact, many published model intercomparisons have not used the MLD provided by the modeling centers but rather have recomputed the MLDs from the monthly database of 3D temperature and salinity, in order to use a consistent MLD definition across models (Heuzé, 2021; Huang et al., 2014). Using a monthly archive means that the submonthly variability in the MLD, driven by the atmospheric synoptic variability and ocean eddies and fronts in high-resolution models, is not sampled. Furthermore, the nonlinearity of the MLD computation may lead to rectified effects, similar to those discussed above, resulting in systematic differences between the MLD computed at high frequency (time step or daily) vs. an MLD computed from monthly density profiles. Toyoda et al. (2017) have found that computation of MLD from monthly datasets of reanalyses underestimates the MLD by 10–20 m in early spring, compared to a computation based on daily datasets.

Figure 5Mixed-layer depth (m) (a, b) in the Gulf Stream region in FSU-HYCOM for the year 2018 and (c, d) in the Southern Ocean East Pacific sector in IAP-LICOM for the year 1998. (a, c) The MLD computed from one daily snapshot of temperature and salinity on 15 March. (b, d) The MLD computed using profiles averaged over the month of March.

Here we use daily output from two high-resolution models to quantify the effect of the submonthly variability. The MLD has been computed using the density threshold of 0.03 kg m−3 relative to 10 m depth, for both daily 3D fields (MLDd) and monthly averaged fields (MLDm), for 1 year. The year 2018 was available for FSU-HYCOM, and 1998 was available for IAP-LICOM. The first point to note is that the mesoscale imprint on the MLD is generally large. This creates a large spatial variability in MLD that is visible in both MLDd and MLDm (Fig. 5). Two regions are selected as examples: the Gulf Stream region in March 2018 (FSU-HYCOM, Fig. 5a and b) and the Southern Ocean in September 1998 (IAP-LICOM, Fig. 5c and d). The MLD computed from the monthly mean (Fig. 5b and d) is smoother than the MLD for a particular day of the month (Fig. 5a and c), as expected. However, most of the mesoscale spatial variability is still present in the MLD computed from the monthly mean. Some features cannot be captured at monthly resolution, such as the thin line of shallow mixed layers from 26 N, 65 W to 34 N, 57 W, which is potentially due to precipitation below an atmospheric front.

Figure 6Zonal mean of the difference between daily and monthly MLDs (MLDdMLDm; see text), in meters, averaged for each month. The computation is done in two high-resolution models: (a) FSU-HYCOM (year 2018) and (b) IAP-LICOM (year 1998).


Does the use of monthly density profiles lead to systematic differences in MLD due to the nonlinearity of the MLD computation? We have compared monthly maps of MLDd and MLDm (not shown). There are differences exceeding 100 m in some regions and some months, especially in the subpolar North Atlantic and the Southern Ocean in winter and spring. The zonal average of the difference MLDdMLDm is shown in Fig. 6 for each month. Overall, the difference is positive, consistent with the rectified effect mentioned above. A similar but slightly weaker effect was found by Toyoda et al. (2017) using a reanalysis with coarse (1) grid spacing (see their Fig. 2). The effect is very small in summer (about 2 m difference) for both models. It is more important in spring in the Northern Hemisphere, but in the zonal average, the difference is small relative to the average MLD and smaller than differences between the different observation-based datasets shown in Fig. 4.

For our model intercomparison, there is trade-off between using high-frequency online MLDs, but it is calculated with different methods vs. recomputing the MLD with a consistent method but a monthly database. Although the loss of submonthly variance can generate significant rectified effects (of order 100 m) in some months at some locations, the zonally averaged difference between MLDd and MLDm appears smaller than the differences caused by using different methods to compute MLD, justifying our choice to recompute MLD from monthly temperature and salinity fields whenever possible. We acknowledge that computing MLDs from monthly averaged T and S is not satisfying and that the only motivation to do so is to allow for the intercomparison of models for which the MLDs computed online are not comparable with each other.

Figure 7MLD biases (m) of models in winter, relative to de Boyer Montégut (2022). For each model, biases are shown for both (a, c, e) the low-resolution version and (b, d, f) the high-resolution version. The MLD has been recomputed from monthly archives, with the same potential density threshold and 10 m reference depth as the observations, except for CMCC-NEMO at high resolution where the MLD has been computed online using a reference level of 0.5 m (see text for details). The winter MLD is the average over the months of January to March in the Northern Hemisphere and July to September in the Southern Hemisphere.

4 MLD biases at the global scale

In this section, we compare the climatological MLD biases in the low-resolution and high-resolution models over the years 1989 to 2018. MLDs are recomputed from monthly archives, using a 0.03 kg m−3 potential density threshold and 10 m reference depth. However, we have kept the MLD computed online for the 1/16 CMCC model (an MLD referenced to the top model level instead of 10 m) because it has not been possible to recompute the MLD for that model. The MLD in the high-resolution models has been computed on each model native grid. For comparison with observations and with low-resolution models, the MLD has been coarsened by spatial averaging to a nominal 1 grid (coarsening by a factor of 10 for IAP-LICOM, ACCESS-MOM, and NCAR-POP; a factor of 12 for FSU-HYCOM; and a factor of 16 for CMCC-NEMO), before computing further spatial means or other statistics.

4.1 Winter mixed layers

Deep mixed layers observed at high latitudes at the end of the winter season receive much attention because they contribute to forming the water masses that enter the deep ocean and are isolated from the influence of the atmosphere over timescales of years to centuries. Low-resolution models have shown large differences in the amplitude and the location of MLD maxima. This has been demonstrated in the North Atlantic by Danabasoglu et al. (2014) with models forced by the CORE atmospheric forcing (similar to OMIP1) and confirmed by Tsujino et al. (2020) at the global scale with more recent models forced by both OMIP1 and OMIP2 forcings. The winter MLD biases are shown for the six model pairs used in this study (Fig. 7). For the low-resolution models, despite using newer versions of the models, the biases in the North Atlantic sector are large and comparable with the CORE intercomparison of Danabasoglu et al. (2014; see their Fig. 13). FSU-HYCOM overestimates the MLD in the Labrador, Irminger, and Greenland seas, as is the case for NCAR-POP but with weaker biases. AWI-FESOM MLD is too large in the Labrador Sea but less biased in the Nordic seas. CMCC-NEMO has more moderate biases in the Labrador and Irminger seas but too deep mixed layers in the Nordic seas. The IAP-LICOM and ACCESS-MOM models were not considered by Danabasoglu et al. (2014) and show rather shallow MLDs in the Labrador Sea, while the northeastern Atlantic and the Nordic seas have too deep mixed layers. In the Southern Ocean, NCAR-POP, CMCC-NEMO, and ACCESS-MOM show biases similar to the ones documented by Downes et al. (2015; their Fig. 1). FSU-HYCOM has very deep mixed layers close to Antarctica, a feature also noted by Downes et al. (2015). AWI-FESOM has lower biases in the Southern Ocean compared with the other models. IAP-LICOM seems to be an outlier, with overly shallow winter mixed layers in the Southern Ocean.

Tsujino et al. (2020) have used the same low-resolution models and some other ones to evaluate the impact of changing the atmospheric forcing from OMIP1 (similar to CORE) to OMIP2. The ensemble mean bias for winter MLD was found to be similar for OMIP1 and OMIP2 (their Fig. 11), suggesting that the biases arise from different model formulations and parameterizations rather than forcing. The different behavior of IAP-LICOM with too shallow mixed layers in the Southern Ocean and the Labrador Sea may be related to the vertical mixing scheme used in this model (see Table 1). This kind of scheme is usually employed in regional and fine resolution models, with a grid spacing of less than 1 km and time steps on the order of minutes. When it is applied in a climate model with coarse resolution and a large time step, the timescale of turbulent kinetic energy variability is not resolved (Reichl and Hallberg, 2018). Therefore, the related diffusivity may be significantly underestimated. Further tuning of its parameters may attenuate errors in MLDs.

MLD biases are generally reduced in the high-resolution models (Fig. 7). Low-resolution models tend to exhibit deep biases in some eddy-rich regions (Gulf Stream and Kuroshio extension, Agulhas basin) that seem reduced at high resolution, possibly pointing out a systematic effect of eddies on the restratification of the mixed layer. However, the amplitude and pattern of the changes are model-dependent. For example, the deep-mixed-layer bias in the Kuroshio extension region is clearly reduced in AWI-FESOM, ACCESS-MOM, and FSU-HYCOM at high resolution, but the other three model pairs have a lower bias at both resolutions in that region. In the Antarctic Circumpolar Current both deep and shallow biases are present at both resolutions. In the southeastern Pacific sector near 50 S, the shallow bias at low resolution in NCAR-POP and IAP-LICOM is replaced by a deep bias at high resolution; in contrast, the deep bias in ACCESS-MOM is reduced. This suggests that the MLD varies due to changes in the circulation and water masses between the different resolutions and not only due to the explicit representation of mesoscale eddies. One salient feature in Fig. 7 is the excessive convection that develops at high resolution in the Weddell Sea and in the Labrador Sea in the IAP-LICOM model. It is well known that although the Antarctic bottom water (AABW) is formed on the shelves and subsequently sinks to the bottom along the slope of the Antarctic continent, many climate models form AABW by open-ocean convection in the Weddell gyre and display unrealistic deep mixed layers there (Griffies et al., 2009; Heuzé, 2021). This loss of stratification in the Weddell gyre was thought to be dependent on eddy parameterizations in low-resolution models (Griffies et al., 2009). The fact that this problem appears in IAP-LICOM with resolved eddies will need further investigation.

Figure 8MLD biases (m) of models in summer, relative to de Boyer Montégut (2022). For each model, biases are shown for both (a, c, e) the low-resolution version and (b, d, f) the high-resolution version. The MLD has been recomputed from monthly archives, with the same potential density threshold and 10 m reference depth as the observations, except for CMCC-NEMO at high resolution where the MLD has been computed online using a reference level of 0.5 m (see text for details). The summer MLD is the average over the months of January to March in the Southern Hemisphere and July to September in the Northern Hemisphere.

4.2 Summer mixed layers

At midlatitudes and high latitudes, a strong near-surface stratification is created in summer due to positive surface heat flux, making the mixed layer shallow. Relatively shallow mixed layers dominate the tropical latitudes all year round (Fig. 4) so that the annual average of the mixed layer in the world ocean, computed from the de Boyer Montégut (2022) dataset, is only 53.4 m. Model biases are accordingly smaller in summer (Fig. 8) than in winter (Fig. 7). For low-resolution models, the main features that stand out in Fig. 8 are a tendency for too shallow summer mixed layers in the Southern Ocean and a band of too deep mixed layers around 5 N: this is similar to the multi-model bias shown – but not discussed – by Tsujino et al. (2020; their Fig. 12b). The Southern Ocean shallow bias is a long-standing problem that was pointed out by Griffies et al. (2009), who noted that in low-resolution models the bias was dependent on the lateral mixing scheme and the details of the implementation of the Gent–McWilliams parameterization in the surface layers. The bias is reduced but does not disappear in most of the high-resolution models, pointing to model deficiencies other than the parameterization of eddies. The shallow bias in the Southern Ocean is the largest in IAP-LICOM at both low and high resolution, suggesting again a role of the vertical mixing scheme. Other biases are more difficult to interpret because there are differences other than horizontal resolution between the two members of each model pair. In the case of NCAR-POP, the shallow bias is stronger in the high-resolution model; this may be due to the lack of a parameterization of Langmuir-induced mixing (a parameterization that was present in the low-resolution version; Table 1). In the case of CMCC-NEMO, the vertical mixing is exactly the same at both resolutions, but the mixed layer is too deep in the eddy-resolving simulation. The atmospheric forcing being the same, this deep bias must be due to changes in the circulation and the vertical profiles of temperature and salinity. A full analysis of such biases, similar to DuVivier et al. (2018) or Small et al. (2021), is beyond the scope of our paper. We hypothesize that changes in the stratification may result from the use of either different vertical resolutions or two distinct sea-ice models in the two members of the CMCC-NEMO pair, the latter resulting in large differences in sea-ice concentration.

The other region of strong biases is the latitude band near 5 N. As shown in Figs. 1 and 4, the observed MLD has a local maximum there between two minima, one at the Equator due to equatorial upwelling and one near 10 N, which marks the position of the Intertropical Convergence Zone (ITCZ) during the Northern Hemisphere summer. This local maximum at 5 N is overestimated to various degrees in the low-resolution models (Fig. 8). It is quite remarkable to note that this bias is reduced at high resolution in all the models. The amplitude of the bias at low resolution does not seem related in a straightforward manner to model parameters such as the vertical resolution (similar for ACCESS-MOM and CMCC-NEMO), the meridional resolution near the Equator (lower in IAP-LICOM compared to NCAR-POP), or the vertical mixing schemes. This region is characterized by a salinity control of the stratification at the base of the mixed layer (Helber et al., 2012; their Fig. 15c). An examination of profiles in the central Pacific suggests that the low-resolution models with the strongest deep bias underestimate the salinity increase at the mixed-layer base (not shown). The improvement in the MLD at high resolution may be due to both advection of salty water masses by the complex near-equatorial current system and to the lateral mixing by eddies and waves such as tropical instability waves. Similar challenges regarding modeling deep salinity maxima and their influence on the MLD, even at high resolution, have been pointed out by DuVivier et al. (2018), in the case of the “deep mixing band” of the Southern Ocean.

Figure 9Zonal-mean MLD biases and their dependency on model resolution. (a, b) Zonal-mean biases for two observation datasets (black curves) and for the range of low-resolution models (grey shading) and high-resolution models (red shading), for (a) the winter season and (b) the summer season. (c, d) The zonal mean of the difference between the MLD of the high-resolution models and the low-resolution models, for each model pair, for (c) winter and (d) summer. The dark-grey curve in panels (c) and (d) represents the zonal mean of the difference between the profile-based MLD climatology of de Boyer Montégut (2022) and the MLD computed from the ISAS climatology. For the CMCC model, the online MLD computation is used for both the high-resolution and the low-resolution model, for consistency.


4.3 Influence of model resolution on the zonal-mean biases

Figure 9a and b gives an overview of the zonal-mean MLD biases. The differences across models, represented by color shading, are larger than typical differences between observation-based datasets (ISAS and de Boyer are indicated in black lines, as examples). The range of biases is reduced in the eddy-resolving models compared with the low-resolution ones, but biases remain significant. We have demonstrated in the previous section that the spatially averaged MLD is usually larger when it is computed from density profiles sampled at high resolution (as in a profile-based climatology) compared with a computation on spatially averaged fields of density (a gridded climatology).

Could the sampling effect cause a systematic difference in MLDs in high- and low-resolution models? In that case, one would expect the difference between high-resolution and low-resolution MLDs to have some similarity with the difference between the profile-based MLD climatology of de Boyer and the MLD computed from the ISAS climatology. Figure 9c and d shows that it is not the case. In winter (Fig. 9c), the MLD differences between high- and low-resolution models are much larger than the differences between the two observation datasets. The MLDs at high resolution are shallower for most models and in most latitude bands (the high–low difference in Fig. 9 is negative, contrary to the positive difference between the two observational products). In summer (Fig. 9d), the high–low differences are also mostly negative north of 40 S, of opposite sign, and smaller than the difference between the observational datasets. South of 40 S, three models (CMCC-NEMO, IAP-LICOM, and ACCESS-MOM) have a deeper mixed layer at high resolution, while the other three models have shallower mixed layers. Thus, the MLD changes brought by the high resolution do not result from higher spatial sampling only. As mentioned above, the differences could have a dynamical origin (explicit representation of eddies at high resolution), but different parameterizations or different vertical grids for the two members of each model pair also contribute (Table 1).

Figure 10Seasonal cycle of MLD in three regions. (a) Map of the maximum MLD (de Boyer Montégut, 2022), indicating the three regions of deep-water formation (Labrador Sea, Irminger Sea, and Greenland Sea). (b–d) The seasonal cycle of the observed MLD (m) averaged over each region is plotted in black, and the high-resolution models are in color. The time axis is in months. The grey area is the range of the low-resolution model MLDs.

5 MLD biases in water mass formation regions

In this section, we consider in more detail the seasonal cycle and the maximum MLDs in regions of deep- and intermediate-water formation. Mesoscale and submesoscale dynamics play important roles in these regions, from preconditioning before convection events (Lherminier et al., 1999) to restratification after convection (Gelderloos et al., 2011). In the Northern Hemisphere, we define three regions, the Labrador Sea, Irminger Sea, and Greenland Sea, based on the map of maximum MLD in the de Boyer Montégut (2022) climatology (Fig. 10). In numerical simulations, convection may occur outside these areas, but here we focus on the way models (especially the high-resolution ones) have the capability to form water masses at the right location. Note that in these weakly stratified cold seas, the MLD based on the 0.03 kg m−3 density threshold is larger than MLDs based on other criteria (Courtois et al., 2017; BM04). Figure 10 compares the seasonal cycle of simulated MLDs with observations. In all regions, models capture the asymmetry of the seasonal cycle, with a slow deepening of the MLD in autumn and a rapid shallowing in spring. In the Greenland Sea, all the low-resolution models underestimate the winter MLD (grey shading). In fact, as shown in Fig. 7, these models have strongly positive MLD biases all along the warm and salty Norwegian Current, resulting in deeper mixed layers in the Norwegian Sea than in the Greenland Sea, contrary to observations. There is a marked reduction in these biases at high resolution (Figs. 7 and 10) for three models: NCAR-POP, FSU-HYCOM, and ACCESS-MOM. In the Labrador Sea, low-resolution models show a range of shallow and deep biases, but all the high-resolution models show deeper MLDs than the observations. In the Irminger Sea, almost all the low-resolution models tend to overestimate the MLD (Fig. 10, grey shading), while high-resolution models have biases of either sign. In these three regions where deep convection occurs, high resolution does not bring improvement uniformly across regions and models. There is no systematic reduction in uncertainties: in the Greenland Sea and the Irminger Sea, the spread of maximum MLD is larger across high-resolution models, compared to low-resolution ones.

Figure 11Seasonal cycle of the MLD in five regions. (a) Map of the maximum MLD (de Boyer Montégut, 2022), indicating the five regions of intermediate-water formation (subpolar East Atlantic, Gulf Stream, Kuroshio, South Pacific, and South Australia). (b–f) The seasonal cycle of the observed MLD (m) averaged over each region is plotted in black, and the high-resolution models are in color. The time axis is in months. The grey area is the range of the low-resolution model MLDs.

Let us consider regions of intermediate- or mode-water formation, outlined in Fig. 11. These regions are also regions of active mesoscale turbulence, especially the region of 18 C water formation south of the Gulf Stream, the Kuroshio region, and the band of deep MLDs on the northern flank of the Antarctic Circumpolar Current in the southeastern Pacific and South Australian sectors, where Subantarctic Mode Water (SAMW) are formed. We also consider the subpolar East Atlantic, a region where eddy activity is less intense but which plays an important part in the formation of North Atlantic subpolar mode waters and preconditioning of deep convection in the Labrador and Irminger seas (Garcia-Ibañez et al., 2015). The high-resolution models have been shown to represent well the eddy variability compared with satellite observations (Chassignet et al., 2020; Iovino et al., 2016). A substantial improvement is found in the seasonal cycle of MLD at high resolution in the three regions of the Northern Hemisphere. The spread of the models is much reduced compared to low-resolution models, suggesting that intermediate-water formation mechanisms could be much more robust at high resolution. For example, the excessively deep MLDs that occurred in the low-resolution models in the subpolar East Atlantic disappear at high resolution. On the other hand, there is no improvement at high resolution in the two Southern Ocean regions, which are part of the deep mixing band analyzed by DuVivier et al. (2018) using Argo observations and the NCAR-POP model at low and high resolution. Their simulations use the CORE (OMIP1) forcing, rather than the JRA55-do reanalysis of OMIP2, but overall they document biases very similar to ours, which is expected considering the robustness of winter MLD to the forcing (OMIP1 or OMIP2) demonstrated by Tsujino et al. (2020). DuVivier et al. (2018) find a relationship between the deep mixing band and the stratification and argue that a different representation of the subsurface salinity minimum is the cause of the difference between the two NCAR-POP models. The salinity minimum has a more realistic strength in the high-resolution POP, due to the explicit representation of eddies and a more realistic circulation (the Agulhas retroflexion, for example), but the salinity maximum is too deep, which causes the winter MLD to be overestimated (DuVivier et al., 2018). The complex interplay between circulation, eddies, and water mass distribution in the Southern Ocean is clearly a challenge for the models considered here, even at high resolution.

6 Conclusions

Increasing the horizontal resolution in ocean models often produces mixed results, rather than a universal improvement in the model solution (Chassignet et al., 2020). The same is true in our investigation of the MLD in the OMIP models. While the MLD biases are generally reduced by refining the horizontal grid, spurious deep winter mixed layers are found in the high-resolution models at some locations, for example in the Weddell Sea for IAP-LICOM and in the Labrador Sea for ACCESS-MOM. It is encouraging to see that when eddies are resolved at high resolution, the inter-model differences are reduced in some eddy-rich regions (Gulf Stream and Kuroshio). Unfortunately, it is the case neither in the Antarctic Circumpolar Current nor in the subpolar regions: increasing the spatial resolution does not improve the robustness of the model solution uniformly over the world ocean. MLD biases differ more across models than across resolution. The different biases come from differences in model formulations, vertical resolution, and parameterizations. Perhaps because of its nonlinear character, the MLD is a variable that is not well constrained by the imposed OMIP atmospheric forcing, contrary to the sea surface temperature (SST). The inter-model difference in SST in OMIP is relatively small (Tsujino et al., 2020) compared with SST differences across CMIP coupled models such as those displayed by Eyring et al. (2021; their Fig. 3.3). On the other hand, the inter-model differences in MLD in our model pairs are similar to those found in CMIP6 models (Heuzé, 2021). It is important to note that the mixed-layer heat content, an important variable for climate, is not well constrained by the SST alone and that the mixed-layer depth depends on internal ocean dynamics. As noted by Griffies et al. (2009) and Danabasoglu et al. (2014), OMIP experiments reveal the key influence of these ocean dynamics (and their representation in models) on major climate-relevant processes such as mixed-layer properties, water mass subduction, and meridional overturning.

The zonal average of the MLD difference between the high- and low-resolution models, displayed in Fig. 9, shows that the MLD is generally shallower in high-resolution models. This difference cannot be due to a rectified effect of the sampling, which would make the mixed layer deeper at high resolution, as pointed out by BM04 and demonstrated in Sect. 3. The zonal-mean global view suggests that the role of mesoscale eddies in restratifying the mixed layer is a dominant factor to explain the dependency of MLD on horizontal resolution and that this effect of eddies is not parameterized adequately in the low-resolution models. Note that by considering the spatial modulation of the MLD by eddies in observations, Gaube et al. (2019) reached the opposite conclusion. They argue that the MLD is globally deeper due to the presence of eddies. There is no contradiction here because evaluating the contrast between MLD inside and outside the eddies, as done by Gaube et al. (2019), does not reveal the dynamical role of eddies on the stratification. The simplicity of the zonal-mean view, however, may be misleading because it masks compensating biases of both signs. MLD biases depend not only on the explicit representation of eddies but also on the circulation in relation to water masses that control the stratification at the mixed-layer base. This is especially true in the Southern Ocean.

Regarding the importance of eddies for restratification, our results are still limited by the fact that even our high-resolution models do not represent submesoscale dynamics. Submesoscale motions are more efficient than mesoscales for restratifying the mixed layer (e.g., Fox-Kemper et al., 2008; Mensa et al., 2013) so that the shoaling of the MLD with resolution could be even larger if higher-resolution models were considered. However, our model intercomparison does not suggest a dominant role of the parameterization of submesoscales, as models in this ensemble with and without the Fox-Kemper parameterizations display similar biases. Thus, other drivers of errors such as numerics, mixing parameterizations, vertical resolution, or the impact of the different definitions for modeled MLD (monthly archive of horizontally gridded values) and observed MLD (from profiles localized in space and time) exceed the magnitude of impact of these parameterizations in many regions in these models.

The eddy-rich models considered here are promising tools to study mixed-layer statistics. Johnson and Lyman (2022) have recently published the GOSML dataset of MLD statistics based on Argo data. They find that the distribution of MLD is non-Gaussian, with large skewness and kurtosis that vary seasonally and spatially. The MLD variance also displays seasonal variations and depends on the MLD itself (regions with large MLDs have a large MLD variance). We have found similar properties of the MLD statistics in the high-resolution models, but the MLD variance computed from monthly averages is low compared with the observations. The MLD varies at high frequencies due to diurnal and synoptic atmospheric forcing (Whitt et al., 2019), which means that archives of MLD at high frequency will be needed to validate higher-order statistics of the mixed layer in eddy-rich models. The stratification at the base of the mixed layer is also an important feature that needs to be well simulated in models. Serazin et al. (2023) show that this density gradient, the upper ocean pycnocline, has a small thickness (a median value of 23 m globally). The increased vertical resolution of some eddy-rich models (98 levels for CMCC-NEMO, for example) could bring improvements in the representation of processes at the mixed-layer base.

An important conclusion of our work is that the protocol for computing the MLD in OMIP and CMIP (Griffies et al., 2016) needs revision. Firstly, the density jump should be computed with reference to a depth of 10 m, not the top model level, because the vertical grids differ in different models, and we have demonstrated that in some seasons a difference of less than 10 m in the reference depth can lead to more than 40 m difference in the MLD climatology. Secondly, all models should use the density threshold of 0.03 kg m−3 (or the corresponding buoyancy threshold; Griffies et al., 2016) to facilitate intercomparisons, and these models should be evaluated against observational products formulated consistently with this definition (e.g., de Boyer Montégut, 2022). Finally, this variable should be stored at high frequency (hourly or daily) and calculated online for averages, variances, or other statistics retained only occasionally. For models with very high vertical resolution near the surface (such as CMCC-NEMO and ACCESS-MOM) the computation of the MLD relative to the top model level would be an interesting additional diagnostic at hourly frequency, as it would resolve the diurnal cycle of the MLD and allow for the assessment of its rectified impact at longer timescales, but observations different than the profile-based ones used here would be needed to evaluate this metric.

Code and data availability

The dataset produced by de Boyer Montégut (2022) is published in the SEANOE repository ( The following OMIP model output data, published with the Earth System Grid Federation, have been used: ACCESS-OM2 (; Hayashida et al., 2021), CESM2 (; Danabasoglu, 2019), CMCC-CM2-SR5 (; Fogli et al., 2020), FGOALS-f3-H (; Lin, 2020), and FGOALS-f3-L (; Lin, 2019). The 0.1 ACCESS-MOM data are available at (Kiss et al., 2022). The data used to produce the figures and the corresponding Python notebook are archived on Zenodo at (Treguier et al., 2023).

Author contributions

AMT coordinated the conceptualization, carried out the investigation, and wrote the first draft of the manuscript. EPC, AB, XX, AMH, DI, AEK, YL, PL, HL, DS, QW, and SY provided model-based datasets and expertise for the interpretation of diagnostics. CdBM prepared the dataset used to validate the models. AMT, BFK, CdBM, EPC, AMH, DI, AEK, JLS, HL, GS, DS, QW, and SY participated in the conceptualization, editing, and reviewing of the manuscript.

Competing interests

At least one of the (co-)authors is a member of the editorial board of Geoscientific Model Development. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


We thank the present and past members of the Climate and Ocean – Variability, Predictability, and Change (CLIVAR) Ocean Model Development Panel who have designed and supported the Ocean Model Intercomparison Project. This work is a contribution to the MixED Layer hEterogeneitY (MEDLEY) project. MEDLEY has received funding from the Joint Programming Initiative (JPI) Climate and JPI Oceans programs under the 2019 joint call, managed by the French Agence Nationale de la Recherche (contract no. 19-JPOC-0001-01). The contributions of Baylor Fox-Kemper are supported by NOAA (grant no. NA19OAR4310366) and the National Science Foundation (NSF) (grant no. 2148945). The contribution of Yiwen Li, Pengfei Lin, and Hailong Liu are supported by the National Natural Science Foundation of China (NSFC) (grant no. 41931183). The contributions of Andrew E. Kiss are supported by the Australian Research Council (ARC) (grant no. LP160100073) and the Australian Government's Australian Antarctic Science Grant Program. The Consortium for Ocean-Sea Ice Modelling in Australia (COSIMA) (, last access: 2 September 2022) provides the ACCESS-OM2 model suite using resources of the National Computational Infrastructure, which is supported by the Australian Government.

Financial support

This research has been supported by the Agence Nationale de la Recherche (grant no. 19-JPOC-0001-01), NOAA (grant no. NA19OAR4310366), the National Science Foundation (NSF) (grant no. 2148945), the National Natural Science Foundation of China (NSFC) (grant no. 41931183), and the Australian Research Council (ARC) (grant no. LP160100073).

Review statement

This paper was edited by Riccardo Farneti and reviewed by A. J. George Nurser and Stephen M. Griffies.


Alexander, M. A., Scott, J. D., Friedland, K. D., Mills, K. E., Nye, J. A., Pershing, A. J., and Thomas, A. C.: Projected sea surface temperatures over the 21st century: Changes in the mean, variability and extremes for large marine ecosystem regions of Northern Oceans, Elem. Sci. Anthr., 6, 9,, 2018. 

Beckmann, A. and Döscher, R.: A Method for Improved Representation of Dense Water Spreading over Topography in Geopotential-Coordinate Models, J. Phys. Oceanogr., 27, 581–591,<0581:AMFIRO>2.0.CO;2, 1997. 

Beech, N., Rackow, T., Semmler, T., Danilov, S., Wang, Q., and Jung, T.: Long-term evolution of ocean eddy activity in a warming world, Nat. Clim. Change, 12, 910–917,, 2022. 

Belcher, S. E., Grant, A. L. M., Hanley, K. E., Fox-Kemper, B., Van Roekel, L., Sullivan, P. P., Large, W. G., Brown, A., Hines, A., Calvert, D., Rutgersson, A., Pettersson, H., Bidlot, J.-R., Janssen, P. A. E. M., and Polton, J. A.: A global perspective on Langmuir turbulence in the ocean surface boundary layer, Geophys. Res. Lett., 39, L18605,, 2012. 

Bi, D., Dix, M., Marsland, S., O'Farrell, S., Sullivan, A., Bodman, R., Law, R., Harman, I., Srbinovsky, J., Rashid, H. A., Dobrohotoff, P., Mackallah, C., Yan, H., Hirst, A., Savita, A., Dias, F. B., Woodhouse, M., Fiedler, R., and Heerdegen, A.: Configuration and spin-up of ACCESS-CM2, the new generation Australian Community Climate and Earth System Simulator Coupled Model, J. South. Hemisphere Earth Syst. Sci., 70, 225–251,, 2020. 

Blanke, B. and Delecluse, P.: Variability of the Tropical Atlantic Ocean Simulated by a General Circulation Model with Two Different Mixed-Layer Physics, J. Phys. Oceanogr., 23, 1363–1388,<1363:VOTTAO>2.0.CO;2, 1993. 

Bouillon, S., Morales Maqueda, M. Á., Legat, V., and Fichefet, T.: An elastic–viscous–plastic sea ice model formulated on Arakawa  B and C grids, Ocean Model., 27, 174–184,, 2009. 

Brainerd, K. E. and Gregg, M. C.: Surface mixed and mixing layer depths, Deep-Sea Res. Pt. I, 42, 1521–1543,, 1995. 

Campin, J.-M. and Goosse, H.: Parameterization of density-driven downsloping flow for a coarse-resolution ocean model in z-coordinate, Tellus A, 51, 412–430,, 1999. 

Canuto, V. M., Howard, A., Cheng, Y., and Dubovikov, M. S.: Ocean Turbulence. Part I: One-Point Closure Model—Momentum and Heat Vertical Diffusivities, J. Phys. Oceanogr., 31, 1413–1426,<1413:OTPIOP>2.0.CO;2, 2001. 

Canuto, V. M., Howard, A., Cheng, Y., and Dubovikov, M. S.: Ocean Turbulence. Part II: Vertical Diffusivities of Momentum, Heat, Salt, Mass, and Passive Scalars, J. Phys. Oceanogr., 32, 240–264,<0240:OTPIVD>2.0.CO;2, 2002. 

Chang, P., Zhang, S., Danabasoglu, G., Yeager, S. G., Fu, H., Wang, H., Castruccio, F. S., Chen, Y., Edwards, J., Fu, D., Jia, Y., Laurindo, L. C., Liu, X., Rosenbloom, N., Small, R. J., Xu, G., Zeng, Y., Zhang, Q., Bacmeister, J., Bailey, D. A., Duan, X., DuVivier, A. K., Li, D., Li, Y., Neale, R., Stössel, A., Wang, L., Zhuang, Y., Baker, A., Bates, S., Dennis, J., Diao, X., Gan, B., Gopal, A., Jia, D., Jing, Z., Ma, X., Saravanan, R., Strand, W. G., Tao, J., Yang, H., Wang, X., Wei, Z., and Wu, L.: An Unprecedented Set of High-Resolution Earth System Simulations for Understanding Multiscale Interactions in Climate Variability and Change, J. Adv. Model. Earth Sy., 12, e2020MS002298,, 2020. 

Chassignet, E. P., Smith, L. T., Halliwell, G. R., and Bleck, R.: North Atlantic Simulations with the Hybrid Coordinate Ocean Model (HYCOM): Impact of the Vertical Coordinate Choice, Reference Pressure, and Thermobaricity, J. Phys. Oceanogr., 33, 2504–2526,<2504:NASWTH>2.0.CO;2, 2003. 

Chassignet, E. P., Yeager, S. G., Fox-Kemper, B., Bozec, A., Castruccio, F., Danabasoglu, G., Horvat, C., Kim, W. M., Koldunov, N., Li, Y., Lin, P., Liu, H., Sein, D. V., Sidorenko, D., Wang, Q., and Xu, X.: Impact of horizontal resolution on global ocean–sea ice model simulations based on the experimental protocols of the Ocean Model Intercomparison Project phase 2 (OMIP-2), Geosci. Model Dev., 13, 4595–4637,, 2020. 

Cherchi, A., Fogli, P. G., Lovato, T., Peano, D., Iovino, D., Gualdi, S., Masina, S., Scoccimarro, E., Materia, S., Bellucci, A., and Navarra, A.: Global Mean Climate and Main Patterns of Variability in the CMCC-CM2 Coupled Model, J. Adv. Model. Earth Sy., 11, 185–209,, 2019. 

Courtois, P., Hu, X., Pennelly, C., Spence, P., and Myers, P. G.: Mixed layer depth calculation in deep convection regions in ocean numerical models, Ocean Model., 120, 60–78,, 2017. 

Danabasoglu, G.: NCAR CESM2 model output prepared for CMIP6 OMIP omip1, WCRP [data set],, 2019. 

Danabasoglu, G., Ferrari, R., and McWilliams, J. C.: Sensitivity of an Ocean General Circulation Model to a Parameterization of Near-Surface Eddy Fluxes, J. Climate, 21, 1192–1208,, 2008. 

Danabasoglu, G., Yeager, S. G., Bailey, D., Behrens, E., Bentsen, M., Bi, D., Biastoch, A., Böning, C., Bozec, A., Canuto, V. M., Cassou, C., Chassignet, E., Coward, A. C., Danilov, S., Diansky, N., Drange, H., Farneti, R., Fernandez, E., Fogli, P. G., Forget, G., Fujii, Y., Griffies, S. M., Gusev, A., Heimbach, P., Howard, A., Jung, T., Kelley, M., Large, W. G., Leboissetier, A., Lu, J., Madec, G., Marsland, S. J., Masina, S., Navarra, A., Nurser, A. J. G., Pirani, A., Mélia, D. S. Y., Samuels, B. L., Scheinert, M., Sidorenko, D., Treguier, A.-M., Tsujino, H., Uotila, P., Valcke, S., Voldoire, A., and Wang, Q.: North Atlantic simulations in Coordinated Ocean-ice Reference Experiments phase II (CORE-II). Part I: Mean states, Ocean Model., 73, 76–107,, 2014. 

Danabasoglu, G., Lamarque, J.-F., Bacmeister, J., Bailey, D. A., DuVivier, A. K., Edwards, J., Emmons, L. K., Fasullo, J., Garcia, R., Gettelman, A., Hannay, C., Holland, M. M., Large, W. G., Lauritzen, P. H., Lawrence, D. M., Lenaerts, J. T. M., Lindsay, K., Lipscomb, W. H., Mills, M. J., Neale, R., Oleson, K. W., Otto-Bliesner, B., Phillips, A. S., Sacks, W., Tilmes, S., van Kampenhout, L., Vertenstein, M., Bertini, A., Dennis, J., Deser, C., Fischer, C., Fox-Kemper, B., Kay, J. E., Kinnison, D., Kushner, P. J., Larson, V. E., Long, M. C., Mickelson, S., Moore, J. K., Nienhouse, E., Polvani, L., Rasch, P. J., and Strand, W. G.: The Community Earth System Model Version 2 (CESM2), J. Adv. Model. Earth Sy., 12, e2019MS001916,, 2020. 

D'Asaro, E. A.: Turbulence in the Upper-Ocean Mixed Layer, Annu. Rev. Mar. Sci., 6, 101–115,, 2014. 

de Boyer Montégut, C.: Mixed layer depth climatology computed with a density threshold criterion of 0.03 kg/m3 from 10 m depth value, SEANOE [data set],, 2022. 

de Boyer Montégut, C., Madec, G., Fisher, A. S., Lazar, A., and Iudicone, D.: Mixed layer depth over the global ocean: An examination of profile data and a profile-based climatology, J. Geophys. Res., 109, C12003,, 2004. 

de Boyer Montégut, C., Vialard, J., Shenoi, S. S. C., Shankar, D., Durand, F., Ethé, C., and Madec, G.: Simulated Seasonal and Interannual Variability of the Mixed Layer Heat Budget in the Northern Indian Ocean, J. Climate, 20, 3249–3268,, 2007. 

Dong, S., Sprintall, J., Gille, S. T., and Talley, L.: Southern Ocean mixed-layer depth from Argo float profiles, J. Geophys. Res.-Oceans, 113, C06013,, 2008. 

Döscher, R. and Beckmann, A.: Effects of a Bottom Boundary Layer Parameterization in a Coarse-Resolution Model of the North Atlantic Ocean, J. Atmos. Ocean. Tech., 17, 698–707,<0698:EOABBL>2.0.CO;2, 2000. 

Downes, S. M., Farneti, R., Uotila, P., Griffies, S. M., Marsland, S. J., Bailey, D., Behrens, E., Bentsen, M., Bi, D., Biastoch, A., Böning, C., Bozec, A., Canuto, V. M., Chassignet, E., Danabasoglu, G., Danilov, S., Diansky, N., Drange, H., Fogli, P. G., Gusev, A., Howard, A., Ilicak, M., Jung, T., Kelley, M., Large, W. G., Leboissetier, A., Long, M., Lu, J., Masina, S., Mishra, A., Navarra, A., Nurser, A. J. G., Patara, L., Samuels, B. L., Sidorenko, D., Spence, P., Tsujino, H., Wang, Q., and Yeager, S. G.: An assessment of Southern Ocean water masses and sea ice during 1988–2007 in a suite of interannual CORE-II simulations, Ocean Model., 94, 67–94,, 2015. 

DuVivier, A. K., Large, W. G., and Small, R. J.: Argo Observations of the Deep Mixing Band in the Southern Ocean: A Salinity Modeling Challenge, J. Geophys. Res.-Oceans, 123, 7599–7617,, 2018. 

Eyring, V., Gillet, N. P., Achuta Rao, K. M., Barimalala, R., Barreiro Parillo, M., Bellouin, N., Cassou, C., and Durack, P. J.: Human Influence on the Climate System, in: Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change, Cambridge University Press, Cambridge, United Kingdom and New York, 423–552, 2021. 

Fogli, P. G., Iovino, D., and Lovato, T.: CMCC CMCC-CM2-SR5 model output prepared for CMIP6 OMIP omip1, WCRP [data set],, 2020. 

Fox-Kemper, B., Ferrari, R., and Hallberg, R.: Parameterization of Mixed Layer Eddies. Part I: Theory and Diagnosis, J. Phys. Oceanogr., 38, 1145–1165,, 2008. 

Fox-Kemper, B., Danabasoglu, G., Ferrari, R., Griffies, S. M., Hallberg, R. W., Holland, M. M., Maltrud, M. E., Peacock, S., and Samuels, B. L.: Parameterization of mixed layer eddies. III: Implementation and impact in global ocean climate simulations, Model. Underst. Ocean Mesoscale Submesoscale, 39, 61–78,, 2011. 

Fox-Kemper, B., Hewitt, H. T., Xiao, C., Adalgeirsdottir, G., Drijfhout, S. S., Edwards, T. L., Golledge, N. R., Hemer, M., Kopp, R. E., Krinner, G., Mix, A., Notz, D., Nowicki, S., Nurhati, I. S., Ruiz, L., Sallee, J.-B., Slangen, A. B. A., and Yu, Y.: Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change, edited by: Masson-Delmotte, V., Zhai, P., Pirani, A., Connors, S. L., Péan, C., Berger, S., Caud, N., Chen, Y., Goldfarb, L., Gomis, M. I., Huang, M., Leitzell, K., Lonnoy, E., Matthews, J. B. R., Maycock, T. K., Waterfield, T., Yelekçi, O., Yu, R., and Zhou, B., Cambridge University Press, United Kingdom and New York, NY, USA, 1211–1362,, 2021. 

Gaillard, F., Reynaud, T., Thierry, V., Kolodziejczyk, N., and von Schuckmann, K.: In Situ–Based Reanalysis of the Global Ocean Temperature and Salinity with ISAS: Variability of the Heat Content and Steric Height, J. Climate, 29, 1305–1323,, 2016. 

García-Ibáñez, M. I., Pardo, P. C., Carracedo, L. I., Mercier, H., Lherminier, P., Ríos, A. F., and Pérez, F. F.: Structure, transports and transformations of the water masses in the Atlantic Subpolar Gyre, Prog. Oceanogr., 135, 18–36,, 2015. 

Gaube, P., J. McGillicuddy Jr., D., and Moulin, A. J.: Mesoscale Eddies Modulate Mixed Layer Depth Globally, Geophys. Res. Lett., 46, 1505–1512,, 2019. 

Gelderloos, R., Katsman, C. A., and Drijfhout, S. S.: Assessing the Roles of Three Eddy Types in Restratifying the Labrador Sea after Deep Convection, J. Phys. Oceanogr., 41, 2102–2119,, 2011. 

Griffies, S. M.: Elements of the Modular Ocean Model (MOM) (2012 release with updates), NOAA/Geophysical Fluid Dynamics Laboratory, Princeton, USA, 2012. 

Griffies, S. M., Biastoch, A., Böning, C., Bryan, F., Danabasoglu, G., Chassignet, E. P., England, M. H., Gerdes, R., Haak, H., Hallberg, R. W., Hazeleger, W., Jungclaus, J., Large, W. G., Madec, G., Pirani, A., Samuels, B. L., Scheinert, M., Gupta, A. S., Severijns, C. A., Simmons, H. L., Treguier, A. M., Winton, M., Yeager, S., and Yin, J.: Coordinated Ocean-ice Reference Experiments (COREs), Ocean Model., 26, 1–46,, 2009. 

Griffies, S. M., Danabasoglu, G., Durack, P. J., Adcroft, A. J., Balaji, V., Böning, C. W., Chassignet, E. P., Curchitser, E., Deshayes, J., Drange, H., Fox-Kemper, B., Gleckler, P. J., Gregory, J. M., Haak, H., Hallberg, R. W., Heimbach, P., Hewitt, H. T., Holland, D. M., Ilyina, T., Jungclaus, J. H., Komuro, Y., Krasting, J. P., Large, W. G., Marsland, S. J., Masina, S., McDougall, T. J., Nurser, A. J. G., Orr, J. C., Pirani, A., Qiao, F., Stouffer, R. J., Taylor, K. E., Treguier, A. M., Tsujino, H., Uotila, P., Valdivieso, M., Wang, Q., Winton, M., and Yeager, S. G.: OMIP contribution to CMIP6: experimental and diagnostic protocol for the physical component of the Ocean Model Intercomparison Project, Geosci. Model Dev., 9, 3231–3296,, 2016. 

Grist, J. P., Josey, S. A., New, A. L., Roberts, M., Koenigk, T., and Iovino, D.: Increasing Atlantic Ocean Heat Transport in the Latest Generation Coupled Ocean-Atmosphere Models: The Role of Air-Sea Interaction, J. Geophys. Res.-Oceans, 123, 8624–8637,, 2018. 

Haarsma, R. J., Roberts, M. J., Vidale, P. L., Senior, C. A., Bellucci, A., Bao, Q., Chang, P., Corti, S., Fučkar, N. S., Guemas, V., von Hardenberg, J., Hazeleger, W., Kodama, C., Koenigk, T., Leung, L. R., Lu, J., Luo, J.-J., Mao, J., Mizielinski, M. S., Mizuta, R., Nobre, P., Satoh, M., Scoccimarro, E., Semmler, T., Small, J., and von Storch, J.-S.: High Resolution Model Intercomparison Project (HighResMIP v1.0) for CMIP6, Geosci. Model Dev., 9, 4185–4208,, 2016. 

Haney, S., Bachman, S., Cooper, B., Kupper, S., McCaffrey, K., Van Roekel, L., Stevenson, S., Fox-Kemper, B., and Ferrari, R.: Hurricane wake restratification rates of one-, two- and three-dimensional processes, J. Mar. Res., 70, (last access: 1 July 2023), 2012. 

Hausmann, U., McGillicuddy Jr., D. J., and Marshall, J.: Observed mesoscale eddy signatures in Southern Ocean surface mixed-layer depth, J. Geophys. Res.-Oceans, 122, 617–635,, 2017. 

Hayashida, H., Kiss, A., Hogg, A., Hannah, N., Dias, F. B., Brassington, G., Chamberlain, M., Chapman, C., Dobrohotoff, P., Domingues, C. M., Duran, E., England, M., Fiedler, R., Griffies, S. M., Heerdegen, A., Heil, P., Holmes, R., Klocker, A., Marsland, S., Morrison, A., Munroe, J., Nikurashin, M., Oke, P. R., Pilo, G. S., Richet, O., Savita, A., Spence, P., Stewart, K. D., Ward, M., Wu, F., Zhang, X., Mackallah, C., and Druken, K.: CSIRO-COSIMA ACCESS-OM2 model output prepared for CMIP6 OMIP omip2, WCRP [data set],, 2021. 

Hecht, M. and Hasumi, H.: Ocean Modelling in the Eddying regime, Wiley, Washington D.C., USA, ISBN 978-0-87590-442-9, 2008. 

Helber, R. W., Kara, A. B., Richman, J. G., Carnes, M. R., Barron, C. N., Hurlburt, H. E., and Boyer, T.: Temperature versus salinity gradients below the ocean mixed layer, J. Geophys. Res.-Oceans, 117, C05006,, 2012. 

Heuzé, C.: Antarctic Bottom Water and North Atlantic Deep Water in CMIP6 models, Ocean Sci., 17, 59–90,, 2021. 

Holte, J. and Talley, L.: A New Algorithm for Finding Mixed Layer Depths with Applications to Argo Data and Subantarctic Mode Water Formation, J. Atmos. Ocean. Tech., 26, 1920–1939,, 2009. 

Holte, J., Talley, L. D., Gilson, J., and Roemmich, D.: An Argo mixed layer climatology and database, Geophys. Res. Lett., 44, 5618–5626,, 2017. 

Huang, C. J., Qiao, F., and Dai, D.: Evaluating CMIP5 simulations of mixed layer depth during summer, J. Geophys. Res.-Oceans, 119, 2568–2582,, 2014. 

Huang, P.-Q., Lu, Y.-Z., and Zhou, S.-Q.: An Objective Method for Determining Ocean Mixed Layer Depth with Applications to WOCE Data, J. Atmos. Ocean. Tech., 35, 441–458,, 2018. 

Hunke, E. and Lipscomb, W. H.: CICE: The Los Alamos Sea Ice Model Documentation and Software User's Manual Version 4.1, Los Alamos National Laboratory, Los Alamos, 2008. 

Hunke, E., Lipscomb, W. H., Turner, A. K., Jeffery, N., and and Elliott, S.: CICE: The Los Alamos Sea Ice Model Documentation and Software User's Manual Version 5.1, Los Alamos National Laboratory, Los Alamos, NM, USA, 2015. 

Iovino, D., Masina, S., Storto, A., Cipollone, A., and Stepanov, V. N.: A 1/16 eddying simulation of the global NEMO sea-ice–ocean system, Geosci. Model Dev., 9, 2665–2684,, 2016. 

Jochum, M.: Impact of latitudinal variations in vertical diffusivity on climate simulations, J. Geophys. Res.-Oceans, 114, C01010,, 2009. 

Johnson, G. C. and Lyman, J. M.: GOSML: A Global Ocean Surface Mixed Layer Statistical Monthly Climatology: Means, Percentiles, Skewness, and Kurtosis, J. Geophys. Res.-Oceans, 127, e2021JC018219,, 2022. 

Johnston, T. M. S. and Rudnick, D. L.: Observations of the Transition Layer, J. Phys. Oceanogr., 39, 780–797,, 2009. 

Kiss, A. E., Hogg, A. McC., Hannah, N., Boeira Dias, F., Brassington, G. B., Chamberlain, M. A., Chapman, C., Dobrohotoff, P., Domingues, C. M., Duran, E. R., England, M. H., Fiedler, R., Griffies, S. M., Heerdegen, A., Heil, P., Holmes, R. M., Klocker, A., Marsland, S. J., Morrison, A. K., Munroe, J., Nikurashin, M., Oke, P. R., Pilo, G. S., Richet, O., Savita, A., Spence, P., Stewart, K. D., Ward, M. L., Wu, F., and Zhang, X.: ACCESS-OM2 v1.0: a global ocean–sea ice model at three resolutions, Geosci. Model Dev., 13, 401–442,, 2020. 

Kiss, A. E., Hogg, A. McC., Hannah, N., Boeira Dias, F., Brassington, G. B., Chamberlain, M. A., Chapman, C., Dobrohotoff, P., Domingues, C. M., Duran, E. R., England, M. H., Fiedler, R., Griffies, S. M., Heerdegen, A., Heil, P., Holmes, R. M., Klocker, A., Marsland, S. J., Morrison, A. K., Munroe, J., Nikurashin, M., Oke, P. R., Pilo, G. S., Richet, O., Savita, A., Spence, P., Stewart, K. D., Ward, M. L., Wu, F., and Zhang, X.: ACCESS-OM2 0.1 degree global model output (interannual forcing simulation), NCI Data Catalogue [data set],, 2022. 

Large, W. G. and Caron, J. M.: Diurnal cycling of sea surface temperature, salinity, and current in the CESM coupled climate model, J. Geophys. Res.-Oceans, 120, 3711–3729,, 2015. 

Large, W. G. and Yeager, S. G.: The global climatology of an interannually varying air-sea flux data set, Clim. Dynam., 33, 341–364,, 2009. 

Large, W. G., McWilliams, J. C., and Doney, S. C.: Oceanic vertical mixing: A review and a model with a nonlocal boundary layer parameterization, Rev. Geophys., 32, 363–403,, 1994. 

Large, W. G., Danabasoglu, G., Doney, S. C., and McWilliams, J. C.: Sensitivity to Surface Forcing and Boundary Layer Mixing in a Global Ocean Model: Annual-Mean Climatology, J. Phys. Oceanogr., 27, 2418–2447,<2418:STSFAB>2.0.CO;2, 1997. 

Lee, H.-C., Rosati, A., and Spelman, M. J.: Barotropic tidal mixing effects in a coupled climate model: Oceanic conditions in the Northern Atlantic, Ocean Model., 11, 464–477,, 2006. 

Lherminier, P., Gascard, J.-C., and Quadfasel, D.: The Greenland Sea in Water 1993 and 1994: preconditioning for deep convection, Deep Sea Res. Part II Top. Stud. Oceanogr., 46, 1199–1235,, 1999. 

Li, L., Yu, Y., Tang, Y., Lin, P., Xie, J., Song, M., Dong, L., Zhou, T., Liu, L., Wang, L., Pu, Y., Chen, X., Chen, L., Xie, Z., Liu, H., Zhang, L., Huang, X., Feng, T., Zheng, W., Xia, K., Liu, H., Liu, J., Wang, Y., Wang, L., Jia, B., Xie, F., Wang, B., Zhao, S., Yu, Z., Zhao, B., and Wei, J.: The Flexible Global Ocean-Atmosphere-Land System Model Grid-Point Version 3 (FGOALS-g3): Description and Evaluation, J. Adv. Model. Earth Sy., 12, e2019MS002012,, 2020. 

Li, Q. and Fox-Kemper, B.: Anisotropy of Langmuir turbulence and the Langmuir-enhanced mixed layer entrainment, Phys Rev Fluids, 5, 013803,, 2020. 

Li, Q., Webb, A., Fox-Kemper, B., Craig, A., Danabasoglu, G., Large, W. G., and Vertenstein, M.: Langmuir mixing effects on global climate: WAVEWATCH III in CESM, Waves Coast. Reg. Glob. Process., 103, 145–160,, 2016. 

Li, Q., Fox-Kemper, B., Breivik, Ø., and Webb, A.: Statistical models of global Langmuir mixing, Ocean Model., 113, 95–114,, 2017. 

Li, Q., Reichl, B. G., Fox-Kemper, B., Adcroft, A. J., Belcher, S. E., Danabasoglu, G., Grant, A. L. M., Griffies, S. M., Hallberg, R., Hara, T., Harcourt, R. R., Kukulka, T., Large, W. G., McWilliams, J. C., Pearson, B., Sullivan, P. P., Van Roekel, L., Wang, P., and Zheng, Z.: Comparing Ocean Surface Boundary Vertical Mixing Schemes Including Langmuir Turbulence, J. Adv. Model. Earth Sy., 11, 3545–3592,, 2019. 

Lin, P.: CAS FGOALS-f3-L model output prepared for CMIP6 OMIP omip1, WCRP [data set],, 2019. 

Lin, P.: CAS FGOALS-f3-H model output prepared for CMIP6 OMIP omip2, WCRP [data set],, 2020. 

Lin, P., Yu, Z., Liu, H., Yu, Y., Li, Y., Jiang, J., Xue, W., Chen, K., Yang, Q., Zhao, B., Wei, J., Ding, M., Sun, Z., Wang, Y., Meng, Y., Zheng, W., and Ma, J.: LICOM Model Datasets for the CMIP6 Ocean Model Intercomparison Project, Adv. Atmos. Sci., 37, 239–249,, 2020. 

Llort, J., Lévy, M., Sallée, J. B., and Tagliabue, A.: Nonmonotonic Response of Primary Production and Export to Changes in Mixed-Layer Depth in the Southern Ocean, Geophys. Res. Lett., 46, 3368–3377,, 2019. 

Lorbacher, K., Dommenget, D., Niiler, P. P., and Köhl, A.: Ocean mixed layer depth: A subsurface proxy of ocean-atmosphere variability, J. Geophys. Res., 111, C07010,, 2006. 

MacKinnon, J. A., Nash, J. D., Alford, M. H., Lucas, A. J., Mickett, J. B., Shroyer, E. L., Waterhouse, A. F., Tandon, A., Sengupta, D., Mahadevan, A., Ravichandran, M., Pinkel, R., Rudnick, D. L., Whalen, C. B., Alberty, M. S., Lekha, J. S., Fine, E. C., Chaudhuri, D., and Wagner, G. L.: A Tale of Two Spicy Seas, Oceanography, 29, 50–61,, 2016. 

Madec, G. and the NEMO team: NEMO reference manual 3_6_STABLE, Institut Pierre-Simon Laplace (IPSL), Paris, France, France, 2016. 

Mensa, J. A., Garraffo, Z., Griffa, A., Özgökmen, T. M., Haza, A., and Veneziani, M.: Seasonality of the submesoscale dynamics in the Gulf Stream region, Ocean Dyn., 63, 923–941,, 2013. 

Mignot, J., de Boyer Montégut, C., Lazar, A., and Cravatte, S.: Control of salinity on the mixed layer depth in the world ocean: 2. Tropical areas, J. Geophys. Res.-Oceans, 112, C06011,, 2007. 

Pan, R., Shu, Q., Wang, Q., Wang, S., Song, Z., He, Y., and Qiao, F.: Future Arctic climate change in CMIP6 strikingly intensified by NEMO-family climate models, Geophys. Res. Lett., 50, e2022GL102077,, 2023. 

Pellichero, V., Sallée, J.-B., Schmidtko, S., Roquet, F., and Charrassin, J.-B.: The ocean mixed layer under Southern Ocean sea-ice: Seasonal cycle and forcing, J. Geophys. Res.-Oceans, 122, 1608–1633,, 2017. 

Peralta-Ferriz, C. and Woodgate, R. A.: Seasonal and interannual variability of pan-Arctic surface mixed layer properties from 1979 to 2012 from hydrographic data, and the dominance of stratification for multiyear mixed layer depth shoaling, Prog. Oceanogr., 134, 19–53,, 2015. 

Reichl, B. G. and Hallberg, R.: A simplified energetics based planetary boundary layer (ePBL) approach for ocean climate simulations, Ocean Model., 132, 112–129,, 2018. 

Reichl, B. G., Adcroft, A., Griffies, S. M., and Hallberg, R.: A Potential Energy Analysis of Ocean Surface Mixed Layers, J. Geophys. Res.-Oceans, 127, e2021JC018140,, 2022. 

Roberts, M. J., Vidale, P. L., Senior, C., Hewitt, H. T., Bates, C., Berthou, S., Chang, P., Christensen, H. M., Danilov, S., Demory, M.-E., Griffies, S. M., Haarsma, R., Jung, T., Martin, G., Minobe, S., Ringler, T., Satoh, M., Schiemann, R., Scoccimarro, E., Stephens, G., and Wehner, M. F.: The benefits of global high-resolution for climate simulation: process-understanding and the enabling of stakeholder decisions at the regional scale, B. Am. Meteorol. Soc., 99, 2341–2359,, 2018. 

Roberts, M. J., Jackson, L. C., Roberts, C. D., Meccia, V., Docquier, D., Koenigk, T., Ortega, P., Moreno-Chamarro, E., Bellucci, A., Coward, A., Drijfhout, S., Exarchou, E., Gutjahr, O., Hewitt, H., Iovino, D., Lohmann, K., Putrasahan, D., Schiemann, R., Seddon, J., Terray, L., Xu, X., Zhang, Q., Chang, P., Yeager, S. G., Castruccio, F. S., Zhang, S., and Wu, L.: Sensitivity of the Atlantic Meridional Overturning Circulation to Model Resolution in CMIP6 HighResMIP Simulations and Implications for Future Changes, J. Adv. Model. Earth Sy., 12, e2019MS002014,, 2020. 

Rudzin, J. E., Shay, L. K., and Johns, W. E.: The Influence of the Barrier Layer on SST Response during Tropical Cyclone Wind Forcing Using Idealized Experiments, J. Phys. Oceanogr., 48, 1471–1478,, 2018. 

Sallée, J.-B., Shuckburgh, E., Bruneau, N., Meijers, A. J. S., Bracegirdle, T. J., and Wang, Z.: Assessment of Southern Ocean mixed-layer depths in CMIP5 models: Historical bias and forcing response, J. Geophys. Res.-Oceans, 118, 1845–1862,, 2013. 

Sallée, J.-B., Pellichero, V., Akhoudas, C., Pauthenet, E., Vignes, L., Schmidtko, S., Garabato, A. N., Sutherland, P., and Kuusela, M.: Summertime increases in upper-ocean stratification and mixed-layer depth, Nature, 591, 592–598,, 2021. 

Schulze, L. M., Pickart, R. S., and Moore, G. W. K.: Atmospheric forcing during active convection in the Labrador Sea and its impact on mixed-layer depth, J. Geophys. Res.-Oceans, 121, 6978–6992,, 2016. 

Sein, D. V., Koldunov, N. V., Danilov, S., Wang, Q., Sidorenko, D., Fast, I., Rackow, T., Cabos, W., and Jung, T.: Ocean Modeling on a Mesh With Resolution Following the Local Rossby Radius, J. Adv. Model. Earth Sy., 9, 2601–2614,, 2017. 

Semmler, T., Jungclaus, J., Danek, C., Goessling, H. F., Koldunov, N. V., Rackow, T., and Sidorenko, D.: Ocean Model Formulation Influences Transient Climate Response, J. Geophys. Res.-Oceans, 126, e2021JC017633,, 2021. 

Serazin, G., Treguier, A. M., and De Boyer Montégut, C.: A seasonal climatology of the upper ocean pycnocline, Front. Mar. Sci., in press, 2023. 

Shroyer, E. L., Gordon, A. L., Jaeger, G. S., Freilich, M., Waterhouse, A. F., Farrar, J. T., Sarma, V. V. S. S., Venkatesan, R., Weller, R. A., Moum, J. N., and Mahadevan, A.: Upper layer thermohaline structure of the Bay of Bengal during the 2013 northeast monsoon, Deep-Sea Res. Pt. II, 172, 104630,, 2020. 

Simmons, H. L., Jayne, S. R., Laurent, L. C. St., and Weaver, A. J.: Tidally driven mixing in a numerical model of the ocean general circulation, Ocean Model., 6, 245–263,, 2004. 

Small, R. J., DuVivier, A. K., Whitt, D. B., Long, M. C., Grooms, I., and Large, W. G.: On the control of subantarctic stratification by the ocean circulation, Clim. Dynam., 56, 299–327,, 2021. 

Smith, R., Jones, P. W., Briegleb, P. A., Bryan, O., Danabasoglu, G., Dennis, M. L., Dukowicz, J. K., Eden, C., Fox-Kemper, B., Gent, R. van, Hecht, M., Jayne, S. R., Jochum, M., Large, G., Lindsay, K., Maltrud, M. E., Norton, J., Peacock, L., Vertenstein, M., and Yeager, S. G.: The Parallel Ocean Program (POP) reference manual: Ocean component of the Community Climate System Model (CCSM), UCAR/NCAR, Boulder, USA, 2010. 

Solodoch, A., Stewart, A. L., Hogg, A. McC., Morrison, A. K., Kiss, A. E., Thompson, A. F., Purkey, S. G., and Cimoli, L.: How Does Antarctic Bottom Water Cross the Southern Ocean?, Geophys. Res. Lett., 49, e2021GL097211,, 2022. 

Stewart, K. D., Hogg, A. McC., Griffies, S. M., Heerdegen, A. P., Ward, M. L., Spence, P., and England, M. H.: Vertical resolution of baroclinic modes in global ocean models, Ocean Model., 113, 50–65,, 2017. 

Toyoda, T., Fujii, Y., Kuragano, T., Kamachi, M., Ishikawa, Y., Masuda, S., Sato, K., Awaji, T., Hernandez, F., Ferry, N., Guinehut, S., Martin, M. J., Peterson, K. A., Good, S. A., Valdivieso, M., Haines, K., Storto, A., Masina, S., Köhl, A., Zuo, H., Balmaseda, M., Yin, Y., Shi, L., Alves, O., Smith, G., Chang, Y.-S., Vernieres, G., Wang, X., Forget, G., Heimbach, P., Wang, O., Fukumori, I., and Lee, T.: Intercomparison and validation of the mixed layer depth fields of global ocean syntheses, Clim. Dynam., 49, 753–773,, 2017. 

Treguier, A. M., Chassignet, E., Hogg, A. McC., Kiss, A. E., Li, Y., Lin, P., Liu, H., Iovino, D., Sidorenko, D., Wang, Q., and Yeager, S.: The Mixed Layer Depth in the Ocean Model Intercomparison Project (OMIP): Impact of Resolving Mesoscale Eddies: supporting data (1.0), Zenodo [data set],, 2023.  

Tsujino, H., Urakawa, S., Nakano, H., Small, R. J., Kim, W. M., Yeager, S. G., Danabasoglu, G., Suzuki, T., Bamber, J. L., Bentsen, M., Böning, C. W., Bozec, A., Chassignet, E. P., Curchitser, E., Boeira Dias, F., Durack, P. J., Griffies, S. M., Harada, Y., Ilicak, M., Josey, S. A., Kobayashi, C., Kobayashi, S., Komuro, Y., Large, W. G., Le Sommer, J., Marsland, S. J., Masina, S., Scheinert, M., Tomita, H., Valdivieso, M., and Yamazaki, D.: JRA-55 based surface dataset for driving ocean–sea-ice models (JRA55-do), Ocean Model., 130, 79–139,, 2018. 

Tsujino, H., Urakawa, L. S., Griffies, S. M., Danabasoglu, G., Adcroft, A. J., Amaral, A. E., Arsouze, T., Bentsen, M., Bernardello, R., Böning, C. W., Bozec, A., Chassignet, E. P., Danilov, S., Dussin, R., Exarchou, E., Fogli, P. G., Fox-Kemper, B., Guo, C., Ilicak, M., Iovino, D., Kim, W. M., Koldunov, N., Lapin, V., Li, Y., Lin, P., Lindsay, K., Liu, H., Long, M. C., Komuro, Y., Marsland, S. J., Masina, S., Nummelin, A., Rieck, J. K., Ruprich-Robert, Y., Scheinert, M., Sicardi, V., Sidorenko, D., Suzuki, T., Tatebe, H., Wang, Q., Yeager, S. G., and Yu, Z.: Evaluation of global ocean–sea-ice model simulations based on the experimental protocols of the Ocean Model Intercomparison Project phase 2 (OMIP-2), Geosci. Model Dev., 13, 3643–3708,, 2020. 

Uchida, T., Balwada, D., P. Abernathey, R., A. McKinley, G., K. Smith, S., and Lévy, M.: Vertical eddy iron fluxes support primary production in the open Southern Ocean, Nat. Commun., 11, 1125,, 2020. 

Wang, Q., Danilov, S., Sidorenko, D., Timmermann, R., Wekerle, C., Wang, X., Jung, T., and Schröter, J.: The Finite Element Sea Ice-Ocean Model (FESOM) v.1.4: formulation of an ocean general circulation model, Geosci. Model Dev., 7, 663–693,, 2014. 

Whitt, D. B., Nicholson, S. A., and Carranza, M. M.: Global Impacts of Subseasonal (< 60 Day) Wind Variability on Ocean Surface Stress, Buoyancy Flux, and Mixed Layer Depth, J. Geophys. Res.-Oceans, 124, 8798–8831,, 2019. 

Short summary
The ocean mixed layer is the interface between the ocean interior and the atmosphere and plays a key role in climate variability. We evaluate the performance of the new generation of ocean models for climate studies, designed to resolve ocean eddies, which are the largest source of ocean variability and modulate the mixed-layer properties. We find that the mixed-layer depth is better represented in eddy-rich models but, unfortunately, not uniformly across the globe and not in all models.