ECCO version 4: an integrated framework for non-linear inverse modeling and global ocean state estimation

Abstract. This paper presents the ECCO v4 non-linear inverse modeling framework and its baseline solution for the evolving ocean state over the period 1992–2011. Both components are publicly available and subjected to regular, automated regression tests. The modeling framework includes sets of global conformal grids, a global model setup, implementations of data constraints and control parameters, an interface to algorithmic differentiation, as well as a grid-independent, fully capable Matlab toolbox. The baseline ECCO v4 solution is a dynamically consistent ocean state estimate without unidentified sources of heat and buoyancy, which any interested user will be able to reproduce accurately. The solution is an acceptable fit to most data and has been found to be physically plausible in many respects, as documented here and in related publications. Users are being provided with capabilities to assess model–data misfits for themselves. The synergy between modeling and data synthesis is asserted through the joint presentation of the modeling framework and the state estimate. In particular, the inverse estimate of parameterized physics was instrumental in improving the fit to the observed hydrography, and becomes an integral part of the ocean model setup available for general use. More generally, a first assessment of the relative importance of external, parametric and structural model errors is presented. Parametric and external model uncertainties appear to be of comparable importance and dominate over structural model uncertainty. The results generally underline the importance of including turbulent transport parameters in the inverse problem.

General circulation models implement the primitive equations, which extend far beyond the physics and numerics used in common inverse box models. On the one hand, they readily provide a versatile tool for dynamical interpolation of virtually all types of observations. On the other hand, numerical modeling has to be regarded as an integral 10 part of non-linear inverse modeling, and as a primary responsibility of groups carrying ocean state estimation. Indeed, the quality of the model and the adequacy of its settings determine the physical consistency of ocean state estimates. Hence the state estimation group at MIT has become a main contributor of MITgcm code including, but not limited to the estimation framework implementation. Furthermore, the develop- 15 ment of the new ECCO version 4 (ECCO v4) estimate described here started with an extensive revisit of MITgcm settings.
These considerations prompt the joint depiction of forward model setup and estimation framework developments as part of ECCO v4, and of the baseline solution of the non-linear inverse model. The overarching goal, which is essential to the oceano-20 graphic community, is the unification of the two pillars of science, namely observations (emphasis here is on data of global coverage) and theory (of which general circulation models are a vehicle). Thus, the synergy between data analysis and modeling is a guiding thread of this paper.
As a complement to this paper, and a number of associated publications, the setup Introduction These considerations led to the design of the Lat-Lon-Cap grid (LLC; right panel in Fig. 1) such that 1. the grid reverts to a simple LL sector between 70 • S and 57 • N; 2. grid vertices are located over land; 3. grid heterogeneities remain acceptable at 1 48 • resolution. 5 At mid-latitudes, within the LL sector, the LLC grid is locally isotropic with grid spacing varying in cos(ϕ), where ϕ denotes latitude. At low latitudes LLC is refined in the meridional direction to better resolve the tropical system of zonal currents 1 . Grid scaling properties are shown in more detail in Figs. 2 and 3. The LL sector mesh derives from a simple analytic formulation based of geographic latitude and longitude. Users who 10 may not be particularly invested in high latitude research may skip over the rest of this section, and simply extract the LL sector (70 • S and 57 • N) out of global LLC fields (see Sect. "Code availability"). Poleward of 57 • N, LLC is topologically equivalent to CS minus one cube face (Fig. 1). The vertices of the Arctic cap are placed at a latitude of 67 • N and in a specific orien- 15 tation such that they all fall over land (Fig. 2,middle panel). For any given Arctic face dimension, LLC has the added advantage of an increased resolution in the Arctic as compared with CS, which has vertices at 45 • N. The LLC grid is specifically designed for ocean simulations, whereas the CS grid is more suitable for atmospheric simulations. Between the LL sector and the Arctic cap, the grid makes a gradual, conformal 20 transition that is evident in Fig. 3 between 57 and 67 • N. To the South of 70 • S, LLC is topologically equivalent to a pillow case that would have two vertices in each hemisphere (Fig. 2, right panel). The vertices are again placed over land at a latitude of 80 • S and away from the Ross and Weddell ice shelves. Details of the grid generation method are reported in Appendix A.

48
• global grid has 17 280 longitudinal grid cells. It is labelled "LLC4320" since the common face dimension is 4320. This grid size was chosen to maximize the • , 1 • , 2 • or 4 • . This is a desirable property for a long-term project such as ECCO, in which spatio-temporal resolution is expected to increase in the fu- 15 ture as computing capability and observational data base will keep increasing. A high degree of factorization also provides a convenient basis for down-scaled regional computations that employ boundary conditions from the state estimate (Sect. 5). Advanced gridding has clear advantages from the standpoint of numerical ocean modeling. It can however put additional burdens on users of ocean model output who Introduction Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | that mimics the gridded earth decomposition of general circulation models in Matlab (Appendix C).

Model configuration
The model configuration presented below is the ECCO v4 setup used in state estimation (Sects. 4 and 5) and based on the LLC90 grid (Sect. 2). Variants of the ECCO 5 v4 setup are also used in un-optimized model simulations (Danabasoglu et al., 2014;Marshall et al., 2014). The setup uses fully supported options of MITgcm software, is archived and regularly benchmarked (Appendix F), and is freely available along with the MITgcm itself (Sect. "Code availability"). The MITgcm, as configured in ECCO v4, solves the hydrostatic, Boussinesq equa- 10 tions (Marshall et al., 1997) using the z * rescaled height vertical coordinate (Adcroft and Campin, 2004) and the vector-invariant form of the momentum equation (Adcroft et al., 2004a). This latter choice yields a discretized momentum equation without metric terms, which simplifies the handling of elaborate grids such as LLC90 (Sect. 2). This section summarizes the model equations, settings, and new MITgcm features (MITgcm 15 Group, 2002;Adcroft et al., 2004b) used in ECCO v4. The novelty here largely resides in additions of forward model features to the body of adjointed codes (Sect. 4) and in their use in the state estimate (Sect. 5). Table 1 provides a list of basic model settings.
The relative importance of various model settings generally depends on the ocean state characteristic of interest. Here, a selection of ocean state characteristics is made 20 amongst model-data differences (see Sect. 4), monthly time series of global mean quantities, and meridional transports (Table 2). These characteristics are used to gauge perturbations of 20 year solutions to various model settings (Table 3). They are also used to benchmark state estimate re-runs and model revisions (first 3 rows of Table 3; Appendix F) and to gauge the sensitivity of the state estimate to adjusted control pa-Introduction

Basic equations
For a water column that extends from the bottom at z = −H to the free surface at z = η, the z * vertical coordinate is defined as z = η+s * z * with the scaling factor s * = 1+η/H. In this section, the notation ∇ z * indicates the nabla operator at constant z * , i.e., in a plane of constant z * value. The z * coordinate set of equations was introduced by Adcroft and 5 Campin (2004) (their Eqs. 9-11 and 13). Written in vector-invariant formulation they read: ∂v ∂t where v is the horizontal velocity, w * = w/s * is the vertical velocity in z * coordinates 4 , k is the vertical unity vector, f and ζ = ∇ × v are the planetary and relative vorticity vertical component, KE is the horizontal kinetic energy, g is gravity, ρ is the density 15 anomaly relative to the constant Boussinesq density ρ c (ρ = ρ c +ρ ), Φ is the pressure anomaly scaled by constant density (Φ = p /ρ c ), θ and S are the potential temperature and salinity, D z * , D ⊥ , D σ are subgrid-scale (SGS) processes parameterized as mixing horizontally, vertically or along iso-neutral surfaces, and F v , F , F θ , F S are the forc-Introduction Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | options improve the model stability, allowing for a longer time step. Thus, the time step restriction due to the Coriolis term in the Arctic is alleviated by the use of AB-3 (∆t = 1 h was unstable with AB-2 and AB ∼ 0.1). Also, the chosen combination of staggered time-stepping and tracer advection schemes increases the stability limit related to internal-wave speed. With these choices a time step of ∆t = 1 h is used (Table 1). 5

Volume and tracer conservation
ECCO v4 uses a non-linear free surface combined with real freshwater flux forcing and the z * coordinate. This approach allows to include material exchanges through the free surface in a physically intuitive way (Campin et al., 2008) and to achieve exact tracer conservation, both locally and globally (Campin et al., 2004). To illustrate this point, it 10 is useful to start from the vertical integral of Eqs.
(3)- (5) are concentrated at or near the surface (unless geothermal heating at the bottom is active) and have been replaced by their integral form in Eqs. (7)-(9), namely the net fresh-water input at the surface (PmE, in kg m −2 s −1 ), the net heat flux into the water column (Q net , in W m −2 ) and the salt-flux at the surface (S flux , in g m −2 s −1 ) which is zero in the absence of seaice and salinity relaxation (see Sect. 3.5).
With the non-linear free-surface, the water column thickness varies as the freesurface goes up and down (as apparent in Eqs. 7-9). With the z * coordinate, this variation is distributed vertically over all grid-cells 5 . The fact that η enters the continuity 5 equation (Eq. 7) also through η −H dz renders the free-surface non-linear; furthermore, time dependent grid-cell thickness introduces many more non-linearities which required code modifications to ensure efficient adjoint code generation via AD (Sect. 4.2).
Earlier ECCO configurations relied on the linear free-surface method (LFS), where column thickness and grid-cell thickness are fixed in time. The LFS version of Eqs. (7) 10 and (9) is: The Goldsbrough-Stommel circulation (Stommel, 1984) can be accounted for by setting FW = 1 (virtual fresh-water) or ignored ( FW = 0). However, since grid-cell thick-Introduction The symmetry between continuity (Eq. 7) and tracer (Eqs. 8 and 9) equations allows for strict tracer conservation (Campin et al., 2004) when discretized consistently (Appendix B). In contrast, in the LFS case, this symmetry is lacking (∂η/∂t in Eq. 10 has no counter part in Eq. 11) resulting in artificial tracer loss or gain (unless a global correction is added).

Tracer transports
Ocean tracers are advected by the residual mean velocity v res , w res (Eqs. 4 and 5). The present ECCO v4 uses the 3rd order DST scheme in the horizontal, and the implicit 3rd order upwind scheme in the vertical. Previous ECCO configurations used the explicit 3rd order upwind scheme in all directions. Flux limited advection schemes are also 10 available in forward mode, although they are not used in the state estimate (Sect. 5), since they are not yet in the body of adjointed codes (Sect. 4). Choices of advection schemes are a concern to ocean state estimation, since their structural properties cannot generally be controlled by continuous parameters, and since numerical diffusion and advective overshoots could preclude an adequate fit to observations. Their impor-15 tance can be gauged from Table 3. Thus, activating flux limiters has a sizable influence over 20 years, which is generally smaller than the impact of activating the C-D scheme, but exceeds the impact of activating geothermal heating for example (Table 3; see next section). Global mean times series often show an exceptionally high sensitivity to a variety of model settings, and to surface boundary layer settings in particular (Table 3). 20 Diffusion includes diapycnal and isopycnal components, the GGL mixed layer turbulence closure (Gaspar et al., 1990), and simple convective adjustment. The latter (GGL and convective adjustment) are used instead of the KPP vertical mixing scheme (Large et al., 1994)  Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | maps for these highly uncertain coefficients become an integral part of the ECCO v4 model setup. In Sect. 5, the sensitivity to these parameters adjustments is evaluated, and compared with the results in Table 3. The geography of K gm , K σ and K d , their impact on stratification, and their observability by means of Argo are further assessed in Forget (2015b).

Momentum discretization
Parameters of the momentum Eq.
(1) currently used in ECCO v4 are provided in Table 1. Lateral eddy viscosity is harmonic and dependent on grid spacing, with coefficient given by 0.25 × µ L 2 /∆t, where µ = 2 × 10 −2 (viscAhGrid in Table 1) is a nondimensional scaling number, L 2 is the spatially varying grid spacing squared ( Fig. 2   10 shows L) and ∆t = 3600 s (deltaTmom in Table 1). The resulting viscosity varies from ≈ 10 3 to 1.6 × 10 4 m 2 s −1 , depending on location. The other dissipation contributions used in ECCO v4 are harmonic vertical viscosity and quadratic bottom drag (with parameters in Table 1) plus contributions from GGL. Previous ECCO configurations used the C-D scheme (Adcroft et al., 1999) that inter- 15 polates the Coriolis term from the Arakawa C grid to a D grid and back. This scheme acts to reduce grid scale noise that is otherwise seen in the vertical velocity fields at all time scales, and particularly in the deep ocean (Fig. 4). Large vertical velocities have adverse effects on adjoint model stability, which ECCO originally resolved by means of the C-D scheme. The C-D scheme does however have a large impact on the large 20 scale ocean circulation (Fig. 4) and hydrography (Table 3). A comparable damping of the barotropic circulation could be obtained through a large increase in viscosity (not shown). Also vertical velocity noise is most intense near the ocean floor, which led us to the inference that adding viscosity more selectively near topography could suffice to damp the vertical velocity noise (Fig. 4) and, along with 25 the use of vertical implicit advection, could stabilize the adjoint. Because the impact of this approach on the circulation and hydrography is more muted than that of the C-D scheme, the latter was abandoned in ECCO v4. Targeted  topography remains needed to stabilize adjoint solutions (Sect. 4), but it can be omitted in forward solutions, as done in the state estimate (Sect. 5).

Surface boundary conditions
Upward buoyancy, radiative and mass fluxes (latent, sensible and radiative contributions to F θ ; evaporation as part of F ) through the free surface are computed using 5 the bulk formulae of Large and Yeager (2004), and 6 hourly ERA-Interim re-analysis fields (Dee et al., 2011) for the near surface atmospheric state (temperature, humidity, downward radiation, precipitation). Downward shortwave radiation is allowed to penetrate, with exponential decay, to a depth of 200 m as part of F θ (Eq. 4). A seasonal climatology of runoff, from Fekete et al. (2002), is added as part of F (Eq. 3).

10
Earlier ECCO configurations using the virtual salt flux approach with FW = 0 (Sect. 3.2) could only account for the dynamical impact of precipitation, evaporation and runoff as they affect buoyancy (see Ponte, 2006). Accordingly, they could only include seaice as a levitating layer without any direct effect on η (Campin et al., 2008). In contrast, ECCO v4 uses the real freshwater flux approach (Sect. 3.2) and thus fur- 15 ther accounts for the dynamical effects of material exchanges through the free surface (either with the atmosphere, land or seaice) as shown in Campin et al. (2008).
Open ocean rain, evaporation and runoff simply carry (advect through the free surface) the local SST and zero salinity in the model. When seaice is present, buoyancy and mass fluxes 6 are recomputed based upon the thermodynamic balance of a fully 20 interactive seaice model (Losch et al., 2010). In this model as configured in ECCO v4, seaice carries 0 • C and 4 g kg −1 salinity, while snow carries 0 • C and zero salinity. The implementation of mass, buoyancy and momentum exchanges through the seaice-ocean interface in the rescaled z * coordinate framework is presented in detail in Campin et al. (2008). A further correction was added in ECCO v4 to ensure conser- 25 vation of heat for the combined ocean+seaice+snow system. While the ocean model is configured to exchange freshwater at the local SST, the seaice model operates at constant internal heat, so it cannot freeze and melt at variable temperature. The added correction simply puts the heat differential back into the ocean. Ocean+seaice+snow budgets (as well as separate ocean, seaice, snow budgets) of mass, heat and salt are then closed to machine precision and readily diagnosed (Appendix C).

5
In centennial ocean model simulations, it is customary to add a Newtonian relaxation of surface salinity to an observation-based climatological map (e.g., Danabasoglu et al., 2014) as part of F S (Eq. 5). While this method has no clear physical basis, it generally adds stability to centennial simulations. In contrast, the state estimate (Sect. 5) has no salinity relaxation term, so that F S = 0 only occurs when seaice (which salinity is set to 10 4 g kg −1 ) melts or freezes. Salt rejected by seaice formation is distributed in the vertical using the parameterization of Duffy et al. (1999) and Nguyen et al. (2009) as part of F S . Wind stress, also from ERA-Interim, is applied directly as part of F v (Eq. 1). A common alternative is to compute wind stress through bulk formulae. This approach typi-15 cally requires backing out adequate drag coefficients -so that the results would approximately match surface stresses that, in atmospheric models, follow from a momentum balance rather than bulk formulae -or ad-hoc adjustments of atmospheric variables (see, e.g. Large and Yeager, 2004;Risien and Chelton, 2008). In the context of state estimation, the direct specification of wind stress further allows for a clear distinction 20 between momentum controls on the one hand, and buoyancy and mass controls on the other hand (Sect. 4).

Estimation framework
The state estimation problem is defined here by a distance to observations (J) to be minimized under the constraint of a dynamical model. Section 4.1 formulates the state 25 estimation problem in more detail. Section 4.2 provides an overview of the MITgcm adjoint, which is instrumental in solving the state estimation problem, and its recent devel- an approximate minimum of J. The process of finding such solutions, typically through an iterative optimization process and using the adjoint model, is not a focus of this paper. A number of well known optimization methods, with third party implementations freely available online, can be used to this end (see, e.g. Heimbach et al., 2005, and references therein).

10
Note that the existence of a unique global minimum of J is only rigorously established for linear least squares, when it can be solved for in matrix form to machine precision. In contrast, for non-linear inverse problems 7 one can only aim to find at least one approximate minimum of J that is an acceptable fit to the data (i.e. a fit within specified errors).

Problem formulation
State estimation consists in minimizing a least squares distance, J(u), that is defined as where d i denotes a set of model-data differences, α i the corresponding multiplier, R −1 i the corresponding weights, u j a set of non-dimensional controls, and β j the corresponding multiplier. Additional symbols appearing in Eqs. (13)-(16) are defined below. The implementation of Eqs. (12)-(16) and the adjoint interface within the MITgcm is charted in Fig. 5.

5
Model counterparts (m i ) to observational data (o i ) derive from a set of adjustable model parameters (v) through the model dynamics (M), diagnostic computations (D), and subsampling or averaging in space and time (S), performed as the forward model steps through time (Eq. 14). Model-data misfits are then computed, upon completion of the forward model simulation, in order to evaluate J(u) and provide the adjoint model The control problem, as implemented in ECCO v4, is non-dimensional, as reflected 15 by the omission of weights in control penalties (u T j u j , Eq. 12). Non-dimensional controls are scaled to physical units through multiplication by their respective uncertainty fields, as part of the generic pre-processor Q (Eq. 15; Sect. 4.4). Pre-conditioner R (Eq. 16) does not appear in the estimation problem itself (Eq. 12), as it only serves to push an optimization process preferentially towards certain directions of the control space. 20 The specification of (always approximate) error covariances (e.g. R i ) is a key ingredient of ocean state estimation, and least squares in general. ECCO has contributed a large body of work in this respect (e.g. Forget and Ponte et al., 2007;Ponte, 2008, 2010;Chaudhuri et al., 2013;Forget and Ponte, 2015). Although not a focus in this paper, the difficulty in providing accurate error covariances, and assessing their impact on the state estimate, requires careful analysis of misfit residuals after the fact. This process typically leads to another phase of state estimate production, and so-forth. For problems as massive as ECCO v4 (see Tables 5-7), full error covariance matrices are impractical and will remain so for the foreseeable future. Matrix free approaches are of great practical value in this context. For example, the method of Weaver and Courtier (2001) is used in ECCO v4 to specify control parameter adjustment scales (Sect. 4.4) and penalize large-scale model-data misfits (Forget and Ponte, 2015). 5 Within pure linear least-squares theory, under the unrealistic assumption of perfect error covariance specifications, multipliers α i , β j should be omitted from Eq. (12). They are, however, adequate in practice as a means to partly compensate for approximations in error covariances, and the neglect of R i non-diagonal terms in particular. They also provide a practical means to accelerate the optimization of data sets introduced in 10 J during later stages of optimization. Furthermore, u T j u j (in Eq. 12) essentially are regularization terms included to limit control parameters adjustments, and the β j multipliers provide the corresponding trade-off parameters (Hansen, 1992).

Adjoint modeling
The method of Lagrange multipliers (i.e. the adjoint method) and its application to nu-15 merical models being stepped forward in time is well documented elsewhere. In particular, the interested reader is referred to Thacker and Long (1988) for a succinct presentation, with application to the case of a simple wave equation. The fitting of model sea level variability to altimetry through forcing adjustments estimated by the adjoint method (see, e.g. Forget and Ponte, 2015) is analogous to the simple case treated in 20 Thacker and Long (1988). A crucial advantage of this method, as used in ECCO, is that it avoids adding source/sink terms of unknown nature to the model equations 8 . Adjoint models have many useful applications in their own right, and we shall list a few that are particularly relevant to ECCO. Introduction Integrating adjoint models over extended periods of time allows diagnosis of the sensitivity of model dynamics to various parameters. Two examples are provided in Fig. 6 pertaining to the tracer (left panels) and momentum (right panels) equations that were computed using the "autodiff" (Fig. 5; this section), "profiles", "ctrl" and "smooth" (Fig. 5; subsequent sections) MITgcm packages. Figure 6 illustrates that the sensitiv-5 ity of model-data misfits (here they cover 2008-2010) extend far back in time (here to 1992). The ability to use information contained in observations backward in time is a powerful advantage of the adjoint method over sequential/filtering assimilation methods. Such adjoint sensitivities provide a practical means to reduce spurious model drifts and biases, through inversion of uncertain model parameters (see, e.g. Ferreira et al., 10 2005). In cases that are sufficiently linear, adjoint sensitivities to e.g. wind stress can further be convolved with forcing anomalies to reconstruct and attribute variability in the ocean circulation (see, e.g. Fukumori et al., 2015).
Unlike the simple case treated in Thacker and Long (1988), hand-coding the adjoint of the MITgcm would be a very tedious and daunting task. Algorithmic differen- 15 tiation, through a source-to-source code transformation tool, is a powerful alternative (see Griewank and Walther, 2008). Computational aspects of algorithmic differentiation applied to the MITgcm are described in Heimbach et al. (2005). Since its origin, ECCO has relied on TAMC (Tangent Linear and Adjoint Model Compiler Giering and Kaminski, 1998) and its commercial successor TAF (Transformation of Algorithms in 20 Fortran; Giering et al., 2005). Open source tools such as OpenAD (Utke et al., 2008) and Tapenade (Hascoët and Pascual, 2013) are on their way to provide alternatives for massive problems such as ECCO (Heimbach et al., 2011). The balancing of storage vs. recomputation via the checkpointing method is essential to computational efficiency (Griewank, 1992;Heimbach et al., 2005). This is particularly true for ECCO v4 since Introduction CS and LLC (Fig. 1). More generally, development of efficient adjoint code using TAF largely consists in accommodating non-linearities of added forward model features.
Overwhelmingly expensive recomputations of non-linear terms in the adjoint are treated by adding TAF storage directives 9 . These directives take the form of fortran comments (starting with "CADJ") embedded in the forward model code, which TAF 5 transforms into sure code for storage operations (for details, see Heimbach et al., 2005). The ECCO v4 set-up involves 1458 such comments, which were all inserted manually in carefully chosen locations. Once all of the needed storage directives are in place, then "algorithmic differentiation" becomes the "automatic differentiation" that an ECCO v4 user holding a TAF license will experience.

10
The non-linear free surface, the Adams-Bashforth 3 time stepping scheme, and implicit vertical advection were thus added as adjoint capabilities as part of ECCO v4. Including the non-linear free surface, along with the real freshwater flux boundary condition, in the ocean state estimate is regarded as a major improvement in physical realism. The Adams-Bashforth 3 and implicit vertical advection schemes have a mi- 15 nor impact on the forward model solution but provide additional stability also in adjoint mode.
Exactness and completeness of the adjoint is the general goal of the MITgcm adjoint development. Exactness can be of particular importance to carry quantitative analyses of adjoint sensitivities (e.g. Verdy et al., 2014;Fukumori et al., 2015). For state 20 estimation purposes, however, it is often advantageous, or simply convenient, to use an approximated adjoint (see, e.g. Jiang et al., 2002). The most basic approximation consists in switching off forward model features in the adjoint, which allows postponing the development of a stable adjoint.
In ECCO v4, the Gaspar et al. (1990), Nguyen et al. (2009)  . It is only the parametric dependency of these diffusivities and viscosities on the ocean state that is omitted. Until 2008 applications of the MITgcm adjoint were also omitting the Redi (1982) and Gent and Mcwilliams (1990) components, which precluded optimal control of their parameters. This situation was resolved by using a simple clipping scheme for large isopycnal slopes, and by omitting only the parametric 5 dependency of isopycnal slopes on the ocean density field in the adjoint, following a reasoning similar to that of Jiang et al. (2002). Thus, the parametric dependency of turbulent transports on K gm , K σ and K d is retained in the adjoint, so that they can be optimally controlled.
Beyond the removal of unstable adjoint dependencies, other alterations of the adjoint 10 are of practical value for optimization purposes. In particular, it is common practice to increase viscosity parameters to add stability to MITgcm adjoint simulations (Hoteit et al., 2005). Despite successful adjoint simulations with particular versions of the sea ice model (Heimbach et al., 2010;Fenty and Heimbach, 2013), the seaice adjoint is omitted in ECCO v4 due to persisting issues. A pseudo-seaice adjoint is introduced in-15 stead to account at least for the most basic effect of seaice -the shielding of sea water from the atmosphere. The adjoint pseudo-component is obtained by AD of a forward pseudo-component. The forward pseudo-component merely tapers air-sea fluxes to zero according to (1 − a) where a is the seaice fraction computed by the actual forward seaice model. This gross, local approximation omits the thermodynamics and dynam-20 ics of seaice, and is never used in forward mode. In the adjoint, it masks out open ocean adjoint sensitivities that do not apply where ice cover is present. A fraction of open ocean sensitivity is preserved at the ice edge, which is physically reasonable and avoids a discontinuity in adjoint fields. The pseudo-seaice adjoint approach has been extended in the context of Arctic ice-ocean state estimation (A. Nguyen, personal com-

Observational constraints
Ocean state estimation involves imposing data constraints upon ocean models. Modeldata comparison (i.e. computing Eq. 12) becomes an integral part of numerical modeling. In forward mode, "ecco" and "profiles" are diagnostic packages that can be used in any MITgcm run to perform model-data comparisons and to compute Eqs. (12)-(14).

5
In adjoint mode, they take the role of providing the adjoint model forcing (see Fig. 6).
In situ data constraints are handled by the "profiles" package. Model-data comparison is computed at the time-step and grid point nearest to each observed profile (see Appendix D). Aside from the primary goal of carrying out state estimation, the "profiles" output permits direct and rigorous assessments of modeled and observed statistics 10 (and how they may differ) based upon a near identical and instantaneous sampling (e.g., see Forget et al., 2011). To this end, it alleviates the need to output global fields at full temporal resolution, which becomes overwhelming at high spatial resolution.
Gridded data constraints 10 are commonly based upon monthly or daily averaged fields and handled by the "ecco" package. Many features have been added to "ecco" 15 over the course of the ECCO v4 development. In preparation for this paper, these features were generalized so they can immediately be applied, when adequate, to any gridded observational constraints. As of MITgcm's checkpoint65h, the generic "ecco" capabilities are those listed in Table 4. In general, observable quantities (m i in Eq. 13) are diagnosed from model state 20 variables (via operator D in Eq. 14). For potential temperature and salinity ("theta" and "salt" in Table 4) the corresponding model state variables (θ and S in Eqs. 4 and 5) are simply time averaged, and D then simply denotes the identity operator. In contrast, sea surface height ("eta" in Table 4) is diagnosed as η + η ips + η nbs where η is the model free surface (see Sect. 3.1), η ips is the weight of sea ice plus snow per unit area divided 25 by ρ c (see Campin et al., 2008), and η nbs is a global steric sea level correction to the Boussinesq model (see Griffies and Greatbatch, 2012). Furthermore, for comparison of sea surface height with altimetry, the time mean of m i −o i computed at each grid point, and the time variable global mean of m i − o i , are further subtracted via post-processor P in Eq. (13) (see Forget and Ponte, 2015). The basic steps in imposing e.g. a gridded data constraint using the "ecco" package 5 are: 1. Mapping observational data (whether along satellite tracks, gridded, or interpolated) to the model grid, which is easily done e.g. in Matlab using gcmfaces (Appendix C).
2. Specifying the error covariances (R i in Eq. 12) of model-data misfits (d i in Eq. 12).
To accommodate the great ocean heteroscedasticity (e.g. see Forget and Wunsch, 2007), spatially varying uncertainties are generally needed.

Carrying optimization until convergence to an approximate minimum of J(u).
It should be stressed that all three steps are required to claim that an observational constraint has effectively been imposed on a state estimate, and that the specification 15 of errors is the central scientific problem. This is also true for "profiles" although the first step is limited to a vertical interpolation to standard levels in this case. Table 6 provides the list of gridded observational constraints that, along with the in situ data constraints listed in Table 5, have been imposed on the ECCO-Production, release 1 state estimate (Sect. 5).

Control parameters
Within the MITgcm, the "ctrl" package ( Fig. 5) handles adjustable control parameters (u in Eq. 15). In forward mode, "ctrl" is a package that influences the ocean state evolution (Eq. 14). Activating a new control parameter only requires a few lines of codes to map it to corresponding model parameters (Eq. 15). In adjoint mode, "ctrl" takes the diagnostic Introduction A penalty can further be added to J(u) by setting β j > 0 accordingly (Eq. 12), which will act as an adjoint forcing, to constrain the magnitude of control parameter adjustments. Most features in "ctrl" were recently generalized so they can readily be applied, when adequate, to any set of controls. The generic pre-processor Q (Eq. 15) may thus include the Weaver and Courtier (2001) spatial correlation model (Appendix E), the cyclic 5 application of climatological mean controls and/or a rotation of (zonal, meridional) vectors to the model C grid. Control parameters used in the state estimate are reported in Table 7.
Most generally, complete and accurate error covariance estimates are lacking for control parameters. For all controls used in the state estimate (Table 7) the error corre-10 lation scale was simply specified as 3 times the grid scale using the "smooth" package (as part of Q; Appendix E). The estimation of an initial state that pre-dates Argo and of its uncertainty, given the sparsity of the ship-based ocean sampling, is a difficult problem in itself that is proposed for further, dedicated investigation (e.g., see Forget, 2010; Lyman and Johnson, 2014). 15 For atmospheric re-analyses fields, in the absence of formal error estimates, ad-hoc specifications of Q are based upon the spread of available atmospheric variable estimates Chaudhuri et al. (2013). Here the squared sum of time mean and seasonal differences between NCEP and ERA-Interim fields was computed, then capped to a maximum, and used as an ad-hoc estimate of error variances in atmospheric controls. 20 For K gm , K σ and K d , the first guess values were 10 3 , 10 3 and 10 −5 m 2 s −1 , respectively. The corresponding uncertainties were set to 500., 500. and 10 −4 m 2 s −1 . The adjusted parameters were further imposed to stay within 10 2 < K gm < 10 4 , 10 2 < K σ < 10 4 , and 10 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | that regional turbulent transport parameter inversions have an observational basis in Argo data.

State estimate
The ECCO-Production, release 1 state estimate covers the period from 1992 to 2011 and is the baseline solution of the ECCO v4 forward model setup (Sects. 2 and 5 3), including the parameter adjustments derived from observational data constraints (Sect. 4). Its monthly mean output and model-data misfits are publicly available online. Several recent publications (Speer and Forget, 2013;Heimbach, 2013b, 2014;Buckley et al., 2014;Forget and Ponte, 2015;Balmaseda et al., 2015) analyze earlier iterations (see Appendix G). The solution fits altimetry (Forget and Ponte, 2015), 10 SST (Buckley et al., 2014) and subsurface hydrography data (Sect. 5.2) at or close to the specified noise level. Many characteristics of the solution have been analyzed in some detail and found physically plausible, which warranted its public release. The reader is further referred to the extensive documentation (the "standard analysis" provided as Supplement) of model-data misfits and physical characteristics of the state 15 estimate that is also publicly available online.

Select characteristics
The characteristics in Table 2 are a small subset that is representative of the multifaceted nature of ocean state estimation. To shed light on the observational and climate problems, this section assesses the sensitivity of these characteristics, and correla-20 tions amongst characteristics, as measured within Table 3 (model settings) and within Table 8 (adjusted controls). The different levels of sensitivity seen in Table 8 vs. Table 3 is addressed in Sect. 5.3, and pertains to controllability rather than observability. The various observational constraints (the first seven characteristics) show contrasting levels of sensitivity to control parameter adjustments (  (Table 3). This behavior may reflect different levels of random errors amongst the different types of observations. In particular, the subsurface hydrography, as constrained by jT and jS, appears as the most sensitive model-data distance (Tables 3-8). Another noteworthy result is that global mean time series show more spread than do time averaged meridional transports (Tables 3-8 reporting normalized 5 differences). The main exception to this behavior is for the meridional salt transport, whose time average is small in the state estimate (Fig. 7). The correlation (or lack thereof) between columns of Tables 3-8 also yields salient conclusions ( Fig. 8). High correlations between observational constraints (bottom panels) is suggestive of some redundancy between data sets (i.e. consistency amongst observations). High correlations between meridional transports and observational constraints (top and middle right panels) provides evidence that Argo and altimetry may efficiently constrain heat and freshwater transports (see also, e.g. Forget et al., 2008a, b).
In contrast, low correlations between global mean time series and distances to ob- 15 servations is striking (Fig. 8, top and middle left panels). Given that the time variable global mean model-data difference is omitted in computing jHa, the low correlation between mH and jHa indicates that a given global mean sea level time series could be associated with many regional solutions with equal uncertainty. The low correlation between jT and mT may further reflect that regional variations can be much larger than, 20 and not necessarily related to, temporal changes in global mean properties.
Beyond the present study, the extent to which Argo and altimetry, amongst others, constrain temporal changes in global mean properties remains unclear. A related concern is that global mean properties are the most sensitive to discrete choices in model numerics and physics amongst the selected characteristics (see Table 3). It is tempting 25 to attribute this behavior to the omission of atmospheric, continental, etc. hydrology modeling in ECCO v4, although this remains to be proven. Whether and how the large sensitivity of global mean properties seen in Fig. 7 translates into simpler models used to quantify climate change from observations (as in, e.g. Purkey and Johnson, 2010

Improved hydrography fit
In developing and producing the state estimate, a primary goal was to improve the fit to observed in situ profiles (of T and S) as compared with earlier ECCO estimates (see 5 Forget, 2010). As already apparent in Table 3, the inclusion of parameterized physics as controls (i.e. adjustable model parameters) was instrumental in achieving that goal. The present section focuses on this defining characteristic of the baseline ECCO v4 solution.
The fit to in situ profiles of temperature and salinity is depicted in Fig. 9  suspected that jT and jS could be further reduced. Values of 1.5, however, are regarded as sufficiently low to justify analysis of the state estimate water masses (Speer and Forget, 2013) and stratification (Forget, 2015b). Furthermore, jT and jS are already much reduced (by a factor of 2 to 10) compared with earlier ECCO estimates ( Fig. 9) throughout the period from 1992 to 2011. Amongst earlier ECCO estimates, the ECCO 20 v3 solution comes the closest to the observed hydrography with a typical distance to observations of 3 (Fig. 9). The contrasts in jT and jS amongst solutions ( Fig. 9) reflect large scale misfits as illustrated in Fig. 10. This is equally true for ECCO2 eddying solutions (bottom panels) and for coarser model solutions (top and middle panels). Such broad misfit patterns 25 typically denote spurious model drifts and biases, which are common symptoms of model deficiencies (Stammer, 2005;Ferreira et al., 2005). Similarities in misfit patterns amongst ECCO2 eddying solutions (using a common model set-up, under  and 10) tends to be reduced near the sea surface (not shown), which is encouraging but not entirely surprising since surface forcing fields were already control parameters in earlier solutions. Conversely the contrast in misfit amplitude tends to increase with depth (not shown), where internal model error sources may predominate. Within ECCO v4, jT and jS are particularly sensitive to estimated turbulent transport parameter adjustments and generally less sensitive to estimated surface forcing adjustments, with the exception of expectedly high salinity sensitivity to precipitation (see Table 8, first two columns). This result is in contrast with the analysis of Liu et al. (2012) who suggest that parameterized physics are only marginally important in this regard, a suggestion consistent with the relative weakness of their turbulent transport param- (1) argo largely increased the volume of in-situ constraints, (2) slow models drifts are more prominent in longer unconstrained solutions. One should expect larger turbulent transport parameter 20 adjustments on both counts. Amongst turbulent transport control parameters in ECCO v4, jT and jS are most sensitive to the K gm adjustments (this result is in agreement with Liu et al., 2012). A caveat should be noted though: parameterized surface and interior fluxes are all interactive so that any control vector adjustment can potentially affect any surface or interior flux. 25 Hence Table 8 should not be mistaken for a precise ranking of the importance of the various controls. It clearly shows, however, that turbulent flux parameter adjustments were instrumental in fitting observed hydrographic profiles in ECCO v4.

Parametric and structural model error
In this section, the focus is on model uncertainty and controllability, which directly impacts the possibility of fitting a model to observed data. Random data errors and model representation errors are left out of the discussion, which are comparatively well studied (e.g. Forget and Ponte et al., 2007;Ponte, 2008, 2010;5 Chaudhuri et al., 2013;Forget and Ponte, 2015). Errors associated with computing environment changes (top three rows in Table 3) are generally small enough to be neglected when using the MITgcm. The interplay of external, structural and parametric ocean model errors has never been tackled in any systematic and quantitative manner. To distinguish amongst model 10 uncertainties associated with ECCO v4 settings, we propose the simple, practical category definitions in Table 9. Clearly the separation between these three categories leaves room for ambiguities. For example, selecting one of the available atmospheric re-analysis products to force the model may fall under "structural", while tuning bulk formulae coefficients may fall under "parametric" and adjusting re-analyzed fields may fall 15 under "external". Nevertheless, as a starting point, the above definitions provide a useful frame of reference. A related discussion can be found in Marzocchi and Jordan (2014), although the focus here is on curve fitting (i.e. interpolation within a time period) rather than on forecasting (i.e. extrapolation forward in time). Relevant discussion can also be found in Danabasoglu et al. (2014) and Balmaseda et al. (2015). 20 A first assessment of the relative importance of external, parametric and structural model uncertainty in ECCO v4 can then be made from ters generally have a much larger impact (Table 8) than switching amongst numerical schemes (Table 3).
A ratio C of model uncertainty controlled by continuous parameters (external or parametric) to structural model uncertainty is introduced to better illustrate this result (Fig. 11). The adjoint method allows for reduction of parametric and external errors, but it does not lend itself to reduction of structural errors that are fundamentally discontinuous. Hence, C is an index of model controllability, which can be interpreted as a signal to noise ratio of sorts, but for model simulations rather than observations. Large values of C are a priori favorable to state estimation.
It is therefore encouraging that log 10 (C) > 0 for all variables considered (Fig. 11), showing that controlled model uncertainty exceed the noise level set by structural model uncertainty. Certain ocean characteristics are particularly prone to structural model uncertainty, whereas others are highly controllable. On the one hand, modeldata distances for regional sea level variability and in situ hydrography appear most controllable with log 10 (C) > 1.5 (top panels). On the other hand, global mean temper-15 ature and sea surface salinity appear most prone to structural model uncertainty with log 10 (C) < 0.5. The high level of structural uncertainty seen in global mean heat uptake (i.e. mT) is cause for concern in the context of climate change monitoring (see also ity. For example, KPP is a very complex and non-linear parameterization that involves many discrete switches and thousands of code lines. GGL yields broadly similar results to KPP over 20 years (Table 3) and is in contrast a very simple code, so that a practical adjoint may be within reach. It is also noteworthy that activating the C-D scheme generally trumps the impact of switching between mixed layer schemes, albeit with the 5 notable exception of global mean characteristics (see Table 3). This result highlights the potential benefits of further extending the inversion problem to viscosity parameters.

Known issues
State estimation should aim towards universality and completeness (see Wunsch and Heimbach, 2013a, for a review). Thus, its practice always warrants continuous improve-10 ment in many respects. In ECCO v4, without trying to be exhaustive, one can distinguish at least three types of issues. Firstly, the ECCO-Production, release 1 state estimate would benefit from further optimization, with additional data sets, controls, and refined error covariance specifications. The addition of turbulent transport parameters is a step forward, but their 15 specified covariances remain very imprecise. Parametric error in the momentum equations also deserves further attention, since it may limit model controllability. Error covariances between adjustable control parameters (e.g. atmospheric variables) are also neglected. A permanent issue is the need for additional data constraints, particularly in the abyss (Wunsch and Heimbach, 2014). Amongst available data that is not yet in 20 ECCO v4, the growing bio-geochemistry data base is becoming a priority.
Secondly, the lack of "posterior" error estimates is regarded as the most outstanding issue with ECCO-Production, release 1. Producing formal error estimates, at a reasonable computational expense and with acceptable precision, for the full, evolving ocean state would be another major breakthrough. In principle, a number of methods are Introduction sian distribution, can be readily related to the posterior error covariance (see Kalmikov and Heimbach, 2014). Also a possibly useful estimate of uncertainty in ECCO v4 may follow from computing the spread amongst available ocean data syntheses, although it is unclear how such ensemble spreads should be interpreted (Balmaseda et al., 2015). Thirdly, the ECCO v4 model setup could be extended and improved, with possibly 5 important implications for the state estime. The lack of atmospheric, land, and biogeochemistry components is an obvious limitation of ECCO v4 at this stage. The surface boundary conditions and seaice model settings require further assessment. Issues such as the use of the Boussinesq approximation (in Eqs. 1-5), the omission of geothermal heating (Piecuch et al., 2015), and the lack of a coastal wetting/drying 10 mechanism are matters for further MITgcm development that are also of importance to state estimation.

Conclusion and perspectives
This paper emphasizes the synergy between ocean modeling and data analysis. The entanglement of models and observations is nothing new -Ekman (1905), Sverdrup

15
(1947), Munk (1966) and Wunsch (1977) are just a few historical examples. The synergy of ocean modeling and data analysis is further becoming a reality as a growing community engages in ocean state estimation, which in essence is the hybridization of ocean modeling and data analysis. What is different now merely is the level of (in)completeness, complexity, and diversity of the models and observations being 20 employed in modern oceanographic and climate science. The scope and size of the ocean state estimation problem tackled in ECCO v4 requires collaborative research and production activities. This unescapable conclusion leads to this attempt at offering ECCO v4 as a fully integrated framework for non-linear inverse modeling and global ocean state estimation. Along with the MITgcm and its adjoint capability, the ECCO v4 Each component of the framework is being (re)designed to be modular and of general applicability, as they all are thought to provide valuable stand-alone pieces to different degrees. Standardized in-situ data sets in particular, while a by-product of carrying out ECCO v4, allow for a variety of scientific analyses in their own right. For example they are used for analyses of observed variance that is never fully represented in nu-5 merical model solutions (Forget and Wunsch, 2007;Forget, 2015a), of water masses volumetric census (Forget et al., 2011;Speer and Forget, 2013), and of macro turbulence (McCaffrey et al., 2015) and mixing (Forget, 2015b). A complementary description of the standardized in situ observations and related ECCO v4 components is provided in Appendix, directed towards users of in-situ observations. 10 As another example, the gcmfaces Matlab framework (Appendix C) is suitable for the analysis of gridded earth variables (whether observational or modeled) beyond the ECCO v4 model setup and state estimate. At this stage it has already been applied to analyze MITgcm simulations on various grids, and to a variety of observational data sets. Interfacing gcmfaces with output from models other than MITgcm would allow 15 for rigorous model intercomparisons without the need to introduce errors through interpolation. As a final example, any interested modeling group should be able to take advantage of the global grids.
The state estimate and the MITgcm are highly integrated with each other. Beyond the few aspects of the solution that have been investigated in some detail, the MITgcm 20 provides numerous prognostic and diagnostic capabilities that remain to be applied to, or employed within, ECCO v4. The "ctrl", "ecco" and "profiles" packages, are just examples of the many MITgcm packages. The last two diagnose model-data misfits and statistics. In contrast, the "ctrl" package defines control parameters that act upon the forward prognostic equations. It also lends itself to development of new parameteriza- 25 tions. Note that the roles of these packages (diagnosing or acting on the solution) are reversed in the adjoint. Amongst forward prognostic MITgcm packages not yet used in ECCO v4, biogeochemistry and simplified atmospheres (Dutkiewicz et al., 2005;Follows et al., 2007;Marshall et al., 2007a;Ferreira et al., 2011)  out, as they offer a great potential for extending ocean state estimation. The adjoint capabilities of MITgcm further allow for computations of sensitivity, Green functions, singular value decomposition, mechanistic attribution of variability, optimal observation design (Marotzke et al., 1999;Köhl and Stammer, 2004;Fukumori et al., 2007Fukumori et al., , 2015Heimbach et al., 2011;Zanna et al., 2011).
Furthermore, the MITgcm provides a convenient platform for parallel computing and variational estimation that allows for, but is not limited to, ocean data synthesis and analysis (Hoteit et al., 2013;Goldberg and Heimbach, 2013). Optimal interpolation (OI) of an individual variable, for instance, can readily be carried out using Eq. (12) and its adjoint with M = I (i.e. the identity operator) as illustrated by Forget (2010). In 10 between OI and full ocean state estimation, and beyond, lie many interesting stages and possibilities. For instance, stand alone bulk formulae configurations (available at mitgcm.org, with or without seaice) could readily allow for assessment and optimization of air-sea fluxes (along the lines of, e.g., Yu and Weller, 2007;Maze et al., 2009). The (re)implementation of Eq. (12) within MITgcm provides a versatile environment for such 15 projects, and for variational estimation purposes most generally (and is complementary to, e.g. Barth et al., 2014;Wilson et al., 2014;Hoppe et al., 2014).
It is expected that any of the ECCO v4 components listed in Table 10 will eventually be replaced. Most immediately, the specifics of the ocean state estimation problem (grid, forcing, ocean and seaice model settings, control parameters, observational 20 constraints) can all be refined or substituted for improved components. Our continued commitment is to make every updated component freely and fully available online as soon as possible. All of the Fortran and Matlab components are already available, and served through the CVS server of MITgcm, where they were added in real time and with free access over the years (Sect. "Code availability"). The state estimate monthly out-25 put and the model-data distance (data, model counterparts, and uncertainty) for in situ profiles are also readily available. The rest of the numerical input and output requires additional processing and web interfacing -and is for now instead made available upon email request (Sect. "Code availability"). Furthermore, at the present time, taking full advantage of the ECCO v4 framework (Table 10) requires two third party commercial tools that are neither free nor open source: Matlab and TAF. The ability to successfully generate efficient adjoint code using alternative open-source tools, such as OpenAD or Tapenade is gaining increasing priority. Despite its limitations, Matlab is one of the most portable, integrated and popu-5 lar analysis framework, and it is expected to remain as such for the foreseeable future. However, a Python analysis framework similar to gcmfaces is in planning and should better handle massive output from high resolution models (R. Abernathey, personal communication, 2014).

GMDD
Gridded observational products (such as hydrography climatologies, ocean state es-10 timates, etc.) are commonly used as a practical shorthand to observations. It should be stressed that a gridded field in itself does not provide any information about its errors. Therefore, and since direct observational constraints are unevenly distributed and restricted to a few variables, users of the state estimate are strongly encouraged to consider the underlying observational data base. This being said, and despite the 15 need for continued improvement, the usefulness and scientific value of the ECCO v4 solution is by now largely documented in a number of papers (Speer and Forget, 2013;Heimbach, 2013b, 2014;Buckley et al., 2014Buckley et al., , 2015Forget and Ponte, 2015;Forget, 2015a, b;Liang et al., 2015;Fukumori et al., 2015;Balmaseda et al., 2015). 20 As compared with earlier ECCO solutions, the ECCO-Production, release 1 state estimate (i.e. the baseline ECCO v4 solution) benefits from an extensive revisit of model settings. The improved fit to in situ hydrography (Argo profiles of T and S in particular) as compared with earlier ECCO solutions may be the defining characteristic of ECCO-Production, release 1. The inclusion of turbulent transport parameters in the set of 25 adjustable control parameters was instrumental in achieving that goal -their inversion from hydrography observations is further assessed in Forget (2015b). Nevertheless, it should not be assumed that broad scale misfits to observations are completely absent (e.g. see Fig. 10 being provided with capabilities to assess model-data misfits for themselves. More generally, it should not be assumed that all ocean state variables are fully constrained by observations. Integrated transports, global averages, etc. are not directly observed, and it is a priori unclear how well they can be constrained by available observations (see Forget et al., 2008a, b;Heimbach et al., 2009;Forget, 2015b). 5 Looking to the future, the need for associating formal error estimates with the full, evolving ocean state remains of utmost importance. Aside from this aspect, extensions of the state estimation framework to include other climate components (atmosphere, land, cryosphere) and different variables (biology, chemistry) would be desirable (see, e.g. Blessing et al., 2014;Prinn et al., 2011). By providing ECCO v4 as a fully integrated framework along with a useful baseline solution that any interested investigator should be able to reproduce for the foreseeable future, the authors aim to stimulate independent research along those lines.
The overarching scientific problem (set aside technicalities) to data-model combination lies in the attribution of errors amongst the various elements of Eq. (12). We make 15 no claim to having achieved the proper attribution of errors, but experience gathered in developing ECCO v4 suggests that a paradigm shift, as compared with earlier ECCO publications, is in order. Our results indeed indicate that internal parameters are of first order importance to state estimation, and to fitting the observed hydrography in particular (Table 8). Our assessment is in contrast with that of Liu et al. (2012) who 20 suggest that the importance of internal parameters is of order 10-20 % depending on the model variable of interest. Furthermore the inversion of parameters in the momentum equations, which has received comparatively little attention, emerges as a topic of importance as one gets closer to observed data, and is expected to gain further importance as resolution increases. To provide a frame of reference for future research 25 along those lines, a first attempt at defining and gauging various categories of model uncertainty has been presented.
Alleviating structural model errors is a prerequisite to improved dynamical interpolation of observations. In this regard, the main improvement compared with previous Parametric and external model uncertainty (Table 8) generally appear to dominate over structural model uncertainty (Table 3) as illustrated by Fig. 11. Such a conclusion most likely depends on spatial resolution, the chosen 20 year duration, and the necessarily limited array of model settings being considered in Tables 3-8. In particular, we expect that the choice of momentum schemes would be more important in eddy- 15 resolving models, as kinetic energy overcomes potential energy at the meso-scale. Examples of large structural uncertainty in eddy permitting models can be found in Barnier et al. (2006) and subsequent studies. Here, however, the estimated control parameter adjustments appear to determine the solution beyond the level of structural model uncertainty (Sect. 5.3). 20 Parametric model uncertainty (associated here with interior turbulent transports) and external model uncertainty (associated here with surface forcing fields) appear to be of comparable magnitude (Table 8). Depending on the characteristic of interest, one predominates over the other. Hence, the importance of including turbulent transport parameters, which are highly uncertain, in the control vector cannot be overstated. Introduction

Appendix A: Grid generation method
At high-latitude, the LLC mesh is generated numerically by adapting the two dimensional conformal mapping algorithm developed by Zacharias and Ives in the 1980s (see Ives and Zacharias, 1989;Trefethen, 1989;Wilkin and Hedström, 1998) to spherical geometry. The approach is similar to that used in the SeaGRID package (Den- To numerically mesh each sub-domain it is first conformally projected onto a plane, using a polar stereographic transformation. The result is then conformally mapped to a rectangular shape by iteratively applying the so-called "hinge-point" or "power" transformation to each of the four arc segments that make up the sub-domain edges. The 15 transformation works with points (x, y) in the complex plane x + i y and applies the mapping ω = (x +i y) P . The transformation is applied iteratively to adjacent pairs of discrete line segments that define the sub-domain edges. The transformation adjusts P at each iteration for successive line segment pairs, so that the angle between adjacent segments is adjusted to be π 2 at corners and π for all intermediate segments. 20 The result of the transformation is a rectangular shape in a new coordinate space denoted by coordinates ζ and η. The rectangular shape has two edges that are line segments of constant ζ and two edges that are lines of constant η. The points that define the line segments have corresponding mappings to the line segment points in the original (x,y) coordinate system. A set of x and y locations that describe orthogonal Laplace equations (Ryskin and Leal, 1983) of the form over the (ζ ,η) rectangular shape and subject to the respective boundary conditions X = x and Y = y on the respective ζ = constant and η = constant rectangular shape 5 edges.

Appendix B: Time stepping
The time-discretized version of Eqs.
(1)- (5) and (7) calculate the updated state (v n+1 , w n+1 , η n+1 , θ n+3/2 , S n+3/2 ) at time t + ∆t from the current state at time t (v n , w n , η n , θ n+1/2 , S n+1/2 ) following: where u represents the three components velocity vector (u, v, w), u b the bolus velocity and A( ) denotes the advection term. Momentum advection and the Coriolis term are evaluated at time t from v n , w n in 5 G n v = −(f +ζ )k×v −∇ z * KE−w ∂v ∂z and the resulting tendency (G n v ) is extrapolated forward in time to t + ∆t/2 using the Adams-Bashforth 3 (AB-3) scheme: Here we use (α AB , β AB ) = (1/2, 0.281105) to improve the stability (Shchepetkin and McWilliams, 2005) compared to the true 3rd order in time Adams-Bashforth 10 (α AB , β AB ) = (1/2, 5/12). The precision of the scheme drops to just 2nd order accuracy with little consequences here since most of the other terms are also 2nd order in time (tracer time-stepping, internal-waves dynamics). Note that the precision is still improved compared to the quasi-AB-2 used in previous ECCO configurations which become only 1rst order accurate with stabilization factor ( AB ∼ 0.1).
Simple eulerian time stepping (1st order, forward in time) is used in D n z * ,v for horizontal dissipation (harmonic and bi-harmonic viscosity) and quadratic bottom drag. Using a quasi-AB-2 scheme instead (as in previous ECCO configurations) would reduce the stability limit from 1 to 0.9 (for pure damping term, with AB = 0.1). AB-3 would reduce it even further to 0.55 and therefore was not considered here. For stability reason also, 20 a backward time stepping is used for the other dissipation term in Eq. (B2) (i.e. D n+1 ⊥,v ) that represents vertical viscosity effects in the interior, except bottom friction: This vertical shear term is independent of the vertically integrated pressure gradient contribution (g∇ z * η n+1 ), so that these two operations commute. This allows to find D ⊥,v n+1 even before knowing η n+1 by solving a tri-diagonal system in each water column. The updated η n+1 is found by combining Eqs. (B2) and (B3) to form a 2-D elliptic 5 Poisson equation for surface pressure (pressure method) which is solved iteratively using the conjugate-gradient method (Marshall et al., 1997). Solver matrix and preconditioner are updated at each time-step as the water column height changes due to the non-linear free-surface (Campin et al., 2004 (Gaspar et al., 1990), as well as isopycnal diffusion (K σ ; Sect. 3.3). The effect of unresolved eddies parameterized as a bolus velocity (v b ) advecting tracers (Gent and Mcwilliams, 1990) parameters, including v b , isopycnal slope (α x , α y ) and vertical diffusivity and viscosity (K ⊥ , ν ⊥ ) are computed at the beginning of the time-step from the current state. Isopycnal diffusivity (K σ ) is discretized as a tensor (Redi, 1982) where all the terms are treated explicitly (i.e., as a function of θ n+1/2 gradient) except for the pure vertical ∂z ) where |α| = (α 2 x +α 2 y ) 1/2 denotes the magnitude of the isopy-20 cnal slope. The pure vertical component is combined with K ⊥ and applied to the future tracer field (θ n+3/2 ) using a backward time-stepping, leading to: Horizontal advection (second term in Eq. B7) uses the 3rd order direct space and time (DST-3) advection scheme (MITgcm Group, 2002;Adcroft et al., 2004b) with the direc-5 tion splitting method (also called multi-dimensional advection) as described in Adcroft et al. (2004b). The tracer field (m = 2) obtained after applying 1-D advection (in X or Y direction) on current tracer (m = 1) is used to compute the advective fluxes in the other direction (Y or X ) and ensures 2nd order accuracy in space and time. Regarding vertical advection, the backward time stepping (unconditionally stable) is applied with 3rd order advection scheme; this involves solving a penta-diagonal system (with some additional contributions from vertical mixing to the 3 main diagonals) for each column.
In particular, this choice alleviates adjoint stability restrictions.

Appendix C: Diagnostics
The MITgcm "diagnostics" package is generally used to generate binary output for 15 offline analysis of the solutions. In the case of the LLC90 grid, a two-dimensional field is thus output as an array of size 90 × 1170. It can easily be re-organized according to Table 11 to match the MITgcm layout of the LLC90 grid (Fig. 12). The state estimate output is made available online in a tiled netcdf format (nctiles) where each tile is a 90× 90 subdivision of a face (i.e. of f1, f2, f3, f4 or f5 in LLC fields, it is only the LL sector that can readily be re-assembled as a single twodimensional array. A simple Matlab script is provided to this end (eccov4_lonlat.m; see Sect. "Code availability"). It is mainly intended for users of earlier non-global ECCO estimates that may want to re-use their old analysis codes. ECCO v4 users are generally advised against interpolating, which introduces errors, and often precludes accurate 5 transport computations. Instead, mimicking the gridded earth decomposition of general circulation models is regarded as the most convenient, robust and general way to carry out offline analyses of the solutions. This approach is readily implemented in Matlab by the gcmfaces toolbox. It defines a class of objects (the gcmfaces class depicted in Table 11) that is a natural extension 10 to the common array class. Basic operators (such as "+") are readily overloaded (i.e. re-defined) for the gcmfaces class. Thus, for example, the addition of two gcmfaces objects can simply be written in the compact and general "fld1 + fld2" form -exactly as if fld1 and fld2 were two array objects. Note that the grid-specific organization of the binary data (e.g. Table 11) does not appear in "fld1 + fld2", which reflects that this compact code is immediately applicable to all supported grids (Fig. 1).
Transport and budget computations are coded with the same degree of generality within gcmfaces. Hard-coding array sizes or exploiting specific grid symmetries (e.g. the zonal symmetry of the LL grid) is excluded, in order to avoid having to re-code the same diagnostics on different grids. Two basic elements are instrumental to the generality 20 of gcmfaces codes, which are worth noting here. Firstly, any transport is computed following a grid line path, as illustrated in Fig. 13. Three types of paths are readily treated in a general fashion: small circles of constant latitude, great circles defined by two points (as shown in Fig. 13), and the edge of a specified subdomain. Secondly, the familiar mechanism 13 by which rows and columns of neighboring faces are appended 25 at the edges of an array (e.g. to f1, f2, f3, f4 or f5 in Table 11) is readily implemented. This yields general codes for gradient, rotational, divergence, etc. computations that are immediately applicable to all supported grids (Fig. 1). From the state estimate output made available online, users can readily re-compute the gcmfaces standard analysis. The standard analysis document serves as a general documentation of the state estimate, and allows for a direct comparison with other MITgcm simulations regardless of the grid-specifics. It proceeds in two steps: diags_driver('release1/','release1/mat/',1992:2011); 5 diags_driver_tex('release1/mat/',{},'release1/tex/standardAnalysis');
Diagnosing mass, heat and salt budgets requires snapshots of the 10 ocean + seaice + snow model state (to compute the tendency terms), as well as time averaged fluxes between snapshots (to match the tendency terms). The MITgcm flux output accounts for variations of layer thicknesses in z * coordinate. Tendency terms are computed after the fact using snapshots of e.g. η and θ (Sect. 3.1). The assembled mass, heat and salt budgets are provided online in the extensive form (in 15 kg s −1 , J s −1 , g s −1 respectively) and in nctiles format (monthly, three-dimensional).
The budgets residuals are less than 10 −6 times the budget magnitude (a Euclidean norm is used). Here "mass budget" simply denotes the constant Boussinesq density ρ c times volume -in contrast with the hydrostatic pressure budget that is most directly relevant to diagnosis of sea level variability (Forget and Ponte, 2015

Appendix D: Profiles
The MITgcm "profiles" package subsamples the model solution, while it is being computed, at the locations and times of observed in situ profiles. At model initialization, observed profiles dates and locations are read from MITprof files (see below) and each profile is allocated to the processor corresponding to its sub-domain tile. The latter is 5 generally facilitated by a pre-processing step: observed profiles are collocated with grid points using gcmfaces (see Appendix C) and grid locations added to the MITgcm input files. During model integration, profiles are sampled at time steps and locations closest to observations, vertically interpolated to the MITprof depth levels, and written to file. At the end of the forward model integration, these profiles are re-read from file along 10 with observed and weight profiles, and the normalized distance between modeled and observed profiles is computed (see Sect. 4). MITprof files contain in situ profiles (prof_T and prof_S) as well as corresponding state estimate profiles (prof_Testim and prof_Sestim) and least square weights (prof_Tweight and prof_Sweight) as illustrated in Fig. 15. Weights are set according The MITprof format contains a limited amount of ancillary information: profile locations, dates, and an identifying code (prof_descr). This choice, along with the use of standard depth levels, yields data sets that are both more compact and simpler than 25 most data center formats (e.g. the Argo format), providing easy access to vast collections of profiles of various origins ( (e.g. for shipboard CTDs) or an instrument ID (e.g. for Argo profiles). They are informative of the data origin, and used for analyses of transects or time series. As part of the MITprof Matlab toolbox, the pre-processing of in situ profiles consists of four basic steps: (1) applying relevant data quality flags, if provided by data center, (2) converting in-situ to potential temperature or pressure to depth, if needed, (3) inter-5 polating to standard depth levels 14 , (4) resetting weights to 0 for standard levels that are not closest neighbors to observed levels, for S outside the 25-42 range, and when jT (resp. jS) exceeds 50 (i.e. 7 standard errors) when computed for an Argo-based atlas (Forget, 2010). Zero weights thus indicate suspicious data points that users are advised to discard.

Appendix E: Smooth
The MITgcm "smooth" package is an implementation of recipes presented in detail by Weaver and Courtier (2001). At the core of the method, a diffusion equation is time integrated to smooth a field. Applying the smoother directly (without additional factors) to model-data misfits (as part of P in Eq. 13) yields a practical method to omit scales 15 at which observations and models are not expected to be consistent with each other. This approach is useful, for example, to constrain eddying models to coarse grained climatological fields, or to constrain models with along-track altimetric data (Forget and Ponte, 2015).
When the smoother is applied to uncorrelated grid scale noise, the resulting fields 20 have a Gaussian correlation (Fig. 14) with a e-folding scale L determined by the joint specification of integration time and diffusivity. The noise amplitude reduction by the smoother (Fig. 14,  Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | tial correlation operator that conserves variance (in the case of uncorrelated noise). A spatial covariance operator is then immediately obtained by further multiplying the normalized smoother with a specified error field, and grid cell areas or volumes are used as a preconditioner (in two-and three-dimensional cases respectively), following Weaver and Courtier (2001). 5 This method is used for all control parameter covariances (see Sect. 4.4;Q in Eq. 15). Key advantages of this method are that it is matrix free, naturally handles coast lines, and easily accommodates a variety of grids. In practice, "smooth" also damps grid scale noise that can arise from the adjoint model, and it thus facilitates optimization.

Appendix F: Benchmarking
While MITgcm evolves continuously its results are tested against benchmarks on a daily basis, with a variety of compilers, on a variety of computing platforms. These tests are carried using the "CVS" and "testreport" capabilities for short runs (a few time steps), on a small number of processors (or just one), and exclude optimization by com- 15 pilers. This design is suited to detect mistakes in code revisions and distinguish them from false positives (associated with truncation errors). The ECCO v4 model setup (Sects. 2 and 3) takes full advantage of that framework, which makes it both portable and stable (Sect. "Code availability").
Advanced usage of ECCO v4 may include re-running forward model solutions (the 20 state estimate in particular) or its adjoint. Computational requirements are modestthe 20 year forward model integration typically takes between 6 and 12 h on 96 processors. ECCO v4 users can thus easily re-run the state estimate solution to generate additional output and carry out analyses that may not already be covered by the publicly distributed material. Running the adjoint model allows for analyses of processes 25 and mechanisms (e.g. see Fukumori et al., 2015) as well for the possibility of further optimization of the state estimate.

GMDD Introduction
Conclusions References Tables  Figures   Back  Close Full Screen / Esc Printer-friendly Version

Interactive Discussion
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | While the "testreport" tool is very useful and practical, it does not directly apply to the state estimate, but rather to the underlying model code and setup. An extension to the benchmarking framework is therefore proposed that is suited for the full state estimate solution. The benchmarking tool is implemented as a self-contained Matlab routine (testreport_ecco.m). It relies upon distances to observations and monthly mean model output (Table 2). It provides a simple mechanism that allows users to verify that their 20 year solution is acceptably close to the released state estimate. The first three lines of Table 3 are reflective of small differences that user should expect when re-running the state estimate using a different computer or an updated MITgcm code. Such slight changes typically result form compiler optimization of slightly different 10 codes and slightly different arithmetic and MPI libraries.
For any given model run, model-data distances are simply read from a summary text file (typically named cost function0011) that MITgcm generates at the end of the model integration. Benchmark values are then read from a Matlab file (typically name testre-port_release1.mat) and relative differences are reported as shown in Table 3. The other 15 tests are slightly more computationally intensive as they read binary output of model fields -a subset of the fields that are distributed online as nctiles files (Appendix C). It was chosen to focus on integrated quantities (global means and transports) that are known to be model sensitive (e.g. see Table 3) and of common interest to ECCO users. Computations of monthly global mean free surface height, temperature and salinity il-20 lustrate usage of grid cell surfaces and volumes. If gcmfaces is activated in Matlab (see Appendix C and Sect. "Code availability") then a benchmarking of integrated transports can also be performed.

Appendix G: Solution history
The ECCO-Production, release 1 state estimate was produced in several phases over 25 the course of the ECCO v4 development. In total, 45 iterations were performed, and a summary of the different phases is provided below. We should stress that the docu- mented solution history reflects the progressive development of ECCO v4 -as opposed to a systematic or advocated approach to the optimization of model solutions.
The first series of 14 adjoint iterations was carried (with the MITgcm's checkpoint62k) using a non-synchronous time step (3 h for tracers, and 20 min for momentum), sea surface salinity relaxation to climatological values, and the linear free surface method. 5 Revision 1 was the switch to the one hour time step (for both tracers and momentum) and to the non linear free surface, followed by 14 adjoint iterations (with checkpoint62y).
In revision 2, the Duffy et al. (1999) parameterization as implemented for the MITgcm by Nguyen et al. (2009) was added, the solution was extended through 2011, and 13 more adjoint iterations were carried out (with checkpoint63g). In revision 3, the surface 10 salinity relaxation was removed and its effect replaced by an adjustment of precipitation controls, followed by 3 adjoint iterations (with checkpoint63r). The resulting solution is used in Speer and Forget (2013); Wunsch and Heimbach (2013a, b);Buckley et al. (2014);Balmaseda et al. (2015). In revision 4, the adjustment of precipitation from revision 3 was removed, followed by 8 adjoint iterations (with checkpoint64f). 15 Up to this point (revision 4, iteration 8) time-variable global mean sea level had been omitted from the altimetric constraints -letting the other data constraints (from in situ hydrography, SST and regional altimetry primarily) determine the solution variability. Then, revision 4 iteration 9 consisted in estimating a time variable global mean precipitation adjustment under the sole constraint of fitting the time-variable global mean 20 altimetry. This operation had very little influence on the rest of model-data misfitsconsistent with the analysis presented in Sect. 5.1. This solution is used in Forget and Ponte (2015).
Revision 4 iteration 10 consisted in a filtering of atmospheric control parameters adjustments to reduce irregularities in the forcing that had appeared during adjoint 25 iterations. To further reduce dynamical imbalances during the first years of integration, the initial state of 1 January 1992 as adjusted during the adjoint iterations was replaced with the state of 1 January 1995. This solution is used in Wunsch and Heimbach (2014). Finally, revision 4 iteration 11 consisted in a reduction of vertical viscosity Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | to 5 × 10 −5 m 2 s −1 to reduce a low bias in the Equatorial Undercurrent velocity, albeit with little impact on model-data misfits. Revision 4 iteration 11 is the ECCO-Production, release 1 state estimate, which originally ran with MITgcm's checkpoint64t. For benchmarking purposes (Appendix F) the 20 year solution is re-run once a month with the up-to-date MITgcm. As of MITgcm's 5 checkpoint65i, it matches the original solution within the precision seen in Table 3 (top three rows).

Code availability
The MITgcm is developed and maintained within Concurrent Versions System (CVS). This framework allows users to download frozen versions of the model code (check-oint65i at the time of writing) or to maintain their local copy up to date. The evolving code is benchmarked on a daily basis using the "testreport" capability (Appendix F). Documentation for the MITgcm itself, the CVS framework, and the "testreport" capability can respectively be found at: http://mitgcm.org/public/r2_manual/latest/online_documents/manual.pdf 15 http://mitgcm.org/public/using_cvs.html http://mitgcm.org/public/devel_HOWTO/devel_HOWTO.pdf The ECCO v4 model setup (Sects. 2 and 3) exploits the MITgcm CVS and testreport capabilities, to allow any interested user to obtain the up-to-date setup and re-run the short ECCO v4 benchmark (Appendix F). Results of the automated daily benchmarking Introduction

1025
When the smoother is applied to uncorrelated grid scale noise, the resulting fields have a Gaussian correlation (Fig. 14) with a e-folding scale L determined by the joint specification of integration time and diffusivity. The noise amplitude reduction by the smoother (Fig. 14, color scale) can be computed exactly or approximately (Weaver and Courtier, 2001). Normalizing the smoother to account for this effect yields a spatial correlation operator that conserves variance (in the case of uncorrelated noise).