Ocean models must be regularly updated through the assimilation of observations (data assimilation) in order to correctly represent the timing and locations of eddies. Since initial conditions play an important role in the quality of short-term ocean forecasts, an effective data assimilation scheme to produce accurate state estimates is key to improving prediction. Western boundary current regions, such as the East Australia Current system, are highly variable regions, making them particularly challenging to model and predict. This study assesses the performance of two ocean data assimilation systems in the East Australian Current system over a 2-year period. We compare the time-dependent 4-dimensional variational (4D-Var) data assimilation system with the more computationally efficient, time-independent ensemble optimal interpolation (EnOI) system, across a common modelling and observational framework. Both systems assimilate the same observations: satellite-derived sea surface height, sea surface temperature, vertical profiles of temperature and salinity (from Argo floats), and temperature profiles from expendable bathythermographs. We analyse both systems' performance against independent data that are withheld, allowing a thorough analysis of system performance. The 4D-Var system is 25 times more expensive but outperforms the EnOI system against both assimilated and independent observations at the surface and subsurface. For forecast horizons of 5 d, root-mean-squared forecast errors are 20 %–60 % higher for the EnOI system compared to the 4D-Var system. The 4D-Var system, which assimilates observations over 5 d windows, provides a smoother transition from the end of the forecast to the subsequent analysis field. The EnOI system displays elevated low-frequency (

The predictive performances of two ocean data assimilation systems (EnOI and 4D-Var) are assessed in a Regional Ocean Modeling System (ROMS) configuration of the East Australian Current over 5 d forecast horizons.

The forecast skill of the 4D-Var system surpasses the EnOI system against both assimilated and independent observations at the surface and subsurface.

The EnOI system has greater analysis increments, elevated low-frequency (

The dynamically balanced 4D-Var system displays elevated energy in the near-inertial range throughout the water column, with the wavenumber kinetic energy spectra remaining unchanged upon assimilation.

Data assimilation (DA), the combination of numerical modelling and observations, is essential to produce accurate forecasts of the atmosphere or ocean circulation. The goal of any DA scheme is to combine observations and a numerical model such that the result is a better estimate of the ocean circulation than either alone. Observations provide sparse data points, while the model provides context. Since initial conditions play an important role in forecast quality, accurate and dynamically consistent state estimates are key to improving prediction. This study focuses on the comparison of two DA techniques applied to forecasting the ocean mesoscale circulation in a highly dynamic oceanic region.

Mesoscale eddies exist throughout the global ocean and contain more than half of the kinetic energy of the ocean circulation. Western boundary current (WBC) regions are hotspots of high eddy variability as eddies emerge due to instabilities in the strong boundary current flow. The high mesoscale eddy variability

The East Australian Current (EAC), the WBC of the South Pacific subtropical gyre (Fig.

Various DA techniques of differing complexity exist for combining a model estimate of the ocean state with ocean observations. Simpler, computationally efficient, time-independent methods such as 3-dimensional variational data assimilation (3D-Var) and ensemble optimal interpolation (EnOI) centre the observations and model on a single time and are capable of resolving slowly evolving flows governed by simple balance relationships at synoptic scales. These methods have provided useful state estimates and predictions. For example, the European Centre for Medium-Range Weather Forecasts uses 3D-Var to produce initial conditions for its coupled ocean–atmosphere modelling system

With increasing computational capacity and the pursuit of more accurate weather and ocean forecasts over the last 2 decades, a shift has been made to more advanced, time-dependent DA techniques

In 4D-Var the model and observations are combined using subsequent iterations of the tangent linear and adjoint models to compute increments in the forecast model (initial conditions, boundary conditions, and surface forcing) such that the difference between the new model solution and the observations is minimised over a time window

Indeed, with the shift to more advanced DA techniques in ocean forecasting, it is important to quantify the improvements gained. Here we use a Regional Ocean Modeling System (ROMS) configuration of a dynamic WBC (the EAC) to compare two DA methods in a quantifiable manner. We compare the time-independent DA technique (EnOI) with the time-dependent technique (4D-Var) using the same numerical model configuration and suite of observations. We quantify the differences in predictive skill achieved by the two systems against assimilated and independent observations at the surface and subsurface. We focus our analysis on the performance of the short-range (5 d) forecasts. After presenting the experiments (Sect.

We use the Regional Ocean Modeling System (ROMS) to simulate the eddying ocean circulation off the southeastern coast of Australia between January 2012 and December 2013. This modelling suite is named the South East Australian Coastal Ocean Forecast System (SEA-COFS,

The study domain covers SE Australia from 25.25 to 41.55° S and approximately 1000 km offshore (Fig.

Initial conditions and boundary forcing are derived from the Bluelink ReANalysis version 3 (BRAN3;

The free-running configuration, while unable to reproduce the temporal evolution of the mesoscale eddies, has been shown to accurately represent the mean dynamical features of the EAC and both the surface and subsurface (0–2000 m) variability

The same set of observations is assimilated into the ROMS model configuration using the two DA systems (EnOI and 4D-Var) for comparison in this study. These include satellite-derived SSH, SST, sea surface salinity (SSS), vertical profiles of temperature and salinity from profiling Argo floats, and vertical profiles of temperature from expendable bathythermographs (XBTs) (refer to Fig.

Archiving, Validation and Interpretation of Satellite Oceanographic Data (AVISO), France, produces global, daily, gridded (

We use the gridded AVISO product to constrain SSH, rather than the along-track altimetry, for this comparison study. Current work including the development of a high-resolution coastal ocean forecast system

SST data from the US Naval Oceanographic Office's Global Area Coverage Advanced Very High Resolution Radiometer level-2 product (NAVOCEANO's GAC AVHRR L2P SST) are used for this study. Data are available 2–3 times per day. We remove day-time SST observations and any night-time observations when wind speed

We use the Level-3 gridded sea surface salinity (SSS) product derived from the National Aeronautics and Space Administration (NASA) Aquarius satellite (

Argo (free-drifting profiling) floats measure temperature and salinity of the upper 2000 m of the global ocean (

Expendable bathythermographs (XBTs) collect temperature profiles along repeat lines sampled by merchant ships; the Sydney–Wellington (PX34) and the Brisbane–Fiji (PX30) routes intersect our model domain (Fig.

A suite of additional observations, collected as part of Australia's Integrated Marine Observing System (IMOS), was also available over the simulation period (2012–2013). These include surface velocity measurements from high-frequency coastal radar (HF radar); temperature, salinity, and velocity observations from continental-shelf moorings off the coast of New South Wales (NSW) and South East Queensland (SEQ); temperature, salinity, and velocity observations from five deep-water moorings across the core of the EAC at 28° S (EAC array); and temperature and salinity observations from ocean gliders (refer to Fig.

In this paper, we refer to three different configurations of the SEA-COFS model, which differ in DA type and/or the observations assimilated. Each case is run over the 2-year period from January 2012 to December 2013 and is described below.

4D-Var TRAD refers to the 4D-Var system that assimilates “traditionally” available observations (SSH, SST, SSS, Argo, and XBT). This system is similar to the system described in

EnOI TRAD refers to the system that assimilates the same observations as the 4D-Var TRAD but using the EnOI DA method described in Sect.

4D-Var FULL refers to the 4D-Var system that assimilates all available observations (SSH, SST, SSS, Argo, XBT, HF radar, shelf and deep moorings, and glider data). It is similar to the system described in detail in

A detailed comparison of the 4D-Var TRAD and the FULL systems was presented in

The classic state estimation problem can be given by

For time-dependent methods (4D-Var and EnKF), observations are assimilated over a time window respecting the dynamics of the model. The observation operator

Ensemble methods (which include the time-dependent EnKF and the time-independent EnOI) use an ensemble of model anomalies to estimate the background error covariances. The EnKF allows for the time-varying statistics by using a fixed number of nonlinear model members (ensembles) to provide a statistical representation of

A challenge of ensemble methods is to determine the sufficient number of ensemble members to capture the entirety of the state space, and techniques such as localisation and inflation are used to ensure unrealistic covariances are not applied

For EnOI, there is no time dependence in

In the EnOI system used in this study, we use a stationary ensemble to represent the intraseasonal model anomalies. Each member is calculated as a difference between a 2-week model average and a 2 d average, centred at the same time. This is repeated every 30 d to ensure the anomalies are independent, generating 266 ensemble members. The DA system is run with a 1 d cycle and centred observation window, so an analysis is generated every day. For SSH, temperature, and salinity, the observation time is assumed to coincide with the analysis time, and innovations are calculated as the difference between observation and model state at the analysis time. The localisation method applied is based on local analysis
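The ensemble construction described above can be illustrated with a minimal Python sketch. It assumes that daily model fields have been flattened into a two-dimensional array; the function name, the array layout, and the sign convention of the anomaly (long average minus short average) are assumptions made for illustration and are not taken from the EnOI code used in this study.

```python
import numpy as np


def build_stationary_ensemble(daily_fields, long_window=14, short_window=2, stride=30):
    """Build intraseasonal anomaly members from a long model run.

    daily_fields : array of shape (n_days, n_state), one flattened field per day
                   (hypothetical layout).
    Each member is a 14 d mean minus a 2 d mean, both centred on the same day,
    with centres separated by `stride` days so that members are independent.
    """
    n_days = daily_fields.shape[0]
    half_long = long_window // 2
    half_short = short_window // 2
    members = []
    for centre in range(half_long, n_days - half_long + 1, stride):
        long_mean = daily_fields[centre - half_long:centre + half_long].mean(axis=0)
        short_mean = daily_fields[centre - half_short:centre + half_short].mean(axis=0)
        # Sign convention is an assumption; covariances are unaffected by it.
        members.append(long_mean - short_mean)
    return np.array(members)


def ensemble_covariance(members):
    """Sample covariance implied by the anomaly members (n_state x n_state)."""
    anomalies = members - members.mean(axis=0)
    return anomalies.T @ anomalies / (members.shape[0] - 1)
```

In practice the full covariance matrix is far too large to form explicitly for an ocean model state; `ensemble_covariance` is included only to show how the anomalies define the background error statistics.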

For comparison with the 4D-Var system, we perform a 5 d forecast from the EnOI analysis every 4 d. Initial conditions for each subsequent 5 d forecast are taken from the EnOI analysis. In this paper we compare the forecast skill of the 4D-Var and EnOI systems (not the analysis skill).

The 4D-Var system uses variational calculus to solve for increments in model initial conditions, boundary conditions, and forcing such that the differences between the observations and the new model trajectory are minimised – in a least-squares sense – over a specific assimilation window. The goal is for the model to represent all of the observations in time and space using the physics of the model and accounting for the uncertainties in the observations and background model state, producing a description of the ocean state that is dynamically balanced and a complete solution of the nonlinear model equations.

This is achieved by minimising an objective cost function,

We seek to minimise the cost function by equating the gradient to zero. The gradient of the cost function is given by

In practice, with 4D-Var, subsequent integrations of the adjoint and tangent linear models are performed to solve for an increment vector that minimises (or acceptably reduces)
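The iterative minimisation can be illustrated with a toy linear problem. The sketch below is not the ROMS 4D-Var algorithm: it replaces the tangent linear model and observation operator with a small random matrix `G` (with `G.T` standing in for the adjoint), uses diagonal covariances, and applies a textbook conjugate-gradient loop. All names, dimensions, and values are hypothetical.

```python
import numpy as np


def cost_gradient(dx, B_inv, G, R_inv, d):
    """Gradient of J = 0.5 dx' B^-1 dx + 0.5 (G dx - d)' R^-1 (G dx - d)."""
    return B_inv @ dx + G.T @ (R_inv @ (G @ dx - d))


def minimise_inner_loops(B_inv, G, R_inv, d, n_inner=15, tol=1e-20):
    """Conjugate-gradient search for the increment dx.

    Each iteration needs one application of G (tangent linear stand-in)
    and one of G.T (adjoint stand-in).
    """
    dx = np.zeros(B_inv.shape[0])
    r = -cost_gradient(dx, B_inv, G, R_inv, d)  # residual is minus the gradient
    p = r.copy()
    for _ in range(n_inner):
        rr = r @ r
        if rr < tol:
            break
        Ap = B_inv @ p + G.T @ (R_inv @ (G @ p))  # Hessian-vector product
        alpha = rr / (p @ Ap)
        dx = dx + alpha * p
        r = r - alpha * Ap
        p = r + ((r @ r) / rr) * p
    return dx


# Toy problem: 6 state variables, 4 observations (hypothetical sizes).
rng = np.random.default_rng(0)
B = np.diag(np.linspace(0.5, 1.5, 6))   # background error covariance
R = 0.1 * np.eye(4)                     # observation error covariance
G = rng.standard_normal((4, 6))         # linearised model + observation operator
d = rng.standard_normal(4)              # innovation (observation minus background)

dx_iter = minimise_inner_loops(np.linalg.inv(B), G, np.linalg.inv(R), d)
# Closed-form solution of the same linear problem, for comparison.
dx_direct = B @ G.T @ np.linalg.solve(G @ B @ G.T + R, d)
```

For this small linear case the iterative and closed-form increments agree; in a real system the closed form is unavailable and only the iterative route is practical.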

To solve for the nonlinear ocean solution that better represents the observations, we must take into account the uncertainties in the system. As such, the background (prior model) error covariance matrix,

Correlation lengths assumed for the control vector elements: 4D-Var system.

Because we use the linearised model equations, the assimilation window length is limited by the time over which the tangent linear assumption remains reasonable (although longer windows have been shown to produce useful results). For the 4D-Var system presented in this study, we find that a 5 d assimilation window is reasonable. We adjust the model initial conditions, boundary conditions, and surface forcing such that the new model solution (the analysis) better represents the observations over the assimilation interval. Open boundary conditions are adjusted every 12 h and surface forcing every 3 h. A 5 d analysis is generated every 4 d (that is, there is a 1 d overlap between the analyses). Initial conditions for the subsequent 5 d forecast are taken from day 4 of the previous analysis. The ROMS 4D-Var formulation and implementation is well described by

As discussed above, the way by which the observations and the model background are combined to generate the analysis is quite different for the 4D-Var and EnOI methods. Another significant difference is the computational expense. For the 15 inner loops and single outer loop used in this study, the 4D-Var data assimilation process is approximately 50 times more expensive than a single free run, making it 25 times more expensive than the EnOI system (once the stationary ensemble has been generated).

This is comparable to the expense of an EnKF using 50 ensemble members. The advantage of EnKF (over 4D-Var) is that the tangent linear and adjoint models are not required, all calculations are performed in nonlinear space, and the ensemble members can be run simultaneously if sufficient computing resources are available. The drawback is underdispersion of the ensemble and the loss of dynamic consistency introduced through localisation and inflation. With a 4D-Var system, the use of the adjoint model can provide useful insight into the sensitivity of the ocean state to prior changes in state variables or forcings (e.g.

Future work aims to compare the EnKF and 4D-Var methods and explore hybrid ensemble–4D-Var methods that capitalise on the advantages of both (i.e. the dynamical interpolation properties of the adjoint used in 4D-Var and the explicit flow-dependent error covariances of the EnKF

We begin by assessing the performance of the EnOI and 4D-Var systems relative to the observations that the systems assimilate. The 5 d model forecast is compared to the observations that become available over those 5 d (that is, they have not yet been assimilated) to quantitatively assess the performance of the model forecasts over time. Comparing forecasts against observations provides an objective assessment of system performance.

Table

Summary of performance of the EnOI and 4D-Var systems. Obs num refers to the average number of observations per 5 d assimilation window.

The performance of the two systems relative to SSH, SST, and Argo observations is presented in more detail using the root-mean-square difference (RMSD) between the model forecasts at the observation locations and the observation values. Figure
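As an illustration of the RMSD metric, the following Python sketch computes the RMSD for each day of the forecast window. It assumes the forecast has already been interpolated to the observation times and locations; the function and argument names are hypothetical.

```python
import numpy as np


def rmsd_by_forecast_day(forecast_at_obs, obs, forecast_day, n_days=5):
    """Root-mean-square difference between forecast and observations per forecast day.

    forecast_at_obs : model forecast values sampled at the observation points
    obs             : observed values
    forecast_day    : integer day of the forecast window (1..n_days) for each pair
    Returns an array of length n_days (NaN where a day has no observations).
    """
    forecast_at_obs = np.asarray(forecast_at_obs, dtype=float)
    obs = np.asarray(obs, dtype=float)
    forecast_day = np.asarray(forecast_day)
    rmsd = np.full(n_days, np.nan)
    for day in range(1, n_days + 1):
        mask = forecast_day == day
        if mask.any():
            diff = forecast_at_obs[mask] - obs[mask]
            rmsd[day - 1] = np.sqrt(np.mean(diff ** 2))
    return rmsd
```

The same function can be applied to any observation type (for example SSH, SST, or profile data binned by depth) by passing the corresponding subset of forecast and observation pairs.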

As each forecast is initialised from the previous analysis, forecast errors typically increase over the forecast horizon. SSH forecast errors are averaged across the model domain (Fig.

In a similar manner to the SSH forecast errors in Fig.

Same as Fig.

To assess the subsurface predictive skill, we extract the 5 d model forecast values at the observation times and locations for all Argo floats that were observed in the region over the forecast window. Binning these observations with depth, we present profiles for temperature and salinity of the mean (Fig.

As described in Sect.

Under the HF radar footprint at 30° S, surface radial velocity observations from two sources are combined to compute surface velocities to about 100 km offshore, covering the shelf and shelf slope circulation. This coverage typically includes the EAC as a coherent jet and the intermittent formation of cyclonic frontal eddies inshore of the EAC

Complex correlation of daily averaged surface velocities measured by the HF radar with FULL analysis (row
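For readers unfamiliar with the metric, a complex correlation between two velocity time series can be sketched as follows. This uses a widely used definition for two-dimensional vector series; whether the means are removed beforehand and the sign convention of the phase angle are assumptions here and may differ from the processing used for the figures.

```python
import numpy as np


def complex_correlation(u_obs, v_obs, u_mod, v_mod):
    """Complex correlation between observed and modelled horizontal velocities.

    Velocities are combined as w = u + i v. The magnitude measures how well the
    two vector series covary (0 to 1); the phase is the average angle (degrees)
    by which the modelled vectors are rotated anticlockwise from the observed.
    Time means are NOT removed here (assumption).
    """
    w_obs = np.asarray(u_obs) + 1j * np.asarray(v_obs)
    w_mod = np.asarray(u_mod) + 1j * np.asarray(v_mod)
    numerator = np.mean(np.conj(w_obs) * w_mod)
    denominator = np.sqrt(np.mean(np.abs(w_obs) ** 2) * np.mean(np.abs(w_mod) ** 2))
    rho = numerator / denominator
    return np.abs(rho), np.degrees(np.angle(rho))
```

A magnitude near 1 with a phase near 0° indicates that the modelled currents match the observed currents in both variability and direction.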

Glider data over the study period (2012–2013) were predominantly available over the NSW continental shelf in water depths

Errors are lower near the surface than over the thermocline region owing to the assimilation of SST and SSS data in all three systems (4D-Var TRAD, EnOI TRAD, and 4D-Var FULL). The 4D-Var TRAD rms forecast errors for temperature are similar in magnitude and depth structure to the rms observation anomalies, and the errors do not change considerably from day 1 to day 5 of the forecast window. The EnOI errors are of similar magnitude to the 4D-Var near the surface (

For salinity, the 4D-Var and EnOI display similar forecast errors in the upper 200 m. This depth range corresponds to where the many shelf glider observations exist. Below 200 m (the off-shelf missions into the Tasman Sea), forecast errors peak at 300 m reaching 0.30 for EnOI at day 5, compared to 0.23 for 4D-Var. Similar to the Argo-observed salinity (Fig.

Subsurface velocities are measured by acoustic Doppler current profilers mounted on moorings in the EAC array, the SEQ shelf and slope, and on the NSW shelf (Fig.

As shown in both Figs.

Complex correlations between observed and modelled velocities for the 4D-Var TRAD forecast, the EnOI TRAD forecast, the FULL analysis, and the FULL forecast, at selected mooring locations, separated by window days 1, 3, and 5 (columns). Each row represents a single mooring site: EAC2 (row

We have shown that the 4D-Var TRAD system outperforms the EnOI TRAD system at the surface and subsurface when compared against both assimilated and independent observations. Improvements to temperature forecasts with 4D-Var are more pronounced in the subsurface (the upper

The model forecast,

The discontinuities presented here do not exactly correspond to the analysis increments. We have presented the differences in the ocean state between day 4 of the previous (5 d) forecast and the beginning of the subsequent forecast (which correspond to concurrent times). For 4D-Var, the ocean state at the beginning of the forecast is taken from the previous cycle analysis, and so the difference presented here represents the difference between the forecast (or the background) at day 4 and analysis at day 4 (once data assimilation has been performed on that assimilation cycle). This is essentially the “analysis increment at day 4”. However for a 4D-Var system the analysis increments typically refer to the adjustments to the initial conditions, boundary forcing, and surface forcing that are made to generate the analysis. For EnOI, the analysis increments refer to the difference between the background model and the analysis (both centred on a single time and computed daily in this case). However, here we take the analyses every 4 d and perform 5 d forecasts, and the differences presented here refer to the difference between day 4 of the forecast and the analysis that provides initial conditions for the subsequent forecast.

With 4D-Var we are able to represent the entirety of the observations collected over a time window (in this case 5 d), placing them in dynamical context using the (linearised) model equations. In contrast, EnOI performs discrete minimisations with observations centred on a single time (in this case every day). The 4D-Var estimate of the ocean state over the observation window results in smaller discontinuities between forecast cycles, on average, than the EnOI system. This is because a continuous field evolves under the nonlinear primitive equations, whereas starting a forecast from a discrete estimate can “shock” the system. The improved predictability achieved by the 4D-Var system supports the understanding that a continuous and dynamically balanced analysis field benefits the quality of subsequent predictions.

Root-mean-squared difference between the initial conditions (from the analysis) and the previous forecast field at that time for

The modelled velocities are used to compute eddy kinetic energy (EKE) and mean kinetic energy (MKE) over the 2012–2013 simulation period. MKE is given by
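For reference, the conventional definitions of MKE and EKE (kinetic energy per unit mass of the time-mean flow and of the deviations from it) can be sketched in Python as follows. This assumes the standard textbook forms; the exact expressions and averaging used in this study may differ in detail, and the function name is hypothetical.

```python
import numpy as np


def mean_and_eddy_kinetic_energy(u, v):
    """Compute MKE and EKE from velocity time series.

    u, v : arrays with time along axis 0 (any trailing spatial dimensions).
    MKE = 0.5 * (u_bar^2 + v_bar^2)
    EKE = 0.5 * mean(u'^2 + v'^2), with u' = u - u_bar and v' = v - v_bar.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    u_bar = u.mean(axis=0)
    v_bar = v.mean(axis=0)
    mke = 0.5 * (u_bar ** 2 + v_bar ** 2)
    eke = 0.5 * (((u - u_bar) ** 2).mean(axis=0) + ((v - v_bar) ** 2).mean(axis=0))
    return mke, eke
```

With these definitions, the time-mean total kinetic energy at each point is the sum of MKE and EKE.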

Comparisons of MKE above 400 m show that the EAC core is narrower and more confined to the slope in the 4D-Var system, while MKE for the EnOI system is more spread out and with higher MKE directly over the continental shelf (Fig.

The spatial structure of the EKE is similar across the two systems. Above 400 m, the EnOI system has elevated EKE over the EAC jet (Fig.

The 4D-Var simulation

Eddies can form through barotropic instability in the mean flow or baroclinic instability in the vertical density structure. It is important for a model to correctly represent these instabilities, as they represent the pathways by which eddies are generated. Following

Barotropic and baroclinic energy conversions are computed from the model forecast fields and averaged over the 2-year period (Fig.

For the 4D-Var simulation, the

When observations are assimilated the goal is to provide an improved fit to the observations while retaining a dynamically consistent ocean state that can be used as initial conditions for the subsequent forecast. The background numerical model produces an estimate of the ocean state whose frequency and wavenumber spectra are limited by the resolution of the model and the processes resolved. If the observations sample time and space scales that cannot be resolved by the model, it is standard DA practice to either remove these scales of variability from the observations or account for them in the observation error terms (e.g.

The subsurface structure of the model fields and their variability is shown in Fig.

Column 1: mean temperature across all 5 d forecasts for sections 28° S (top panels) and 34° S (bottom panels) for the 4D-Var system, the EnOI system, the difference in mean temperature between the systems (4D-Var – EnOI), the difference in mean temperature between 4D-Var and the Free-run, and the difference in mean temperature between EnOI and the Free-run. Column 2: temperature variability for the 4D-Var system, the EnOI system, the difference in variability between the systems (4D-Var – EnOI), the difference in variability between 4D-Var and the Free-run, and the difference in variability between EnOI and the Free-run. Temperature variance is computed for every 5 d forecast, averaged over all forecast windows, and the square root taken. Column 3 shows the same as column 1 but for alongshore velocity. Column 4 shows the same as column 2 but for alongshore velocity. The 4D-Var – EnOI variability panels show points chosen to present frequency spectra (Figs.

Frequency spectral analysis is first performed for all 5 d forecast windows and then averaged (Fig.
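The window-by-window procedure can be sketched in Python as follows. The sketch assumes each 5 d forecast has been sampled at a fixed interval and stored as one row of an array; the mean removal, the Hann taper, and the unnormalised power are illustrative choices and are not taken from the analysis code used here.

```python
import numpy as np


def window_averaged_spectrum(forecast_windows, dt_days):
    """Average periodogram over a set of forecast windows.

    forecast_windows : array of shape (n_windows, n_samples), one time series
                       (e.g. temperature at a point) per 5 d forecast
    dt_days          : sampling interval in days
    Returns frequencies (cycles per day) and the window-averaged power.
    """
    windows = np.asarray(forecast_windows, dtype=float)
    n_samples = windows.shape[1]
    # Remove each window's mean and taper to limit spectral leakage.
    windows = windows - windows.mean(axis=1, keepdims=True)
    taper = np.hanning(n_samples)
    power = np.abs(np.fft.rfft(windows * taper, axis=1)) ** 2
    freqs = np.fft.rfftfreq(n_samples, d=dt_days)
    return freqs, power.mean(axis=0)
```

Because each window is only 5 d long, the lowest resolved non-zero frequency is 0.2 cycles per day, so variability at longer periods cannot be separated by this approach.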

The differences between EnOI and the Free-run and EnOI and 4D-Var (as revealed in Fig.

The elevated energy in the EnOI system compared to 4D-Var and the Free-run relates to periods greater than 1 d for both temperature and velocity (Fig.

Frequency spectra in model space for temperature and alongshore velocity at the surface, 400 m, and 1000 m at 28 and 34° S. Spectra are computed for each 5 d forecast window and then averaged. Points are chosen in the core of the EAC based on the long-term alongshore velocity mean (from

Frequency spectra for the same variables and points shown in Fig.

The spatial scales of the forecast ocean state can be represented by wavenumber spectra. Here we present cross-shore wavenumber kinetic energy spectra through sections at 28 and 34° S (Fig.

At the surface all systems, except the EnOI at day 1, display consistent kinetic energy spectra at 28° S. The AVISO velocities show less energy at spatial scales between 15 and 80 km compared to the Free-run, the 4D-Var system across all forecast days, and the EnOI system at day 5. At 34° S, where eddy variability is high, the Free-run underrepresents the kinetic energy across all spatial scales at all depths. At the surface, the 4D-Var system across all forecast days and the EnOI system at day 5 represent the AVISO spectrum well, with the AVISO velocities again showing slightly lower energy at spatial scales between 15 and 80 km.
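A wavenumber kinetic energy spectrum along a single cross-shore section can be sketched in Python as below. The uniform grid spacing, the Hann taper, and the lack of normalisation are assumptions for illustration; the spectra presented here would additionally be averaged over many forecast fields.

```python
import numpy as np


def wavenumber_ke_spectrum(u, v, dx_km):
    """Kinetic energy spectrum along a uniformly spaced section.

    u, v  : velocity components along the section (1-D arrays of equal length)
    dx_km : grid spacing along the section in km (assumed uniform)
    Returns wavenumber (cycles per km) and unnormalised kinetic energy.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    taper = np.hanning(u.size)
    u_hat = np.fft.rfft((u - u.mean()) * taper)
    v_hat = np.fft.rfft((v - v.mean()) * taper)
    ke = 0.5 * (np.abs(u_hat) ** 2 + np.abs(v_hat) ** 2)
    wavenumber = np.fft.rfftfreq(u.size, d=dx_km)
    return wavenumber, ke
```

Length scales follow as the reciprocal of the wavenumber, so energy at scales of 15 to 80 km appears between roughly 0.0125 and 0.067 cycles per km.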

For the first day of the EnOI forecasts (representative of the analyses), there is elevated kinetic energy at finer length scales and this energy dissipates by day 5 of the forecast. This elevated energy is most pronounced at the surface and near-surface (upper 200 m, not shown). Specifically, elevated kinetic energy exists in the EnOI initial states at length scales less than 100 km at 28° S and between 20 and 80 km at 34° S. For the 4D-Var system the wavenumber kinetic energy spectra remain relatively unchanged over the forecast window, with the day 1 and day 5 wavenumber spectra tracking closely. Compared to the Free-run, both the 4D-Var and EnOI assimilation systems introduce more kinetic energy across all spatial scales throughout the water column in the eddy-dominated region (illustrated by the sections through 34° S in Fig.

We include the idealised spectral slopes of

Cross-shore wavenumber kinetic energy spectra for the models at the surface, 400 and 1000 m and for AVISO geostrophic velocities at the surface, and at 28 and 34° S. The length scales 200, 100, and 20 km are shown by the vertical dashed lines. The

We have shown that energy is elevated for shorter (less than 100 km) length scales in the EnOI analyses, and upon integration of the forecast model this energy dissipates to match the energy associated with the 4D-Var system. Wavenumber kinetic energy analysis of the atmosphere by

This study shows quantitatively that the smoother and more dynamically balanced fit between the observations and the model's time-evolving flow achieved by the 4D-Var system results in improved predictability against both assimilated and non-assimilated observations. The EnOI system does not produce as tight a fit to the SSH data as the 4D-Var system (although this may be related to tuneable parameters in the DA formulation); however, the SSH error grows at the same rate in the EnOI and 4D-Var forecasts (Fig.

Independent surface velocity observations as measured by the high-frequency radar array at 30° S are less well represented by the EnOI system compared to the 4D-Var system from day 1 through to day 5 of the forecasts (Fig.

The EnOI system displays greater discontinuities between the end of the forecast and the subsequent analysis, particularly for near-surface temperature (about the thermocline), and the discontinuities have greater magnitude in the downstream eddy-dominated region (Fig.

This study chose to compare two DA methods across a common modelling framework and observational network. The two methods were chosen as EnOI has been widely used by the Australian ocean forecasting community

The EnOI system is

Our future work aims to directly address the need to improve predictive skill in WBC regions. Time-independent schemes (e.g. 3D-Var and EnOI) are useful for intermittent cycling DA at synoptic scales and are capable of resolving slowly evolving flows governed by simple balance relationships. Time-dependent DA methods (e.g. 4D-Var and EnKF) are greatly beneficial for highly intermittent flows with irregularly sampled observations as the time-variable dynamics of the model are used to evolve the error covariances. Furthermore, these methods allow the misfit to the entirety of observations over a time interval to be minimised at once, rather than through discrete minimisations. The time-evolving state is required to truly exploit many novel observation types that are nonlinearly or indirectly related to the model state. Indeed, the two techniques that are the most promising in NWP and ocean DA are 4D-Var and EnKF

The ROMS model code is available from

The observations were sourced from the Integrated Marine Observing System (IMOS). IMOS is a national collaborative research infrastructure, supported by the Australian Government (

Argo data were collected and made freely available by the international Argo programme and the national programmes that contribute to it. (

We acknowledge AVISO for the delayed-time SLA data. The Ssalto/Duacs altimeter products were produced and distributed by the Copernicus Marine and Environment Monitoring Service (CMEMS) (

CGK developed the ROMS model configuration of the EAC system, processed the observations, and developed the 4D-Var DA configuration. CGK performed the 5 d forecasts, given the EnOI analyses. CGK analysed the results to produce Figs. 1–8 and 11–14. DG produced Figs. 9–10. CGK wrote the manuscript with some original input from AS. AS generated the results in Table 1. We acknowledge Pavel Sakov, who generated the EnOI analyses, given the ROMS model configuration and the processed observations from CGK. MR, SK, GB, and JMACS provided useful guidance and input into the scope of the project and interpretation of results.

The contact author has declared that none of the authors has any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

For this research, David Gwyther and Adil Siripatana were partially supported by the Australian Research Council Industry Linkage grant no. LP170100498 to Moninya Roughan, Colette Gabrielle Kerry, and Shane Keating. Prior model development was supported by the Australian Research Council grant nos. DP140102337 and LP160100162. CSIRO Marine and Atmospheric Research and Wealth from Oceans Flagship Program, Hobart, Tasmania, Australia, provided BRAN2020 output for boundary conditions.

This research has been supported by the Australian Research Council (grant nos. LP170100498, DP140102337 and LP160100162).

This paper was edited by Deepak Subramani and reviewed by two anonymous referees.