Data assimilation of in-situ and satellite remote sensing data to 3D hydrodynamic lake models

Abstract. Understanding the physical dynamics of lakes is crucial to provide scientifically credible information for ecosystem management. We show how the combination of in-situ data, remote sensing observations, and three-dimensional hydrodynamic numerical simulations can capture the various spatio-temporal scales involved in lake dynamics. This combination is achieved through data assimilation (DA) and uncertainty quantification. In this study, we present a flexible framework for DA into three-dimensional hydrodynamic lake models. Using an Ensemble Kalman Filter, our approach accounts for model and observational uncertainties. We demonstrate the framework by assimilating in-situ and satellite remote sensing temperature data into a three-dimensional hydrodynamic model of Lake Geneva. Results show that DA effectively improves model performance over a broad range of spatio-temporal scales and physical processes. Overall, temperature errors are reduced by 54 %. With a localization scheme, an ensemble size of 20 members is found to be sufficient to derive covariance matrices leading to satisfactory results. The entire framework has been developed for the constraints of operational systems and near real-time operations (e.g. integration into meteolakes.ch).


Introduction
The management of aquatic systems is a complex challenge involving many stakeholders pursuing sometimes contradictory objectives. This becomes even more complex in view of climate change, affecting both the heat budget and the hydrology in the watershed of lakes. There is thereby an urgent need to provide accurate information on lake hydrodynamics.
Traditionally, perhaps due to the misleading definition of lakes as lentic systems, hydrodynamic studies focused on the one-dimensional vertical structure of lakes using in-situ measurements with limited spatial and temporal coverage (Kiefer et al., 2015). Yet, the lentic definition of lakes is misleading at short time scales. Dynamical processes such as wind-induced upwellings, river intrusions and gyres strongly disrupt the spatial homogeneity and quiet nature of these systems.
Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2019-47. Manuscript under review for journal Geosci. Model Dev. Discussion started: 21 March 2019. © Author(s) 2019. CC BY 4.0 License.
The aim of this study is to provide a flexible framework, in a Bayesian inference setting, capable of updating and improving model states while taking into account the uncertainty of both the modelled system and observational data. Here, we present a novel DA experiment with the EnKF tailored to lakes and their observations, using an open-source hydrodynamic model and assimilation platform. This approach uses a new file-based coupling recently developed for OpenDA and Delft3D-FLOW with z-layer support (Baracchini et al., 2019). Delft3D-FLOW is an open-source multi-dimensional hydrodynamic simulation software with numerous successful applications in coastal, river, estuarine and lake domains. OpenDA is an open-source generic DA environment (El Serafy et al., 2007), used in various calibration and DA experiments (El Serafy et al., 2007; Weerts et al., 2010; Kurniawan et al., 2011), but not yet for DA into 3D hydrodynamic lake models. Our methodology is tested on a large French-Swiss lake (Lake Geneva) with in-situ temperature measurements and Lake Surface Water Temperature (LSWT) retrieved from satellite data (AVHRR). The choice of testing a first DA of surface temperature on Lake Geneva was motivated by recent studies concluding that data from space-borne medium-resolution radiometers specifically tailored to Lake Geneva (Oesch et al., 2005) could potentially be assimilated into numerical models (Oesch et al., 2008).
Furthermore, Baracchini et al. (2019) proposed a calibrated model and framework for Lake Geneva, this first step being an absolute requirement for DA. Here, LSWT and in-situ data are blended into this model to expand its capabilities for monitoring physical phenomena. The latter is achieved by considering the stochasticity of the system and using an EnKF algorithm to update model results. This procedure is expected to benefit both environmental research and operational monitoring and forecasting of mid-size and large lakes, with impacts on a broad diversity of societally important issues.
The study is structured as follows: Section 2, data and methods, describes the study site, model, tools and data used. This includes the measurement retrieval and processing chain as well as the quantification of their uncertainty. Although part of the methods, the data assimilation algorithm and its configuration are provided in a separate section (Section 3) due to their central role in the study. Noise generation, ensemble size, and the localization scheme are discussed there.
Sections 4 and 5 present and discuss the results, respectively. Finally, perspectives and conclusions are given in the final section.

Data and methods
In this section, we describe the various components used in the DA experiment, the challenges associated with high-frequency and high-resolution measurements, the modelling datasets, and the definition of their errors, which previously hampered the application of such systems.

Study site
Lake Geneva (locally known as Le Léman) is a perialpine lake located between Switzerland and France (46.458 °N, 6.528 °E) at an altitude of 372 m (Fig. 1). It is the largest freshwater lake in Western Europe (surface area and volume of 580 km² and 89 km³, respectively) with a retention time of 11.4 years. Due to relatively mild winter temperatures and its large depth of 309 m, complete deep convective mixing only occurs every 5 to 10 winters (Schwefel et al., 2016). The lake is composed of two parts: the large eastern basin (Grand Lac), with a maximum depth of 309 m, a mean depth of 160 m and a mean width of 10 km, in which gyres are frequently observed, and the Petit Lac, the narrow and shallow western basin (maximum depth of 70 m, mean width of 4.5 km). The centres of the two basins are some 30 km apart, which defines the cut-off distance of the EnKF (more details in Section 3). The surrounding topography is mountainous, mainly in the Southeast, hence affecting the wind circulation above the basin. Lake Geneva is mesotrophic, with strong variations in turbidity and light penetration depth over the year (ranging from 3.6 to 14 m).

Model setup
The primary purpose of a three-dimensional hydrodynamic model is to solve various time-dependent, non-linear differential equations related to hydrostatic free-surface flows on a computational grid. Various modelling suites have been developed to solve those equations accounting for momentum (Reynolds-averaged Navier-Stokes (RANS)) and fluid mass (continuity), as well as heat and mass transfer. The open-source Delft3D-FLOW software, developed by Deltares (the Netherlands), is used in this study. Initially designed for coastal regions and estuaries, it has been expanded to rivers and lakes. A detailed description of the equations and numerical schemes (conjugate gradient solver) can be found in the manual (Deltares, 2015).
We stress again that a fundamental prerequisite to any DA experiment is a calibrated model. Improper physical parameters could lead to strong discontinuities followed by waves (assimilation shocks), resulting in spurious behaviours (Anderson et al., 2000). Assimilated variables could then, for example, drift back to their pre-assimilation values. Lake Geneva's model has been extensively studied and calibrated (explicit optimization by residual minimization) in a previous study (Baracchini et al., 2019). This model consists of 100 unevenly distributed vertical layers, with thinner layers at the top (from 20 cm at the surface to several meters in the hypolimnion). Due to the steep bathymetry of Lake Geneva, we use the z-coordinate system (horizontal layers) to avoid strong numerical diffusion and excessive artificial mixing. A computational time step of 2 minutes is specified for the 450 m horizontal grid size to maintain model stability with the κ-ϵ turbulence closure model (Goudsmit et al., 2002), which accounts for unresolved mixing at sub-grid scales. As initial conditions, the model is initialized (horizontally uniform) from an in-situ temperature profile taken at the deepest location of the lake in January, when the lake is partially mixed. We consider a simulation period of one year, thereby covering the entire range of seasonal stratification dynamics.
The dynamics of a lake are mainly driven by interactions with the atmosphere and dissipation at the bed. As boundary forcing, we use MeteoSwiss COSMO-1 reanalysis products from their atmospheric model tailored to the Alpine region. They consist of various meteorological variables on a regular 1.1 km grid with hourly resolution. Seven of those variables are used in this study: solar radiation, wind direction and intensity, relative humidity, cloud cover, pressure, and air temperature.
Lake Geneva is subject to strong variations in turbidity which affect the stratification mainly in early summer. Monthly time-20 series of Secchi depth observations have therefore been used in the forcing.
Regarding computational requirements, a single deterministic one-year model run for Lake Geneva without DA requires up to 3 days of computing time on a single Xeon Broadwell core.

Assimilation platform
OpenDA is an open interface standard. It provides access to a set of open-source tools allowing the integration of arbitrary numerical models and observations through calibration and data assimilation algorithms. Its goal is to minimize algorithmic development by promoting the exchange of software solutions among researchers and users (Deltares, 2019, www.openda.org).
An OpenDA interface has recently been developed for the z-layer Delft3D-FLOW using the black-box wrapper (file-based) approach (Baracchini et al., 2019). This interface has been further expanded for DA in this study. Additions include extended modifications of the Delft3D-FLOW model-definition file, model forcing files (on an equidistant grid only) for OpenDA's noise models, and support for localization, which limits the area of influence of an observation. The entire source code is available on GitHub (https://github.com/OpenDA-Association/OpenDA).

Monitoring data
Role of data accuracy - Key to any DA problem is the observational data and its quality (Madsen, 2003). 3D models require an especially large amount of data to assess and validate their horizontal variability. Remote sensing observations are therefore considered together with vertical in-situ profiles to constrain the system over the surface and at depth. Errors enter the system through its initial conditions, physical processes, approximations, and forcing (Bárdossy and Singh, 2008). Observations of the true system also require quantified uncertainties, as measurements are always an imperfect and incomplete representation (Bertino et al., 2007). This is particularly important as it defines how reliable an observation is and therefore how the model states are corrected. Injecting data with an incorrect measurement error distribution into a good model could degrade its relevance to the point where assimilation estimates are worse than the non-assimilative solution or the observations. The opposite holds true as well: overestimated observation errors leave an unreliable model forecast essentially uncorrected.
Lake in-situ data - The dataset consists of 31 temperature profiles over the water column at two locations of Lake Geneva (GE3 and SHL2, Fig. 1) sampled during the year 2017. Profiles are collected at a monthly (GE3) to bi-monthly (SHL2) rhythm.
The uncertainty of the in-situ temperature profiles is defined as the maximum of the instrument precision (0.1 °C) and the temporal variability at the measurement location. The reasons for the latter are twofold: first, some in-situ profiles did not have their exact collection time recorded; second, this study does not focus on reproducing short-term dynamics such as basin-scale internal waves and thermocline oscillations. The standard deviation of preliminary modelling results is computed over a time window to account for this variability. The temporal variability window is defined by the period of basin-scale internal oscillations (48 hours). This procedure limits the physical discontinuities created by the EnKF updates from specific physical processes (i.e. internal waves) which are not the focus of this study.
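The uncertainty definition above can be sketched as follows. This is a minimal Python illustration, not the study's processing code: the function name, sampling interval, and the synthetic temperature series are assumptions.

```python
import numpy as np

def obs_temperature_std(model_series, instrument_precision=0.1,
                        dt_hours=1.0, window_hours=48.0):
    """Observation std as the maximum of the instrument precision and
    the temporal variability of preliminary model results over a window
    covering one basin-scale internal-oscillation period (~48 h)."""
    n = max(2, int(window_hours / dt_hours))
    variability = float(np.std(model_series[-n:], ddof=1))
    return max(instrument_precision, variability)

# Quiet period: variability below 0.1 degC, so instrument precision dominates
quiet = 10.0 + 0.01 * np.sin(np.linspace(0.0, 6.0, 48))
print(obs_temperature_std(quiet))

# Active period (e.g. internal waves): modelled variability dominates
active = 10.0 + 1.0 * np.sin(np.linspace(0.0, 6.0, 48))
print(obs_temperature_std(active))
```

In the quiet case the returned uncertainty equals the 0.1 °C instrument precision; in the active case it grows with the modelled 48 h variability.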
The Buchillon station (Fig. 1), consisting of a mast measuring various atmospheric and hydrodynamic properties in real time, has been used for the validation of the AVHRR data detailed below. Of relevance for this study is a thermistor located at 1 m water depth representing the bulk LSWT.
AVHRR LSWT - The space-borne Advanced Very High Resolution Radiometer (AVHRR) sensor has been selected for its high temporal (up to 10 overpasses per day) and moderate spatial (1 km) resolution. We consider it the right trade-off for lakes: between the high spatial but low temporal resolution of Landsat 8 (90 m every 2 weeks) and the low spatial but high temporal one of SEVIRI (3 km every 15 min). Access to the AVHRR data was facilitated by a direct downlink and processing chain from the University of Bern. We describe below, and in the appendix, how AVHRR can be used for DA in lakes.
The AVHRR LSWT retrieval process, with locally adapted split-window coefficients for Lake Geneva, is described in Lieberherr and Wunderle (2018) and Lieberherr et al. (2017). Only pixels with quality levels higher than 3 (Lieberherr and Wunderle, 2018) are considered in the following sections.
An extensive description of the data filtering is available in appendix A. Overall, out of the 3372 AVHRR images of Lake Geneva available for 2017, 124 satisfy the selection criteria (see appendix A). These data are relatively evenly spread from February to October, with a maximum frequency of 1 image per 24 hours. Very few images are available in January, November and December due to bad weather conditions or cloud cover. The average lake coverage of these images is 51 %.

Data assimilation
A short summary of popular DA approaches, their benefits and limitations, is provided in appendix B. We briefly mention the Extended Kalman Filter (EKF) and the Particle Filter, which are popular sequential algorithms for DA. The EKF is a variant of the Kalman Filter for non-linear dynamics and consists in a linearization of the model in the neighborhood of the current estimate of the state vector. For highly non-linear systems this can result in an improper estimation of the state vector or covariance matrices and can therefore lead to quick divergence and instability (Moradkhani et al., 2005; Nakamura et al., 2006).
The Particle Filter can cope with non-linearities and obtain a full representation of the posterior distribution, but its computational cost (i.e. the high number of particles required) has limited its use with three-dimensional hydrodynamic models (more details in appendix B). For its flexibility and affordable computational cost, we focus hereafter on the EnKF.

Ensemble Kalman Filter (EnKF)
The EnKF is an attractive alternative for non-linear dynamics and systems with high dimensionality. Reichle et al. (2002a) found that the EnKF is more robust than the EKF while being more flexible in obtaining system covariances, a core element of the DA problem (Bertino et al., 2007). Indeed, whereas careful estimation of covariances often requires considerable effort (De Lannoy et al., 2007b), in the EnKF they are derived dynamically from a small ensemble of model trajectories (and therefore take into account the physics of the model), which grasps the essential parts of the error structure (Reichle et al., 2002b). The EnKF only considers a sample of the state variable to represent the processes modelled. The covariance matrix becomes a sampled covariance matrix, and the predictive probability density functions of the state vectors are approximated by Monte Carlo simulations (Nakamura et al., 2006). Because the EnKF non-linearly propagates a finite ensemble of model trajectories instead of using a linearized equation for the error covariance, no computation of derivatives is required. The EnKF still considers a linear correction procedure and assumes Gaussian distributions of the random variables. When this is not the case, the filter still produces a variance-minimizing solution, though not the optimal estimate (Bertino et al., 2007).
We develop below the fundamentals behind the algorithm. We first define the true model state (corresponding to the actual physical state of the lake) of a system at time t as x_t (in our case temperature for the entire 3D model grid), ℳ a non-linear operator, η some process noise, and u a forcing vector (here the meteorological forcing) at time t. The state propagation equation yields:

x_{t+1} = ℳ(x_t, u_t) + η_t    (1)

In this study, the noise is added in the forcing term u. The noise term is then dropped in the notation of (2) for the true state x. The forecast state (the input information for DA at time t) is denoted x̂_t^f and the analysis state obtained after DA x̂_t^a. The model propagation equation now reads:

x̂_{t+1}^f = ℳ(x̂_t^a, u_t)    (2)

As we do not measure the true state of the system (x), the observation (y) equation is defined by the following, with H an operator relating the system state to the observations and ε some measurement noise:

y_t = H x_t + ε_t    (3)

with the observation prediction given by:

ŷ_t = H x̂_t^f    (4)

Note that in our case, we directly observe what we compute (i.e. surface temperature and profiles at computed grid points); therefore, in this study H is an identity matrix. The resulting data assimilation estimate of the state vector (x̂_t^a), which will be used in the next cycle as restart condition, is given by:

x̂_t^a = x̂_t^f + K_t (y_t − H x̂_t^f)    (5)

That last equation (5) is a central concept of DA: it introduces the weighting factor K, also referred to as the Kalman gain. The Kalman gain can be viewed as a balance of the model and observation uncertainties, together with the error correlations of all the elements of the state vector. It aims at minimizing the error covariance of the state estimate at the analysis time (eq. (5)). It is defined as:

K_t = P_t^f H^T (H P_t^f H^T + R_t)^{-1}    (6)

with R_t the measurement error covariance matrix (in this study we assume no cross-correlation between observation errors, hence R_t is diagonal and determined from the uncertainty of the measurements (Section 2.4)) and P^f the a priori state error covariance matrix. Error covariance is a key component of DA.
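The Kalman gain and analysis update can be illustrated with a toy two-variable state. This is a generic sketch of eqs. (5) and (6), not the study's implementation; all numbers are made up.

```python
import numpy as np

def kalman_gain(P_f, H, R):
    """K = P^f H^T (H P^f H^T + R)^-1, cf. eq. (6)."""
    S = H @ P_f @ H.T + R
    return P_f @ H.T @ np.linalg.inv(S)

def analysis_update(x_f, y, H, K):
    """x^a = x^f + K (y - H x^f), cf. eq. (5)."""
    return x_f + K @ (y - H @ x_f)

# Toy state: two grid temperatures, only the first one is observed.
x_f = np.array([10.0, 12.0])        # forecast state
P_f = np.array([[1.0, 0.5],
                [0.5, 1.0]])        # forecast error covariance
H = np.array([[1.0, 0.0]])          # observation operator (picks element 0)
R = np.array([[0.25]])              # observation error variance (0.5 degC std)
y = np.array([11.0])                # the observation

K = kalman_gain(P_f, H, R)
x_a = analysis_update(x_f, y, H, K)
print(x_a)  # [10.8, 12.4]
```

The observed point moves toward the observation, and the unobserved point is corrected as well through the covariance term of P^f, which is exactly how the filter spreads information to unobserved grid cells.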
The EnKF is able to compute a time-varying error covariance based on the dynamics of the system. This is a critical property when considering variables with short spatio-temporal decorrelation scales (Kuragano and Kamachi, 2000). In addition to the probability density function of the state (in the presence of process noise), covariance estimation is achieved by considering the ensemble members. For an ensemble of forecasts (j = 1, …, N), each subject to a disturbance (e.g. in model processes, forcing or initial conditions), P^f is obtained from:

P_t^f ≈ (1 / (N − 1)) Σ_{j=1}^{N} (x̂_{j,t}^f − x̄_t^f)(x̂_{j,t}^f − x̄_t^f)^T    (7)

with x̄_t^f the ensemble mean of the forecasts. From (7) we can conclude that the error spreading pattern across the domain is indeed derived from the ensemble members in a systematic way. This is not the case for some variational methods such as 3D-VAR, where the statistics are considered isotropic with little variation over time. In the EnKF, each ensemble member is then updated individually (based on (5)). The state average over the ensemble provides the a posteriori state estimate. Additionally, in contrast to the Extended (or traditional) Kalman Filter, there is no need to propagate the state covariance, nor to estimate the initial state covariance and model error covariance matrices. The EnKF only uses the first and second moments to construct the probability density functions; unlike the Particle Filter, it cannot capture higher-order statistics (Nakamura et al., 2006). The EnKF is widely used for large systems with uncertain initial states, and variants are still being developed to overcome its limitations (Hoel et al., 2016). Several authors (Bertino et al., 2007; Evensen, 1994; Verlaan and Heemink, 2001) found better performance for highly non-linear systems in comparison to the EKF. This approach can accommodate massive datasets and missing observations, and can incorporate correlated, non-linear error and measurement models. Moreover, the ensembles are easily parallelizable.
Models with high dimensionality are well suited for this type of assimilation, which requires a relatively low number of ensemble members to produce stable and accurate results (detailed in the results and discussion sections). We used this algorithm for the results presented in this study.
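An ensemble analysis step combining the sampled covariance of eq. (7) with a per-member update can be sketched as below. This is a minimal stochastic-EnKF illustration with perturbed observations, under assumed toy dimensions; it is not the OpenDA implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def ensemble_covariance(X):
    """Sampled forecast covariance from ensemble matrix X (n_state x N), eq. (7)."""
    N = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)   # anomalies around the ensemble mean
    return A @ A.T / (N - 1)

def enkf_step(X, y, H, R):
    """Update each ensemble member individually with perturbed observations."""
    P_f = ensemble_covariance(X)
    K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)
    Xa = X.copy()
    for j in range(X.shape[1]):
        y_j = y + rng.multivariate_normal(np.zeros(len(y)), R)  # perturbed obs
        Xa[:, j] = X[:, j] + K @ (y_j - H @ X[:, j])
    return Xa

# Toy setup: 3 state variables, 20 members, one observed variable
X = rng.normal(10.0, 1.0, size=(3, 20))
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[0.25]])
y = np.array([11.0])
Xa = enkf_step(X, y, H, R)
print(Xa.mean(axis=1))  # analysis mean, pulled toward the observation
```

The analysis mean of the observed variable moves toward the observation, while the amount of correction is set by the ratio of ensemble spread to observation error.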

System setup
The aim of this section is to detail the various properties of the EnKF and DA setup that are specific to this study.

Stochasticity and noise -
The performance of a DA experiment strongly depends on the characterization of uncertainties (van Velzen and Verlaan, 2007). The hydrodynamics are modelled with deterministic equations. Their initial conditions, in the case of Lake Geneva, only play a limited role in basin-scale dynamics over long periods (months, years). Yet the boundary conditions, especially the air-water heat and momentum budgets, still contain large uncertainties that decrease the performance of any theoretically perfectly calibrated model. To overcome this issue, we added stochasticity to the system through its forcing, more specifically the East (u-direction) and North (v-direction) components of the wind velocity. These variables, coming from MeteoSwiss COSMO-1 reanalysis products with DA, are known to be the most inaccurate and influential boundary forcing over lakes.
The addition of stochasticity to the deterministic model is done with OpenDA's noise model, which adds spatio-temporally correlated noise to the wind fields. This noise model, distributing the noise based on correlation scales derived from a distance-dependent function decaying to 0, requires three quantities (for both the u- and v-directions): (i) the wind standard deviation, (ii) the wind spatial correlation scale, and (iii) the temporal correlation scale. They are obtained from an analysis of the COSMO-E (ensemble) products over the entire year 2017. COSMO-E probabilistic products are derived from a 21-member ensemble forecast on a 2.2 km grid and contain information on the variability of the computed atmospheric variables. The wind standard deviation is hence obtained by taking the mean COSMO-E standard deviation of every pixel over the lake for the studied period.
The spatio-temporal correlation scales are obtained by computing the cross-correlations of six fictive stations around the lake, as shown in Fig. 1. The cross-correlation of a station with itself provides the temporal correlation scale, while the cross-correlation among stations allows determining the spatial correlation scale. Table 1 summarizes the aforementioned noise-model parameters.
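Generating such spatio-temporally correlated forcing noise can be sketched as follows. This is a generic construction (exponential spatial covariance with a Cholesky factor, AR(1) in time); OpenDA's actual noise model may use a different correlation function, and the station coordinates, standard deviation and scales below are illustrative, not the values of Table 1.

```python
import numpy as np

rng = np.random.default_rng(1)

def correlated_wind_noise(coords_km, sigma, L_km, tau_h, n_steps, dt_h=1.0):
    """Spatio-temporally correlated noise for one wind component.

    Spatial structure: exponential covariance decaying with distance
    (scale L_km); temporal structure: AR(1) with e-folding time tau_h."""
    # Pairwise distances between stations, then the spatial covariance
    d = np.linalg.norm(coords_km[:, None, :] - coords_km[None, :, :], axis=-1)
    C = sigma**2 * np.exp(-d / L_km)
    Lchol = np.linalg.cholesky(C + 1e-9 * np.eye(len(coords_km)))
    phi = np.exp(-dt_h / tau_h)               # AR(1) coefficient
    eta = np.zeros((n_steps, len(coords_km)))
    for t in range(1, n_steps):
        w = Lchol @ rng.standard_normal(len(coords_km))   # spatially correlated draw
        eta[t] = phi * eta[t - 1] + np.sqrt(1.0 - phi**2) * w
    return eta

# Six fictive stations around the lake (illustrative coordinates, km)
coords = np.array([[0, 0], [10, 2], [20, 0], [30, 3], [40, 1], [50, 0]], float)
noise = correlated_wind_noise(coords, sigma=1.5, L_km=20.0, tau_h=6.0, n_steps=500)
print(noise.shape)  # (500, 6)
```

The scaling by sqrt(1 − φ²) keeps the stationary variance of the noise equal to the prescribed wind standard deviation regardless of the time step.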

Ensembles -
The EnKF operates using a statistical sample of the state of the system. The ensemble size (N) is often determined heuristically and must balance a good representation of the state space with acceptable computation time. The errors in the solution pdf approach zero at the rate 1/sqrt(N) (Evensen, 2003). A preliminary study showed that a satisfying compromise is obtained with 20 ensemble members. The choice of a small ensemble size is further motivated by future use for operational purposes. More details and an ensemble size assessment are presented in the results section.
Localization scheme -It has been mentioned before (Section 3.1) that the covariance matrix links every domain point with 10 each other. Covariances are derived from the ensemble members. A limitation of a small ensemble size is possible spurious correlations (Evensen, 2009) resulting in artefacts over long distances from the observation location. In such cases, when model spatial extent is large (an observation usually only influences its near vicinity, it has limited influence for greater spatial extensions (Stanev et al., 2011)), a localization scheme has to be applied. Such scheme has therefore been implemented in OpenDA, which collaterally also aims at reducing computation cost of the analysis time. This localization allows to define a 15 cut-off distance, based on a Gaspari-Kohn function (an isotropic distance-based function decaying to 0 at a defined cut-off value), to limit the area of influence of an observation. This function ensures a smooth transition between a full and non-update for better model stability. Effectively, this removes long-range spurious correlations by scaling the size of the observation covariance matrix.
In this study, a cut-off distance of 15 km is defined. This distance is based on the spacing of the two in-situ stations and the radius of their associated basin gyres (Petit Lac and Grand Lac). It is further motivated by the fact that such a distance allows covering the entire interior of the basin by an update of in-situ data. Due to the significant depth of the lake, dynamics at deeper locations are less variable, hence their correlations at longer distances are easier to estimate. Regarding the LSWT, as it partly results from surface heat fluxes, its spatial structure is also expected to be correlated, to some extent, at relatively large spatial scales. Finally, as a result of the coarse vertical resolution of the in-situ profiles, we did not define a separate localization scheme in the vertical direction.
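The Gaspari-Cohn taper used for localization is a standard fifth-order piecewise polynomial (Gaspari and Cohn, 1999) that mimics a Gaussian but reaches exactly zero at twice its length-scale parameter. A sketch, evaluated here for the 15 km cut-off of this study:

```python
import numpy as np

def gaspari_cohn(d, c):
    """Gaspari-Cohn 5th-order piecewise polynomial.

    `d`: distances, `c`: half the cut-off distance; the weight is 1 at
    d = 0, decays smoothly, and is exactly 0 for d >= 2c."""
    r = np.abs(np.asarray(d, dtype=float)) / c
    w = np.zeros_like(r)
    m1 = r <= 1.0
    m2 = (r > 1.0) & (r < 2.0)
    w[m1] = (-0.25 * r[m1]**5 + 0.5 * r[m1]**4 + 0.625 * r[m1]**3
             - (5.0 / 3.0) * r[m1]**2 + 1.0)
    w[m2] = (r[m2]**5 / 12.0 - 0.5 * r[m2]**4 + 0.625 * r[m2]**3
             + (5.0 / 3.0) * r[m2]**2 - 5.0 * r[m2] + 4.0
             - (2.0 / 3.0) / r[m2])
    return w

# Cut-off at 15 km, as in this study, i.e. c = 7.5 km
dist_km = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
print(gaspari_cohn(dist_km, 7.5))
```

An observation thus receives full weight at its own location, a smoothly decreasing weight up to 15 km, and no influence beyond, which is what suppresses the long-range spurious correlations discussed above.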

Results
In this section, we present both quantitative and qualitative results of the DA experiment. As mentioned in Section 2.4, the DA run consisted of the assimilation of 128 AVHRR LSWT images and 31 in-situ profiles over the entire year 2017. The Mean Absolute Error (MAE), the Root Mean Square Error (RMSE) and a Taylor diagram (Taylor, 2001) are used as benchmark indicators. Direct model comparisons with satellite images and in-situ profiles are provided to visualize the benefits of the approach for both surface and deep-water dynamics. Implications of the DA for physical phenomena are presented. Table 2, providing MAEs and RMSEs before and after DA, indicates significant improvements over the baseline simulation.
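The two error indicators can be computed as below. This is a trivial sketch with made-up numbers, not the study's data; it only illustrates why RMSE reacts more strongly than MAE to occasional large data-model mismatches.

```python
import numpy as np

def mae(obs, sim):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(np.asarray(sim) - np.asarray(obs))))

def rmse(obs, sim):
    """Root Mean Square Error; squaring penalizes large mismatches more."""
    return float(np.sqrt(np.mean((np.asarray(sim) - np.asarray(obs))**2)))

obs = np.array([10.0, 12.0, 14.0, 16.0])   # hypothetical observations (degC)
ctrl = np.array([11.0, 14.0, 13.0, 18.0])  # hypothetical control run
da = np.array([10.2, 12.5, 13.8, 16.4])    # hypothetical DA run

print(mae(obs, ctrl), rmse(obs, ctrl))     # control-run errors
print(mae(obs, da), rmse(obs, da))         # reduced errors after DA
```

With one or two large mismatches in a series, the RMSE/MAE ratio grows, which is the pattern noted for the control run above.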
RMSE and MAE values are reduced by 54 and 60 %, respectively. The discrepancy between the two indicates some occasional large data-model mismatches, which affect the RMSE more heavily. The Taylor diagram (Fig. 2) displays large improvements in centered Root Mean Square Difference (RMSD), correlation and standard deviation.
The first example shows that DA could update the model in areas where no data was present (Fig. 3, first row). Model accuracy is thereby improved at basin scale rather than only at observation locations. This is particularly relevant as large lakes are often partly cloudy. The second example demonstrates the potential of DA to correct the state variable - a cold bias in the present case - while maintaining the coherent structure of the complex spatial thermal gradient (Fig. 3, second row). The third example shows gyre-like flow structures. Such rotating structures are difficult to observe from AVHRR LSWT data (third row of Fig. 3), partly due to its limited spatial resolution and their weak signature at the surface. However, a gyre created by an N/N-E wind on August 12th is better visible in the model results (clockwise in the western part of the main basin and counter-clockwise in the center). In that case, the DA updated the LSWT while keeping the physical structure and flow spatial coherence of the control run. Finally, the lowest panels in Fig. 3 show how DA improves the observation and future quantification of transient upwelling. While the upwelling in the Petit Lac was already partially caught by the control run, the DA allowed adjusting its intensity and extent. Another similar case is presented in appendix C.
The benefit from DA is also evident when looking at the temporal evolution of LSWT (Figs. 4 and 5), where the AVHRR LSWT is compared at two locations with the simulations with and without DA. The strong summer temporal variability, with bi-weekly temperature variations exceeding 5 °C, is not well resolved in the control run (Fig. 4).
It is, however, much better reproduced in the DA run. The warming phase also benefits significantly from the assimilation. The control run surface temperature started to increase in the second part of March, while the warming occurred in early March in the observations and the DA run. Both models are in good agreement during the cooling phase from August to the end of the year. While few observations were available during this late period, not much improvement is obtained over the baseline, which was already accurate. Overall, every point of the DA run is close to or at least within the ~1 °C uncertainty of the AVHRR observations (see Section 2.4 for more information on data uncertainty). A similar conclusion arises from Fig. 5, which provides a close-up on time-series of the summer period in the western basin (Petit Lac). The ensemble spread is smaller during the period of strongest stratification from late July to late August. Overall, the model uncertainty arising from the perturbed wind fields reaches 2 °C in summer, when it is highest, and 1 °C on average.

Major upwellings in June and July are caught by both model runs, although their intensity is too weak in the control run. Again, data-model discrepancies and temperature variability are largest from late May to early August. Starting in August, the models with and without DA exhibit similar dynamics, both close to the observations.

Figure 5: Zoomed time-series of LSWT in the centre of the western basin (Petit Lac). The red line corresponds to the mean of the ensemble, the red shaded area to the ensemble spread, the blue line to the control run, and the black dots to AVHRR observations.
We further compared the upwelling of September 15th with river temperature data against a model surface grid point located 3 km away (Fig. 6). The upwelling has indeed been observed in the lake outflow, dropping from 21 °C to 12 °C in a matter of 6 days. The figure shows that the control run underestimated the upwelling by 5 °C, while the DA run underestimated it by ~2.5 °C. The AVHRR observation is 1.5 °C warmer than the river temperature. Fig. 6 also shows that the model does not suffer from spurious behaviour after an assimilation: model shocks are not observed and numerical equilibrium is reached within a sub-daily time frame.

Figure 6: Close-up on the upwelling event of mid-September. River temperature from the lake outlet in Geneva is added as comparison. AVHRR data (black dots), control run (blue) and DA run (red) correspond to a surface pixel 3 km from the outflow.
Deep-water assimilation - We finally investigate how the vertical structure and sub-surface dynamics are affected by the DA.
Ensemble member size - Finally, we evaluated the EnKF ensemble size needed through a convergence analysis. A period of 1.5 months, from June to mid-July with a spin-up time of 2 weeks (without DA), was selected for the assessment. This period of weak spring thermal stratification was selected as it is the time of the year with the most complex and broadest range of dynamics (Fig. 4).
The results indicate that the analysis error decreases with increasing ensemble size (N). Table 3 provides RMSEs and MAEs for different ensemble sizes. Fig. 8 provides similar results while differentiating between assimilated data sources. We conclude that the major gains are achieved with 10 ensemble members. For in-situ data only, 20 members appear to be the sweet spot. Due to the much larger number of AVHRR observations (one image provides thousands of observations, since it covers the entire spatial extent of the computational grid), the red (AVHRR) and black (all measurements) lines nearly coincide. Finally, assessment of the ensemble spread showed that larger ensemble sizes yield few gains in second-order moments. Within the scope of this study, the additional benefits for N > 20 are thus limited. At this stage, the 1 °C uncertainty of the RS images might become a limiting factor hindering further improvements. With a view towards DA for operational lake forecasting systems, and the computational constraints associated with real-time hydrodynamics, we conclude that 20 members provide a satisfactory compromise for the system considered in this study.
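The intuition behind such a convergence analysis can be illustrated with a deliberately simplified sketch: a scalar EnKF analysis repeated for several ensemble sizes, where the sampling error in the estimated forecast variance (and hence in the Kalman gain) shrinks as N grows. This toy example is not the Lake Geneva setup; all values (truth, variances) are hypothetical.

```python
import numpy as np

def toy_analysis_error(n_members, truth, obs_var, rng):
    """One scalar EnKF analysis; returns the absolute error of the analysis mean."""
    ens = truth + rng.normal(0.0, 1.0, n_members)        # forecast ensemble
    y = truth + rng.normal(0.0, np.sqrt(obs_var))        # one noisy observation
    var_f = ens.var(ddof=1)                              # sample forecast variance
    gain = var_f / (var_f + obs_var)                     # scalar Kalman gain
    y_pert = y + rng.normal(0.0, np.sqrt(obs_var), n_members)  # perturbed observations
    analysis = ens + gain * (y_pert - ens)
    return abs(analysis.mean() - truth)

rng = np.random.default_rng(42)
for n in (5, 10, 20, 50):
    errors = [toy_analysis_error(n, 20.0, 1.0, rng) for _ in range(500)]
    print(f"N={n:2d}  mean abs error: {np.mean(errors):.3f}")
```

Averaged over many repetitions, the error curve flattens with increasing N, mirroring the diminishing returns observed here beyond ~20 members.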

Discussion
The DA framework has brought significant improvements to the simulated hydrodynamics of Lake Geneva. It demonstrated its effectiveness in improving various model-forecasted meso- to large-scale thermal features. The combination of in-situ measurements and remote sensing observations allowed constraining the 3D thermal structure of the model throughout the water column.
Surface time-series (Fig. 4) indicated that spring/early summer observations play a key role in improving the model performance during the warming period (Kourzeneva, 2014). This allows for an adequate modelling of the lake warming, with significant implications expected for water quality models and the typical spring phytoplankton blooms. Later in the year, in late spring and summer, the AVHRR data revealed highly variable temperature dynamics (e.g. upwellings) which are not reproduced by the control run. This is the time when the largest ensemble spread is observed, which indicates that the summer LSWT is sensitive to changes in wind patterns. Additionally, the ensemble spread stemming from spatio-temporally correlated noise applied to the wind fields indicates that the model is sensitive to changes in this forcing function. The effects on model outputs are correctly described, and the uncertainty arising from this perturbation averaged around 1 °C, with peak values of 2 °C. Similar conclusions can be drawn for sub-surface thermal dynamics. Fig. 7 indicates that data-model mismatches in the mixed layer appeared as the lake started to warm and the thermocline formed. Compared to the control run, the DA run exhibited a more accurate warming phase and vertical temperature gradient during the stratified period.
Overall, the performance of the EnKF has been notable across a broad range of scenarios. Fig. 3 showed that even with complex observational patterns, filter updates were performed with different amplitudes at each spatial location. Those spatially varying updates are often in agreement with the physical processes governing the hydrodynamics of the lake. Also, in the case of incomplete and sporadic data, the EnKF updates behaved well, and good combinations of data and system dynamics were found. Some authors (De Lannoy et al., 2007b) found that when the update is performed through covariance propagation (in the case of missing observations), the a posteriori state might not be correct and may counteract the updates in surrounding locations. This behaviour has not been observed in the presented hydrodynamics of Lake Geneva, which indicates that the covariance matrices were well estimated from the ensemble members and their physical dynamics. Estimating time-varying covariance errors for 3D models is a complex task in DA; the non-static covariance matrix derived from the EnKF allows longer-term studies, such as over an entire year, with complex changes in the thermal structure of the waterbody. Analysis updates were neither intense nor frequent enough to cause model shocks or solver failure. Such shocks would have a minimal impact on the surface layers, since corrections there would not be persistent, owing to the rather variable nature of surface layers and their sensitivity to atmospheric forcing. However, more issues would arise from model shocks in the deep water, which could trigger movements of large water volumes. Since in-situ profiles have a much lower uncertainty than AVHRR observations (<0.1 °C vs 1 °C, Section 2.4), intense state updates are more likely there; however, they have not been observed in this study and no model solver failure arose from the EnKF updates. After significant updates, the model generally recovered within a sub-daily time frame.
Increasing the observational frequency (here limited to one satellite observation per day) would increase the likelihood of encountering model shocks, as equilibrium may not be reached between updates. Higher computational cost also weighs in the data quality/quantity compromise, particularly when considering near real-time systems.
Physical processes - Fig. 3 showed that various physical processes, such as upwellings and gyres, are better resolved with the use of the EnKF. Phenomena such as upwellings typically occur more prominently at the beginning or end of the season, when stratification is weaker. The better identification of such processes is of prime importance for various water quality aspects (e.g. heat extraction (Gaudard et al., 2018), wastewater discharge, water intakes). Yet the magnitude of such events has rarely been quantified, due to difficulties with their large-scale identification. Through the combination of remote sensing observations and three-dimensional hydrodynamic modelling, we open new possibilities for the monitoring of such phenomena.
In this study we found that upwellings are better reproduced both in intensity and spatial extent. Comparing temperature measurements with a surface model grid point 3 km away from the outflow showed good agreement after DA. An underestimation of the upwelling by 2.5 °C after DA is observed (compared to 5 °C for the control run). Most of this remaining difference can be attributed to the satellite underestimating the event as well (by 1.5 °C, with an uncertainty of 1 °C) and to the distance and depth (surface) of the compared pixel. For Lake Geneva, this is of particular interest when an upwelling occurs in the western basin (Petit Lac), lowering the outflow temperature for millions of downstream residents. As for gyres, those structures are repeatedly observed in Lake Geneva (Bouffard et al., 2018; Kiefer et al., 2015). Because of the Coriolis force, subsequent strong up-/down-lifts of the thermocline occur, which structure the lateral dispersion of primary productivity (Soomets et al., 2019).
Ensemble size - Among the various ensemble sizes assessed for this study (Fig. 8), we found that relatively small ensembles (~20 members) are sufficient to derive suitable time-varying covariances and error-spreading patterns. This is particularly important in the presence of variables with short decorrelation time and spatial scales. Studies indicated that relatively small ensembles fail to accurately estimate the small correlation patterns of remote observations (Houtekamer and Mitchell, 2001).
The localization scheme implemented (Section 3.1), defining a cut-off radius around each observation, circumvents this limitation. Houtekamer and Mitchell (2001) found that the optimal cut-off value increases with ensemble size. Larger ensembles not only limit the underestimation of ensemble spread and accuracy, but also allow the use of more remote observations. For DA experiments with limited data, larger ensemble sizes may therefore be required to maximize observational coverage.
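A minimal sketch of a localized stochastic EnKF analysis step may help make this concrete. It uses a simple top-hat taper (covariances zeroed beyond the cut-off radius) rather than the exact scheme of Section 3.1, and all names and dimensions are illustrative assumptions.

```python
import numpy as np

def enkf_analysis_localized(E, y, obs_idx, obs_var, dist, cutoff, rng):
    """Stochastic EnKF analysis with distance-based covariance localization.
    E: (n_state, N) forecast ensemble; y: (n_obs,) observations;
    obs_idx: state indices of the observed locations;
    dist: (n_state, n_obs) distances between state points and observations."""
    n, N = E.shape
    A = E - E.mean(axis=1, keepdims=True)      # ensemble anomalies
    HA = A[obs_idx, :]                         # anomalies at observed points
    PH = A @ HA.T / (N - 1)                    # state/obs cross-covariance
    HPH = HA @ HA.T / (N - 1)                  # obs-space covariance
    R = obs_var * np.eye(len(y))               # observation error covariance
    rho = (dist <= cutoff).astype(float)       # top-hat localization taper
    K = (rho * PH) @ np.linalg.inv(HPH + R)    # localized Kalman gain
    # Perturbed observations, one realization per member
    Y = y[:, None] + rng.normal(0.0, np.sqrt(obs_var), (len(y), N))
    return E + K @ (Y - E[obs_idx, :])
```

Setting the taper to zero beyond the cut-off suppresses the spurious long-range correlations that small ensembles cannot estimate reliably, which is the behaviour described above.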
Limitations and perspectives - A main limitation of the EnKF is the Gaussian assumption, which, in the case of large data-model mismatches, could lead to artefacts and unrealistic a posteriori state values. This has not been observed in this analysis with the chosen noise definition and observational stochastic setup. Furthermore, while we did not systematically study the physics after each analysis step, we think the method can still be used for the study of physical processes, provided the user assesses the intensity of those physical discontinuities. Out of the 152 assimilations, only 8 created some numerical instabilities in the model, though these were small enough not to cause solver failure.
Other difficulties arise in the presence of bias, where Kalman Filtering performs suboptimal corrections (Dee and Da Silva, 1998), as both observations and model are assumed unbiased. Solutions for dealing with biases in the EnKF may become necessary (De Lannoy et al., 2007a). In the present approach, however, occasionally occurring model biases have been effectively handled by the update: the DA model did not drift back to its biased control-run state. We believe this is a result of the adequate initial parameterization of the model (Baracchini et al., 2019), which further highlights the crucial importance of accurate model calibration and formulation before proceeding with DA experiments. It is worth noting that the EnKF can also provide updates to parameters and forcing conditions, which in some cases may yield more persistent improvements (for example, when time-varying parameters are needed).
This DA experiment is computationally demanding. For example, it took nearly one month to compute the present setup on a machine with dual Intel Xeon E5-2697v4 processors and 256 GB of memory, generating close to a terabyte of data. While the analysis time for in-situ data has been reasonable (~1 hour), the immense number of observations generated by an AVHRR image (covering the entire surface layer of the computational grid) brought the analysis time up to 3 hours for a single image. This is largely due to the current lack of multi-core support for the analysis step. A multi-core local analysis has been implemented in the scope of this study for multi-variable assimilations (e.g. temperature with water levels and/or flow velocities), but further gains can be obtained from a local analysis based on the observation localization scheme.

Conclusion
For managerial and scientific purposes, new monitoring and forecasting tools covering wide ranges of spatio-temporal scales are of great interest. Such scale breadth for inland waters is achieved by combining three information sources: (i) in-situ measurements, (ii) remote sensing observations and (iii) model simulations. With data assimilation (DA), optimal combinations of these sources can be achieved and valorised.
For several decades, DA has been applied in oceanography and atmospheric sciences, yet its presence in limnology has remained limited. In this study, we propose a flexible framework and tools to blend real data into model simulations tailored to lakes.
We applied this method to Lake Geneva using large datasets: a three-dimensional hydrodynamic model, AVHRR lake surface water temperature and in-situ profiles over an entire year. Results demonstrated the effectiveness of DA, as significant gains were obtained for both the surface and deep-water dynamics over a well-calibrated baseline. We showed that both data types (in-situ and remote sensing) are important to constrain the entire spatial extent (horizontal and vertical) of the model. Results also indicate that AVHRR data is a valid RS data source for DA into lake hydrodynamic models, provided that observational errors and uncertainties are well defined.
In that regard, the use of an Ensemble Kalman Filter (EnKF) allowed handling non-static covariance estimation, a key element of any DA problem. Additionally, it is able to account for the uncertainties of each data source. Those are essential elements influencing DA performance (Qi et al., 2014). We found that the ensemble size played an important role in reducing model errors. To keep the number of members limited, a localization scheme has been implemented, hence circumventing the estimation of improper small correlations at large distances (Houtekamer and Mitchell, 2001). While the EnKF adds computational complexity to the problem, it is capable of dynamically estimating the stochastic model based on the physical properties of the system. This is well encompassed by the paradox defined by Bertino et al. (2007), stating that simple DA methods become complicated engineering tasks when the inconsistency between the stochastic and the physical model becomes relevant. Due to the flexibility of the tools developed and used, we expect this procedure can be transferred to other lakes and hydrodynamic models with relatively limited development.
To conclude, this method has been designed with the vision of future near real-time applications. The implications of DA in the operational context are significant: it can provide robust and timely short-term forecasts, accurate reanalysis products, and uncertainty estimates for reliable water management. Over the last decades, the number of available remote sensing products has grown rapidly; however, they have hardly been used optimally in the operational context (de Rosnay et al., 2013). The timely retrieval of RS products requires interdisciplinary efforts to ensure robustness and proper error definition of the data, which hinders the development of such operational systems (van Velzen and Verlaan, 2007). In this study we provided an example of how the entire chain, from the satellite to the assimilation into the model, can be performed with limited field infrastructure.
More concretely, we expect the findings of this study to be directly applicable to existing lake forecasting platforms, such as the one for Lake Geneva (http://meteolakes.ch). Impacts of such an implementation are expected at the scientific, governmental and public levels.

Code and data availability
Software -The source code and documentation of the numerical model (Delft3D-FLOW) and data assimilation platform (OpenDA) developed in and for this study can be accessed and downloaded on their online repositories at https://oss.deltares.nl/web/delft3d/source-code and https://github.com/OpenDA-Association/OpenDA.

Data -
The authors are grateful to the following institutions that provided the data used in this paper: the Federal Office of Meteorology and Climatology (MeteoSwiss) for meteorological data, the Département de l'environnement, des transports et de l'agriculture (DETA) du Canton de Genève for in-situ data on Lake Geneva at GE3, and the Federal Office for the Environment (FOEN) for river temperature data at the outlet of Lake Geneva. In-situ data at SHL2 as well as Secchi disk measurements in Lake Geneva were provided by the Commission Internationale pour la Protection des Eaux du Léman (CIPEL) and the Information System of the SOERE OLA (http://si-ola.inra.fr), INRA, Thonon-les-Bains. These data cannot be published as they belong to the aforementioned owners and are not the property of the authors of this study. They can nonetheless be requested from the respective institutions. All other data used in this study are property of the Physics of Aquatic Systems Laboratory at EPFL and can be obtained by contacting Prof. Alfred Johny Wüest (alfred.wueest@epfl.ch).
However, comparison with field data showed that this is still not reliable enough. The discrepancy is indeed strongly linked to day-night cycles, but those are also season-dependent. Therefore, no specific satellite overpass can be selected for the entire computational period (one year). To ascertain that images portray an accurate representation of the lake bulk LSWT, the thermistor at 1 m depth (recording at a 1-hour interval) has been used for a direct comparison with the space-borne AVHRR data. Considering that the Buchillon station is close (80 m) to the shore, its position has been shifted 2 km south to avoid land boundary contamination. Finally, the average of a 3x3 pixel window of the satellite image, centred on the south-shifted Buchillon station coordinates, is used as the comparison point with the field data. Only images whose absolute deviation with respect to the bulk water temperature is lower than 1 °C are retained for assimilation. This 1 °C threshold also defines the AVHRR observational uncertainty needed for the EnKF. Outlier pixel values colder than 4 °C or warmer than 28 °C are removed.
Finally, to avoid assimilating observations at too high a frequency (or too close in time), which can result in physical discontinuities and the destruction of model processes (the model not reaching equilibrium between assimilations), the frequency of satellite images is limited to a maximum of 1 per 24 hours. The screened images are then mapped to the computational grid.
This procedure aims at bypassing the skin-to-bulk temperature effect, while ensuring the best data quality for assimilation. It assumes horizontal uniformity over the lake area (i.e. atmospheric effects are assumed to be the same over the entire domain) and may be sensitive to local cloud patches.
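The screening steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, array layout and NaN masking of clouds/land are hypothetical, and only the per-image checks (outlier removal, 3x3 window comparison, 1 °C bulk-deviation threshold) are reproduced, not the temporal 24-hour filter or grid mapping.

```python
import numpy as np

def screen_avhrr_image(image, ref_bulk_temp, window_center,
                       max_bulk_dev=1.0, t_min=4.0, t_max=28.0):
    """Quality-screen one AVHRR LSWT image before assimilation (sketch).
    image: 2D LSWT array in °C (NaN for clouds/land);
    ref_bulk_temp: bulk temperature (°C) from the 1 m thermistor;
    window_center: (row, col) of the shifted comparison point."""
    img = image.astype(float).copy()
    # Remove outlier pixels outside the plausible temperature range
    img[(img < t_min) | (img > t_max)] = np.nan
    # Mean of the 3x3 pixel window centred on the comparison point
    r, c = window_center
    sat_temp = np.nanmean(img[r - 1:r + 2, c - 1:c + 2])
    # Retain the image only if it deviates by less than 1 °C from the bulk value
    if np.isnan(sat_temp) or abs(sat_temp - ref_bulk_temp) > max_bulk_dev:
        return None  # image rejected
    return img
```

Rejected images simply drop out of the assimilation stream; accepted ones carry the 1 °C threshold forward as their observational uncertainty.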

B. Assimilation approaches
The multiple methods proposed for DA mainly fall into two categories: (i) variational methods (e.g. 3D-VAR, 4D-VAR) and (ii) sequential methods (e.g. Kalman Filtering, Particle Filtering). For variational methods, the optimization of the model states (or parameters) is based on the minimization of a cost function. Variational methods are popular in meteorological forecasting (Rawlins et al., 2007). However, the computational burden associated with the collection and storage of data can be significant. Moreover, batch processing of data reduces flexibility and complicates the consideration of time-varying model parameters.
Sequential methods are robust techniques for DA in a broad range of applications. For linear dynamics and measurement processes with Gaussian error statistics, the Kalman Filter (Kalman, 1960) is the optimal sequential DA algorithm. However, most processes observed in nature, such as hydrodynamics, are non-linear. The analytical solution provided by the Kalman Filter can therefore not be used to compute the posterior distribution of the simulated variables. To overcome this limitation, variants exist, such as the Extended Kalman Filter (EKF), which consists of a linearization of the model in the neighbourhood of the current estimate of the state vector. This linearization can lead to complicated calculations for systems with high dimensionality, as the integration and propagation of the error covariance result in a significant computational demand (Gillijns et al., 2006). The linearization is done using a first-order Taylor expansion, which implies a closure at the second-order moments. For highly non-linear systems this can result in an improper estimation of the state vector or covariance matrices and can therefore lead to quick divergence and instability (Moradkhani et al., 2005; Nakamura et al., 2006).
In order to cope with non-linearities and obtain a full representation of the posterior distribution, other statistical methods, such as Particle Filters, have been developed (Carpenter et al., 1999). The Particle Filter follows a Darwinian-like process of survival of the fittest. It shares properties with the EnKF in the sense that the particles are the ensemble members.
Particle Filters do not require any assumption on the state variable distribution (e.g. Gaussian) and can also deal with non-linear observation models. The updates are applied to particle weights rather than to the state variables, which results in fewer numerical instabilities for process-based models (van Leeuwen, 2009; Liu et al., 2012; Moradkhani et al., 2005). A major drawback is particle depletion, which requires complex resampling algorithms. Moreover, the Particle Filter is less computationally efficient than the EnKF due to the need for a high number of particles (more particles than EnKF ensemble members are often needed, on the order of tens of thousands). Despite its advantages, the use of the Particle Filter as an assimilation method in oceanography and limnology is limited due to its computational cost. To address this issue, solutions are under development (Šukys and Kattwinkel, 2018). For its flexibility and affordable computational cost, we focus on the EnKF in the following.
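The weight update and depletion problem can be illustrated with a minimal bootstrap-style sketch for a scalar state. The function name, Gaussian likelihood and effective-sample-size resampling trigger are illustrative assumptions, not a specific published scheme.

```python
import numpy as np

def particle_filter_update(particles, weights, y, h, obs_var, rng):
    """One particle-filter analysis step: reweight each particle by the
    likelihood of the observation, then resample if the effective sample
    size collapses (particle depletion)."""
    innov = y - h(particles)                       # observation-space misfits
    lik = np.exp(-0.5 * innov**2 / obs_var)        # Gaussian likelihood
    w = weights * lik
    w /= w.sum()                                   # normalize weights
    n_eff = 1.0 / np.sum(w**2)                     # effective sample size
    if n_eff < 0.5 * len(w):                       # depletion: resample
        idx = rng.choice(len(w), size=len(w), p=w)
        particles, w = particles[idx], np.full(len(w), 1.0 / len(w))
    return particles, w
```

Unlike the EnKF analysis, the state values themselves are untouched unless resampling occurs; only the weights change, which is why process-based models tend to remain numerically stable under this update.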