Mitigation of Model Bias Influences on Wave Data Assimilation with Multiple Assimilation Systems Using WaveWatch III v5.16 and SWAN v41.20

High-quality wave prediction with a numerical wave model is of societal value. To initialize the wave model, wave data assimilation (WDA) is necessary to combine the model and observations. Due to imperfect numerical schemes and approximated physical processes, a wave 15 model is always biased in relation to the real world. In this study, two assimilation systems are first developed using two nearly independent wave models; then, “perfect” and “biased” assimilation frameworks based on the two assimilation systems are designed to reveal the uncertainties of WDA. A series of “biased” assimilation experiments is conducted to systematically examine the adverse impact of model bias on WDA. A statistical approach based on the results from multiple assimilation 20 systems is explored to carry out bias correction, by which the final wave analysis is significantly improved with the merits of individual assimilation systems. The framework with multiple assimilation systems provides an effective platform to improve wave analyses and predictions and help identify model deficits, thereby improving the model.


Introduction 25
Ocean waves, referring to the ocean surface gravity waves driven by wind, are important physical processes in the study of multiscale coupled systems. Many studies show that ocean waves are necessary for upper ocean mixing processes, whether in small-scale coastal simulations or large-scale global climate simulations (e.g., Babanin et al., 2009;Huang and Qiao, 2010;Qiao et al., 2004Qiao et al., , 2010. The existence of ocean waves can modify the structures of both atmospheric and marine boundary 30 layers by providing sea surface roughness, wave-induced bottom stress, breaking wave-induced mixing, and so on, which ultimately influence air-sea momentum and heat exchange. Therefore, ocean waves are an important component in atmosphere-ocean interaction flux processes (e.g., Chen et al., 2007;Doyle, 2002;Liu et al., 2011;Warner et al., 2010). In addition, the study of ocean waves can reduce and prevent marine disasters and provide guidance for development of the social economy (e.g., Folley 35 and Whittaker, 2009;Rusu, 2015;Wei et al., 2017). Thus, studying ocean waves is of great scientific and social significance.
At present, ocean wave observational techniques are constantly being improved (e.g., Daniel et al., 2011;Hisaki, 2005). Except for traditional buoy observations (e.g., Mitsuyasu et al., 1980;Rapizo et al., 2015;Walsh et al., 1989), satellites can provide much near real-time wave observational information, 40 which is beneficial for understanding the state of ocean waves (e.g., Gommenginger et al., 2003;Lzaguirre et al., 2011;Queffeulou P, 2004). However, observations always represent scattered samples in time and space in the real world and therefore do not represent the complete three-dimensional structure and temporal evolution of real world waves.
Numerical wave models are a powerful tool for studying the physical processes of ocean waves and 45 predicting future wave states. Following the development of the previous two generations, thirdgeneration wave models, such as WAve Modeling (WAM) (WAMDI Group, 1988), WaveWatch III (WW3) (Tolman, 1991), Simulating Waves Nearshore (SWAN) (Booij et al., 1999), and MArine Science and Numerical Modeling (MASNUM) (Yang et al., 2005), integrate the spectral action balance equation describing the two-dimensional ocean wave spectrum evolution without additional ad hoc 50 assumptions regarding the spectral shape, and these third-generation models are more robust for arbitrary wind fields than previous models. However, there are generally three error sources in wave models. One error source is from an incomplete understanding of the physical processes, approximate expressions of the numerical discretization schemes and so on, which causes systematic errors that are usually referred to as wave model bias. The second error source is due to inaccurate wind forcings of 55 wave models. The third error source is from the initial condition uncertainties, which can grow due to nonlinearity of the model equations during model forwarding. In this sense, the model simulated waves do not represent the real world either.

Three models
In the wave models, the variance spectrum or energy density E(σ, θ) is a quantity that represents the wave energy distribution in the radian frequency (σ) and propagation direction(θ). Without ambient ocean currents, the variance or energy of a wave package is conserved. However, if the current is involved, due to the work done by the current on the mean momentum transfer of waves (Longuet-95 Higgins andStewart, 1961, 1962), the energy of a spectral component is no longer conserved. In general, an action density spectrum defined as N=E/σ is considered within the models. Then, the governing equation of the wave model can be written as follows: (1) The left-hand side of this equation is the kinematic process during wave propagation. The second term 100 describes the wave energy propagation in two-dimensional geographical space denoted by , →. The / 0 → is the group velocity that follows the dispersion relation. The third term represents the effect of shifting the radian frequency due to variation in depth. The fourth term represents the depth-induced refraction. 4 and 6 are wave velocities in the frequency and direction , respectively. On the right-hand side, )@) is the nonconservative source/sink term representing all physical processes that generate, dissipate, 105 or redistribute wave energy. Typically, there are three important physical processes that contribute to )@) , which include the atmosphere-wave interaction, nonlinear wave-wave interaction, and waveocean interaction. In a shallow-water case, additional processes must be considered, such as wavebottom interaction, depth-induced breaking, and triad wave-wave interaction.
In this study, we use three advanced third-generation spectrum models, WAve Modeling (WAM, Cycle 110 4.5.4) (WAMDI Group, 1988), WaveWatch III (WW3, version 5.16) (Tolman, 1991), and Simulating WAves Nearshore (SWAN, version 41.20) (Booij et al., 1999), which have improved a lot from their predecessors, such as numerical and physical approaches. These wave models have the same form of governing equation, however their numerical implement processes are different due to different consideration. For example, SWAN is more focused on wave propagation processes in shallow water, 115 while WAM and WW3 pay more attention in deep water. For more comparisons, please see section 2.2.

Model configurations
Three wave models use two-dimensional spectral space containing 29 frequencies that cover from .035 Hz to .555 Hz with a logarithmic distribution and 24 equidistant directions. The geographic space is 120 from 180° W to 180° E in the zonal direction and 75° S to 75° N in the meridional direction with a 1°×1° grid resolution. The topography in this study is taken from the high-resolution ETOPO1 dataset provided by NOAA (website: https://www.ngdc.noaa.gov/mgg/global/). The wind forcing has two sources, both of which have 6-hour time intervals. The first dataset is the ERA-Interim reanalysis from European Centre for Medium-Range Weather Forecasts (ECMWF), with a resolution of .75°×.75° 125 (http://apps.ecmwf.int/datasets/data/interim-full-daily/). The second dataset is the CFSRv2 dataset from NCEP, with a resolution of .205° (longitude) ×.204° (latitude) (https://rda.ucar.edu/datasets/ds094.1/).
The time-step of all three models is 15 minutes. All relevant parameters above are set to be identical for every wave model (WW3, SWAN, and WAM) in this study.

Data 130
The AVISO (Archiving, Validation and Interpolation of Satellite Oceanographic) data (https://www.aviso.altimetry.fr/en/data/products/) are the satellite observational products used in this study. For ocean waves, the AVISO has two satellite altimetry products: along-track data and gridded data. The along-track data are used as the observational data source for the wave data in the simulation (sampled from "truth" in the "twin" experiments). The gridded data are used to validate the wave 135 simulation and assimilation (1°×1° resolution with 1-day time intervals). During wave simulation, the significant wave height (SWH) is used as a basic observational variable for data assimilation, which is provided from three ongoing satellites: Jason-2, Jason-3, and Satellite for Argos and ALtiKa (SARAL).

Figure 1
shows one-cycle ground orbit by taking Jason-2 and SARAL as examples. Jason-3 is the successor of Jason-2, and both satellites share the same orbit. 140

Different modeling strategies in WW3 and SWAN
Since the observations are only a sample of real world information, the model bias (i.e., systematic difference between a numerical wave model and the real world) is not well defined against the real world. In this study, we use the systematic difference between the WW3 and SWAN models to simulate the model bias and study the influences on wave data assimilation (WDA). 145 First, let us distinguish the difference in physical and numerical aspects to comprehend the causes of "bias" between these two models. In general, WW3 addresses global scales, and SWAN is more applicable in shallow water. Although the two models have most of the same physical processes, such as the wind input and nonlinear wave-wave interactions, each can provide multiple parameterization schemes to choose. For example, the nonlinear wave-wave interactions in SWAN include the Discrete 150 Interaction Approximation (DIA) (Hasselmann et al., 1985) and the Webb-Resio-Tracy (Resio and Perrie, 1991;Van Vledder, 2006;Webb, 1978), while there are more choices in WW3, such as the Generalized Multiple DIA (Toman, 2004(Toman, , 2013, the Two-Scale Approximation and Full Boltzmann Integral (Perrie et al., 2013;Perrie and Resio, 2009;Resio et al., 2011;Resio and Perrie, 2008), as well as the Nonlinear Filter scheme (Tolman, 2011). In order to reduce more uncertainties, same scheme is 155 chosen for the same physical process. In numerical aspects, there exist different implementation strategies such as the differencing method, which also contributes to bias.

Two data assimilation systems using WW3 and SWAN
To explore the model bias influences on WDA, we develop two data assimilation systems based on WW3 and SWAN in this study. 160 Generally, based on the program structure of wave models, we insert the assimilation module between calculations of the two-dimensional wave spectrum and outputs of wave parameters so that at the assimilation time, we call on the assimilation module to update the spectrum and SWH. When building the data assimilation systems, we need to consider the different structures of parallelism method, data storage, and information exchange in WW3 and SWAN models as noted in section 2.2. 165 To clearly demonstrate the influences of model bias on WDA and minimize its adverse impact, the analysis scheme in both assimilation systems is optimal interpolation (OI), which also is low cost and easy to operate. We implement the OI analysis in three to four steps. The first step uses two Gaussian convolutions of the background and observed SWHs to compute the observational increment of SWH at the observational location. The second step projects the SWH observational increment onto the 170 model grids centered at the observational location but within an impact radius using linear regression.
The third step transforms the analyzed SWH to update the spectrum of model waves. The fourth step corrects wind forcing using the observational SWH.
Step 1: Computing observational increment by convolution of two Gaussians increment at the observational location k, ∆ C D (H represents SWH), is computed by the convolution of two Gaussians of the model background and observation, which can usually be obtained from model ensemble members and observational samples. ∆ C D is formulated as follows (Zhang et al, 2007): Here, the first and second terms on the right-hand side of the equation adjust the ensemble mean and 180 ensemble spread, respectively, and ∆ C P represents the prior model spread. Superscripts O and M denote the observation and prior quantity estimated by the model, respectively. is the corresponding error standard deviation of SWH and varies in time, space and differs in every wave model. The overbar denotes the ensemble mean. In this simplified case, we specify P = 0.6 , D = 0.25 , assuming that is constant in all conditions similarly with the previous study (e.g., Qi and Fan, 2013), 185 and use single model and observational values as the ensemble mean.
Step 2: Regressing the observational increment onto model grids The second step projects the observational increment ∆ C D onto the related model grids using background error covariance, which is a key step in the analysis. To simplify the problem and improve the computational efficiency, many studies use a flow-independent distance function to sample the 190 background error covariance for computing the analysis increment at the model grid i, ∆ W X , as ∆ W X = ( W Y ) Z exp (−(^_ ,L )) × ∆ C D . Usually, such an expression is only a symmetrical approximation of the correlation function and cannot represent the spatial structure and propagation characteristics of waves.
Here, with the assumption that wave covariance has same structure as wave error covariance, we modify the covariance formula to increase its representativeness for wave structure by superimposing a 195 statistical correlation coefficient obtained from the outputted SWH of model simulation into the formula. After analysis, the equation becomes ∆ W X = 0, W,C > where L is the characteristic length and W,C is the distance between the model grid i and observational 200 point k. When W,C is larger than the impact radius R, there is no observational impact on this model point from observation k. All variables with superscript s represent the model statistics from free model control results. For example, W,C Y is the SWH covariance between the model grid i and observation k, which is evaluated from the model data time series in corresponding experiments. To ensure the local characteristics of ocean waves, in this study, the characteristics length L and impact radius R (or the 205 largest W,C ) are the same, causing this incremental projection to reach to the e-folding scale. Referring to previous studies (e.g., Lionello and Gunther, 1992;Qi and Fan, 2013), we tested different values of L and R as 300 km, 800 km, and 1000 km and found no essential improvement with larger L and R values. Trading-off with computational efficiency, we set L and R as 300 km throughout this study. As shown in Fig. 2, the new covariance represents more wave physics, i.e., the correlation has more 210 asymmetrical and wave-dependent characteristics.
Step 3: Transforming the SWH to wave spectrum The assimilation SWH W X is a sum of the prior W P and the analysis increment from step 2 ( W X = W P + ∆ W X ). In the wave model, the form of ocean waves is a two-dimensional wave spectrum that is distributed over frequency and phase. Thus, transforming the assimilation SWH to wave spectrum is 215 necessary to update other wave parameters. Following the previous study (Qi and Fan, 2013), we assume that the change in wave spectrum is proportional to the energy change that is expressed by the square of SWH. Then, the analyzed spectrum W X ( , ) can be written as follows: where f is the wave frequency and is the phase direction. 220 Step 4: Correcting wind forcing using SWH data If the assimilation only adjusts the wave spectrum as described in Step 3, the updated spectral structure may be quickly overwritten by erroneous wind. In this step, we describe a simple scheme using the observed SWH data to correct the wind forcing. Starting from a first guess of wind (the ERA-Interim reanalysis, for instance), the analyzed wind W,p X at model grid (i, j) can be written as follows: 225 where W represents either the u or v component of wind. ∆ W,p is the corrected wind increment transformed from the updated SWH. While the details of the transformation scheme can be found in Lionello et al. (1992Lionello et al. ( , 1995, we comment on certain aspects relevant to our study. Regardless of boundaries, in general, the energy of ocean waves is determined by the wind speed and duration, which 230 can also be expressed by SWH. In that sense, a function equation can be built, in which the left-hand side is an expression of wind speed and duration, while the right-hand side is an expression of SWH, and they are balanced through wave energy. Then, the analyzed wind speed can be resolved under the assumption that the duration is same in the both prior and analyzed fields. With respect to the configuration of wave model data assimilation, the model time step is 15 minutes 235 and the assimilation interval is 1 hour. At the assimilation time, we assimilate the along-track observations within a 1-hour time window centered at the time. After 10 days, all the observations will cover the global area. The wind data from the reanalysis products (ERA-Interim and NCEP-CFSR in this case) are available every 6 hours. To incorporate the wind correction into the wind forcing of the model, we distribute the wind correction to the adjacent two time-levels of wind data. As the process is 240 looped forward as the wave model state is updated, the wind forcing is adjusted through the SWH assimilation.

Experimental design
Throughout this study, we use the symbol MAO(s) WF as the name for the assimilation experiment. Here, "MA" stands for the "assimilation model" and the subscript "O(s)" (resp. superscript "WF") represents 245 the observing system (resp. wind forcing) in the assimilation. The wind forcing is either the ECMWF ERA-Interim (hereafter known as ERAI) or NCEP-CFSR wind (hereafter known as CFSR). Wind forcing can also be corrected by observations of SWH (under this circumstance, the superscript "WF" is replaced by "ASSW"). The observations used in the assimilation could be the model data but are projected on the along-track points of satellite(s) if being used for the twin experiments. Under this 250 circumstance, "O" represents "model that produces observations" and "(s)" represents the used satellite tracks (J2-Jason-2, J3-Jason-3, and SA-SARAL, for instance). Otherwise, in the real-data assimilation experiments, the subscript "O(s)" directly lists the satellites that measure the SWH. For example, a symbol named WW3SWAN(J2) ERAI means that the assimilation model is WW3 (here, "MA"="WW3") forced by ERA-Interim wind ("WF"="ERAI"), and the model producing observation is SWAN with 255 Jason-2 satellite track ("O"="SWAN", and "(s)"="(J2)").

Twin experiments
Twin experiments refer to a type of Observing System Simulation Experiment (OSSE), in which a model simulation is used to define the "true" solution of a data assimilation problem, and the other model simulation is used to start the assimilation. The "observations" are samples of the "truth" with 260 some white noise to simulate the observational errors. When the "truth" and assimilation are conducted by different (resp. identical) models, the framework is a "biased" (resp. "perfect") model twin experiment. Within a twin experiment framework, any aspect of assimilation skills can be measured as the degree to which the "truth" is recovered through the assimilation.

a) Perfect twin experiment 265
In a perfect twin experiment, we assume that the assimilation model and the observation are unbiased, i.e., both the instrument measuring and numerical modelling processes are sampling the same stochastic dynamical system. Such sampling only has random sampling errors without any systematic difference (bias). We can build this perfect model framework by using the same model to produce the "truth" as the assimilation model but with different initial conditions and wind forcings. 270 The "observations" from the observational time window (1 hour) centered at the assimilating time can be created by sampling the "truth" SWH with the tracks of the Jason-2, Jason-3 and SARAL satellites, which will cover the global area in 10 days. In this circumstance, if WW3 (resp. SWAN) is used as the assimilation model, the "truth" is produced by the same WW3 (resp. SWAN) model. In the assimilation, we may start the model with different initial conditions and/or wind forcings to examine 275 the influences of initial errors and wind forcing errors on the wave assimilation. Such a perfect twin experiment can be named WW3WW3(s) WF or SWANSWAN(s) WF .

b) Biased twin experiment
To study the impact of model errors on wave assimilation, we use two models to design a "biased" twin experiment. Again, due to the scattering nature of the observations, it is difficult to obtain a complete 280 picture of the model bias against the real world. Given the difference between the WW3 and SWAN models described in section 2.2, we use these two models and their assimilation systems here to simulate the model bias and examine its influences on the WDA. We use the ERA-Interim reanalysis wind to force the WW3 (resp. SWAN) to produce the "truth" and "observations" but use the SWAN (resp. WW3) assimilation system to assimilate the "observations." The degree to which the "truth" 285 produced by different model-based assimilation systems is recovered by assimilating the "observations" is an assessment of the model bias influences on the WDA. Such a "biased" twin experiment can be Under the biased twin experiment framework, we also conduct experiments to examine the impacts of

Real-data assimilation experiments 295
In this study, we also conduct real-data assimilation experiments using WW3 and SWAN assimilation systems with real track data from the Jason-2, Jason-3 and SARAL satellites. Through real-data assimilation experiments with different model-based assimilation systems, we can 1) increase our understanding of the influences of model errors on the WDA and 2) study the method to reduce the model error influences on the assimilation results. The real-data assimilation experiments can be 300 directly named, e.g., WW3J2+J3+SA WF or SWANJ2+J3+SA WF .

Influences of initial and wind forcing errors
Usually, wave numerical simulation can be improved by three methods: 1) reducing the errors in the initial conditions, 2) enhancing the accuracy of the wind forcing, and 3) improving the representation 305 of the wave model and its parameterization.
In this section, we use perfect model twin experiments (as described in section 2.4.1) to exclude model errors and explore the impact of wind forcings and initial conditions on the wave simulations. To compare the performances of the WW3 and SWAN models, we conduct separate experiments with these two models. The "truth" and model control runs are two basic experiments of the perfect twin 310 experiment framework. We use the ERA-Interim wind to drive WW3 (resp. SWAN) and generate a long time series of model states as the "truth," which is called WW3 ERAI (resp. SWAN ERAI ) for the WW3 (resp. SWAN) perfect model twin experiment. The "observations" are created by interpolating the corresponding "truth" SWH onto the along-track points of satellite orbits. Then, we use the NCEP-CFSR wind to force WW3 (resp. SWAN), called the model control WW3 CFSR (resp. SWAN CFSR ), and the 315 data assimilation is named WW3WW3(s) CFSR (resp. SWANSWAN(s) CFSR ). Starting from an independent initial condition produced by the model control, we can conduct the assimilation with the ERA-Interim or NCEP-CFSR wind forcing. The error verification of the assimilation results against the "truth" simulation compared to the error of the model control is an evaluation of the initial error or/and wind forcing error influences on the WDA. All perfect model twin experiments are listed in Table 1. 12 First, we conduct two sets of model control experiments WW3 CFSR and SWAN CFSR for 80 days (from December 2017 to February 2018). To explore the effect of the initial conditions, we perform the model spin-up for a long time to adequately reach a steady state. Then using the 45 th -day model states as the initial conditions, we conduct one more model simulation and data assimilation experiments for each model system as WW3 ERAI and WW3WW3(J2) ERAI as well as SWAN ERAI and SWANSWAN(J2) ERAI . The root 325 mean square errors (RMSEs) of these experiments against the "truth" are shown in Fig. 3 as the red (for WW3 CFSR +WW3 ERAI and SWAN CFSR +SWAN ERAI ) and pink (for WW3 CFSR +WW3WW3(J2) ERAI and The SWH RMSE is approximately .34 meters in the WW3 or SWAN model control with the NCEP-CFSR wind. Once the wind forcing is changed to the "perfect" wind (the ERA-Interim) on the 45 th day, 330 the RMSE quickly drops and is close to zero after approximately 10 days, and the SWAN model takes longer to accomplish this change than the WW3. If data assimilation is added, the RMSE reduces much faster than the model controls (roughly half of the time scale of the correct wind forcing). From the analyses above, we learned, 1) in wave models, the wind forcing plays an important role and an incorrect wind forcing could be a significant error source of WDA; and 2) the WDA can rapidly reduce 335 the initial error and improve the predictability of a wave model even when it is forced by an inaccurate wind forcing.

Impact of the observational system
In this section, using the same model states (at the 45 th day) in the corresponding model control as in respectively. However, when more satellite observations are assimilated into the model, the magnitude of improvement becomes small (further reduced by 8 % in WW3 and 9 % in SWAN when Jason-3 is added as well as only a 6 % in WW3 and 3 % in SWAN further reduction when SARAL is further added). These results suggest that given wind forcing errors, increasing observational information can help to improve the model behavior, but the improvement is limited. 355

Adverse impact of model bias
As described in section 2.2, the WW3 and SWAN models discretize the wave action governing equation with different physical processes, parameterization schemes and differencing schemes. These differences result in each wave model having its own distinguished characteristics. To study the adverse impact of the model bias on the wave assimilation, the biased twin experiments described in 360 section 2.4.1 are used in this section, where the "truth" model and the assimilation model are different between WW3 and SWAN. For example, the WW3SWAN(J2) ERAI (resp. SWANWW3(J2) ERAI ) experiment uses WW3 (resp. SWAN) as the assimilation model to assimilate the Jason-2 track point "observations," but the observed values are produced by SWAN (resp. WW3), and all models are forced by the ERA-Interim wind. All related experiments for the biased model framework are described in detail in Table  365 2.  Fig. 4) can significantly reduce the SWH simulation error (by 24 % and 22 %, respectively) and enhance the correlations (by 3 % and 4 %, respectively) with the "truth" (SWAN ERAI and WW3 ERAI , respectively). When the "observations" of Jason-3 and 380 SARAL are added to the assimilation (i.e., WW3SWAN(J2+J3) ERAI and WW3SWAN(J2+J3+SA) ERAI as well as SWANWW3(J2+J3) ERAI and SWANWW3(J2+J3+SA) ERAI ) (see the red and blue lines, respectively), the model SWH error (resp. correlation) is further reduced (resp. enhanced), but the amplitude of reduction (resp. enhancement) gradually diminishes (10 % and 5 % for further error reduction and 1 % and 0.8 % for further correlation enhancement in the WW3 assimilation; 10 % and 7 % for further error reduction and 385 1.7 % and 0.7 % for further correlation enhancement in SWAN assimilation from the additions of Jason-3 and SARAL, respectively).

WW3SWAN(J2) ERAI and SWANWW3(J2) ERAI (pink lines in
The results of two other sets of assimilation experiments called WW3SWAN(J2+J3+SA) ASSW and SWANWW3(J2+J3+SA) ASSW are also plotted by dotted green lines in Fig. 4. The superscript "ASSW" stands for the assimilation-corrected wind, meaning that the wind forcing of the assimilation model is also 390 "corrected" by the "observed" SWH data, as described in Step 4 of section 2.3. We found that in both assimilation systems, using the "observed" SWH data to "correct" the wind can compensate for the model errors to some degree and further reduce the assimilation errors, but the improvement is very limited. The weak improvement could be attributed to the simple wind correction method with total wave height. In the future, a more powerful correction method with windsea wave height may have 395 better performance.
These assimilation results clearly show that even though the wind forcing is perfect, once a biased assimilation model is used, the wave simulation has large errors. Although WDA can greatly reduce the simulation error by assimilating the observational information into the model, due to the existence of the model bias, the error remains at some significant level and cannot be eliminated entirely even by 400 increasing the observational constraints through an improvement in the observational system and constraint of the wind forcing.

Comparison of the influence of wind forcing with model bias
Comparing the time series of SWH RMSE caused by two important error sources, model bias (0.58 m shown with black lines of panel a/c in Fig. 4) plays a stronger role than wind forcing (0.34 m shown 405 with black lines in Fig. 3). When three satellite observations (Jason-2, Jason-3 and SARAL) are assimilated into twin experiments, the RMSE of SWH reduces 38% in perfect model twin experiments (cyan lines in Fig. 3) and 40% in biased model twin experiments (blue lines of panel a/c in Fig. 4). It is obvious that the error caused by model bias is bigger than by wind forcing and their improvements are almost similar after data assimilation. 410 experiments, here we only show the results using WW3 model as simulation/assimilation model). In model control run, it makes sense that the error caused by wind forcing (panel a (WW3 CFSR ). Truth is WW3 ERAI ) has distributed in the areas where the wind is strong, such as the area of Antarctic Circumpolar Current (ACC) and the north Pacific and Atlantic Ocean, while the error caused by model 415 bias (panel c (WW3 ERAI ). Truth is SWAN ERAI ) is distributed almost globally. After assimilating with three satellite observations, the error spatial distribution of both has improved a lot respectively (panel b (WW3WW3(J2+J3+SA) CFSR ), and panel d (WW3SWAN(J2+J3+SA) ERAI )). Referring to wind distribution, we can divide the global areas roughly into three parts: northern westerly zone (65° N-30° N), trade-wind zone (30° N-30° S), and circumpolar westerly zone (30° S-65° S). Therefore the error caused by wind 420 forcing (resp. model bias), the decreasing percentages of SWH RMSE in these three areas are 30% (resp. 27%), 45% (resp. 50%), and 46% (resp. 48%), respectively. We can find that the improvement of both error sources has similar performance in three these areas, weak in the northern westerly zone and almost same strong in the trade-wind zone and circumpolar westerly zone. About the reason why there is a lower improvement in the north Pacific and Atlantic Ocean should be explored in the future. To 425 sum up, the error caused by model bias has larger than wind forcing in the global area generally, especially in the equatorial ocean. However, in the north of Atlantic Ocean, wind forcing has a stronger impact. After data assimilation, both have improved greatly and have similar spatial pattern (i.e. the bigger error is still distributed in high latitude). However, their error gap is still existed.
It's worth to mention that there is a similar performance about the effect of observation system on 430 improving both error sources, whether in time series or spatial distribution. The more observation information absorbed into assimilation systems, the better error improvement in both twin experiments.
However, due to the existence of model bias, this improvement has a limitation and stays at some certain level. If more powerful observation is absorbed (such as: wave direction, wave period, and twodimensional wave spectra), the limitation maybe stay at a lower level. Next, with the results of real-435 data assimilation where both the model and wind forcing have errors, we will analyze and discuss how to mitigate the model bias influences on the WDA.

Bias characteristics of WW3 and SWAN data assimilations
From the above analyses of twin experiment results, we learned that the model bias has a strong 440 adverse impact on WDA. To explore the method of mitigating the model bias influence on the WDA, we conduct the real-data assimilation experiments (same time range as the "twin" experiments) described in section 2.4.2 using the WW3 and SWAN assimilation systems to assimilate the track data of Jason-2 and Jason-2+Jason-3+SARAL. To ensure the performance of the biased model WDA, a longer assimilation (more than two months) is conducted (a total of 70 days). The spatial distributions 445 of the SWH errors (obtained from the difference against the merged gridded AVISO observations over the last 30 days out of 80 days) are shown in Fig. 6. The left (resp. right) panels are for the WW3 (resp. SWAN) simulation and assimilations: WW3 ERAI , WW3J2 ERAI , and WW3J2+J3+SA ERAI (resp. SWAN ERAI , SWANJ2 ERAI , and SWANJ2+J3+SA ERAI ). Fig. 6 reveals that a large difference exists in the simulations of the 450 two models. First, the SWAN simulation errors are generally larger than the WW3 simulation errors.

Comparing panel a with panel d in
Second, the global error distributions are quite different: while the WW3 simulation errors appear negative (resp. positive) over most of the 30° S north (resp. south) area, the SWAN simulation errors appear positive in most of the tropical oceans and negative in the middle latitudes. Both simulations show large errors in the southern ocean coastal area, but over the Antarctic Circumpolar Current area, 455 the WW3 error is positive, and the SWAN error is negative. It is interesting that under same wind conditions, two wave models have converse performances. In the future study, it is urgent to find out the detailed approach of two wave models in this area.
The above systematic differences between the two model simulations have significant influences on the results of the WDA. In general, the distribution of assimilation errors shares the same patterns as the 460 model simulation errors but with a much smaller magnitude. The net result is that both the WW3 negative (resp. positive) error magnitude over the 30 o S north (resp. south) area and the SWAN error magnitude as (+)(-)(+)(-) from south to north are dramatically reduced by the Jason-2 data assimilation (comparing the middle panels with the upper panels), and on this basis, incorporating more observations from Jason-3 and SARAL into the assimilation process, both model error magnitudes are 465 further reduced to some degree (comparing the bottom panels with the middle panels). From the corresponding RMSE distributions (it's not shown here), we learned that the large RMSEs mainly appear in places where the model bias (time mean error) is large. This finding means that the model bias has a largely adverse impact on the WDA. respectively. In these real-data assimilation cases, for each model assimilation, both the model and wind forcing have errors. Under this circumstance, the assimilation where the SWH observations are used to adjust the model spectrum, the model wind forcing is also corrected and can further reduce the 480 assimilation errors (red lines in Fig. 7). The red lines in Fig. 7 represent the best result of the assimilation given the WW3 and SWAN model biases, which makes full use of the observations from all three satellites to adjust both the model spectrum and wind forcing. Next, we will discuss how to use the results of two assimilation systems to mitigate the wave analysis error.

Mitigation of WDA errors 485
The mitigation of model bias is a complex issue in which improving the model is a final but longlasting solution. From Fig. 6, we learned that the WW3 and SWAN assimilation errors have some common (or opposite) characteristics in some locations. For example, while the SWH over the southern ocean coastal area always appears to be overestimated because of the lack of adequate observations to improve in both assimilation systems, the WW3 and SWAN assimilations appear to be the opposite in 490 the Antarctic Circumpolar Current area and the tropical oceans. The WW3 (resp. SWAN) assimilation errors in the Antarctic Circumpolar Current area appear positive (resp. negative), while the WW3 (resp. SWAN) assimilation errors in the tropical oceans appear negative (resp. positive). A question arises: as the first step of mitigating model bias influences on the WDA, can we use a pair of assimilation systems to explore a statistical approach to reduce the wave assimilation errors? 495 Given the opposite behaviors of two assimilation systems existing in certain places, the simplest bias correction could be conducted by simple average. This assumes that each wave model assimilation system has its own characteristics of systemic error due to deficit physics, and these systemic errors (or model biases) from different wave models follow a Gaussian distribution with trivial expectation. The corresponding results are shown in Fig. 8. Compared with the performance of each individual 500 assimilation system (dashed red lines for WW3 and dotted red lines for SWAN), the results of this bias correction (cyan lines) show that the RMSE is reduced (the left panel) but the spatial correlation is not greatly improved (the right panel). It is reasonable that based on the opposite errors deviating from the real world in two assimilation systems, this correction method employing the mathematical average can reduce the RMSE to some extent, but it may not have a significant contribution to improving the 505 correlation coefficient if either the sampling size of model bias is too small (only 2 in this case) or the bias has an asymmetric distribution.
Considering the potentially asymmetry of Gaussian distribution of different model biases and small sampling size in practice, as the first step, we calculate the spatial distribution of model bias in every assimilation system (the time mean of the difference between observations and assimilation results) 510 ( Fig. 6) and extract it at each time step in the output, and then calculate the expectation (average) of all assimilation systems as the results after bias correction. The corresponding results are shown with pink lines in Fig. 8. Both the RMSE and correlation are improved greatly. We also show the spatial distribution of SWH RMSEs after bias correction with the second method. From Fig. 9, we can easily find that after bias correction, all the experiments have similar error performances, the smaller in low 515 latitude and the larger in high latitude. As the increasing of satellite observation (one in panel bf, and three in panel cg), the error decreases gradually referred to model control run (panel ae). Based on the best assimilation results currently (panel cg), the improvement with a corrected wind is more efficient in WW3 (panel d), especially in high latitude which is the higher error area, but is hardly to find in SWAN (panel h). If we focus on the change without (panel c) and with (panel d) the corrected wind in 520 WW3 model, the RMSE improvement of windsea and swell is 27% and 1% averaged in global. If we divide global area into three parts mentioned as section 3.4, the windsea (resp. swell) improvement of regional RMSE mean is 4% (resp. 3%), 3% (resp. 1%), and 6% (resp. 1%) in northern westerly zone (65° N-30° N), trade-wind zone (30° N-30° S), and circumpolar westerly zone (30° S-65° S). It is obvious that the improvement of windsea is better than swell and wind correction has a stronger impact 525 in windsea wave assimilation at high latitude with strong wind. Clearly, this bias correction with physical consideration is more effective to improve the quality of WDA, but it uses observational information one more time, while the first method of bias correction processes assimilation results directly without further uses of observational information.
To verify the feasibility and applicability of the bias correction method above, 3 well-known wave 530 models (WW3, SWAN and WAM) with the same data assimilation method are used to conduct longer assimilation and bias correction experiments. The calculation period lasts for 14 months (from November 2016 to December 2017) with sufficient spin-up process to reach a steady assimilation state (the 1st month for model spin-up and the 2nd month for assimilation spin-up). The results of the last 12 months (for 2017) are analyzed and presented in Fig. 10 and Fig. 11 for the spatial distributions and 535 time series of RMSEs and spatial correlation coefficients, respectively. From Fig. 10, we can see that both the RMSE and correlation coefficient (panels d and h, respectively) have been improved by the bias correction that combines the advantages of every WDA system (panels a and e for WW3, panels b and f for SWAN, panels c and g for WAM). In Fig. 11, the bias correction of model control runs (the second bias correction method is conducted based on the result of model control run from three wave 540 models) shows improvement but is worse than the data assimilation before bias correction (compare green with pink). Compared with the model control (blue), the assimilation results with bias correction (red) can reduce the error by 25 % and significantly enhance the correlation coefficient (from 0.88 to 0.923). This result confirms that this bias correction based on multiple assimilation systems can effectively enhance the WDA quality. 545

Summary and discussion
Ocean waves cause the sea surface roughness to impact the boundary conditions of the atmosphere and the wind stress of the ocean surface. Wave processes, such as wave-breaking, wave-induced bottom stress and so on, have significant effects on ocean mixing. Thus, ocean waves are important physical processes for understanding ocean mixing and air-sea interactions in coupled Earth systems. More 550 accurately predicting ocean waves is of great societal significance. However, multiple error sources exist in wave simulations and predictions, including modelling errors, wind forcing errors and initial condition errors.
To sort out the source of the errors of wave data assimilation (WDA), a pair of independent WDA systems is first developed using two wave models: Wave Watch III (WW3) and Simulating WAves 555 Nearshore (SWAN). The perfect and biased model "twin" experiment frameworks are designed to clearly identify each error source and examine its influences on WDA. The results show that model bias is a significant error source that has a largely adverse impact. Then, two WDA systems are used to design bias correction approaches to mitigate the influences of model bias and improve the assimilation quality. Finally, long-term WDA experiments added by the third WDA system with the WAM model 560 (WAve Modeling) (WW3, SWAN and WAM) are conducted to validate the bias correction method.
Three findings are established: 1) When the model is perfect, the initial condition error decays within 10 days, but the WDA can shorten the time scale by half. 2) When the model is biased, despite a perfect wind forcing, the wave simulation has large errors and the WDA can only reduce the error to a limited extent. 3) With the results from two assimilation systems, a statistical approach of bias 565 correction significantly improves the quality of final wave analysis by combining the merits from individual assimilation systems.
Model bias is an obstacle to improving WDA and wave predictions. Using multiple assimilation systems to study the influences of model bias on WDA is an effective approach. As the first step, however, we have used a simple assimilation scheme and simple bias correction method. In follow-up 570 studies, we shall consider powerful observation information (such as: 2-dimensional wave spectra), advanced assimilation schemes (such as ensemble Kalman Filter), and more comprehensive correction methods to help improve modelling. For example, the "online" bias correction (the "offline" bias correction is used in this paper) during the assimilation process (e.g., Dee, 2005) will be considered to improve the assimilation results and correct the instantons initial condition within individual 575 assimilation systems, after that, a robust observation (such as buoy) is needed to validation the quality of bias correction. In addition, improving the model is an important, inevitable and long-lasting task. In this study, under same wind condition, we find that three models show common bias characteristics in the Antarctic Circumpolar Current (ACC) area. If there is a similar performance forcing by other wind, this may suggest that present wave modelling may have deficits in energy spectrum expression for high 580 wind speed areas. In the future, we will further examine the sensitivities of physical processes on high wind speed to mitigate such common modelling bias. All in all, a robust bias correction method with lower model bias and higher representation of wave physical characteristics may further improve wave analysis quality. Once a long time series of high-quality wave analyses is available, it is expected that we can improve our understanding of ocean mixing. The physical process of wave-induced mixing is 585 linked with the structure of the ocean mixing layer (Qiao et al., 2010). This process can be expressed as a function of wave number, frequency and wave spectrum and so on, provided by wave analysis. With the framework of multiple WDA systems developed in this study, improved wave predictions can be effectively pursued. How can we further enhance the predictability of ocean waves? The first important step is to understand the physical process of ocean waves better based on a more accurate evolution of 590 wave state from this framework. Answering these questions could be very important and interesting research topics for the future studies.

Code and data availability
Codes, data and scripts used to run the models and produce the figures in this work are available on the Zenodo site: doi:10.5281/zenodo.3445580 or by sending a written request to the corresponding author 595 (Shaoqing Zhang, szhang@ouc.edu.cn).    795 satellites, as well as the assimilation with corrected wind (dotted green, denoted as WW3SWAN(J2+J3+SA) ASSW and SWANWW3(J2+J3+SA) ASSW ) against the "truth" (same as in Fig. 3 but for SWAN and WW3 with the ERA-Interim wind). The numbers in the parentheses correspond to the globally averaged RMSE (in panels a and c) and spatial correlation coefficient (in panels b and d) over the last 30 days during the assimilation period. The "observed" data are produced by projecting the "truth" SWH onto the satellite 800 orbit.