<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0" article-type="research-article">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">GMD</journal-id><journal-title-group>
    <journal-title>Geoscientific Model Development</journal-title>
    <abbrev-journal-title abbrev-type="publisher">GMD</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Geosci. Model Dev.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1991-9603</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/gmd-15-2183-2022</article-id><title-group><article-title>DINCAE 2.0: multivariate convolutional neural network with error estimates to reconstruct sea surface temperature satellite and altimetry observations</article-title><alt-title>Reliable error estimates for reconstructed missing data in satellite observations</alt-title>
      </title-group><?xmltex \runningtitle{Reliable error estimates for reconstructed missing data in satellite observations}?><?xmltex \runningauthor{A. Barth et al.}?>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes">
          <name><surname>Barth</surname><given-names>Alexander</given-names></name>
          <email>a.barth@uliege.be</email>
        <ext-link>https://orcid.org/0000-0003-2952-5997</ext-link></contrib>
        <contrib contrib-type="author" corresp="no">
          <name><surname>Alvera-Azcárate</surname><given-names>Aida</given-names></name>
          
        <ext-link>https://orcid.org/0000-0002-0484-4791</ext-link></contrib>
        <contrib contrib-type="author" corresp="no">
          <name><surname>Troupin</surname><given-names>Charles</given-names></name>
          
        <ext-link>https://orcid.org/0000-0002-0265-1021</ext-link></contrib>
        <contrib contrib-type="author" corresp="no">
          <name><surname>Beckers</surname><given-names>Jean-Marie</given-names></name>
          
        </contrib>
        <aff id="aff1"><institution>GHER, University of Liège, Liège, Belgium</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Alexander Barth (a.barth@uliege.be)</corresp></author-notes><pub-date><day>15</day><month>March</month><year>2022</year></pub-date>
      
      <volume>15</volume>
      <issue>5</issue>
      <fpage>2183</fpage><lpage>2196</lpage>
      <history>
        <date date-type="received"><day>18</day><month>October</month><year>2021</year></date>
           <date date-type="rev-request"><day>15</day><month>November</month><year>2021</year></date>
           <date date-type="rev-recd"><day>10</day><month>February</month><year>2022</year></date>
           <date date-type="accepted"><day>17</day><month>February</month><year>2022</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2022 Alexander Barth et al.</copyright-statement>
        <copyright-year>2022</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022.html">This article is available from https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022.html</self-uri><self-uri xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022.pdf">The full text article is available as a PDF file from https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022.pdf</self-uri>
      <abstract><title>Abstract</title>

      <p id="d1e105">DINCAE (Data INterpolating Convolutional Auto-Encoder) is a neural network used to reconstruct missing data (e.g., obscured by clouds or gaps between tracks) in satellite data. Contrary to standard image reconstruction (in-painting) with neural networks, this application requires a method to handle missing data (or data with variable accuracy) already in the training phase. Instead of using a standard L2 (or L1) cost function, the neural network (U-Net type of network) is optimized by minimizing the negative log likelihood assuming a Gaussian distribution (characterized by a mean and a variance). As a consequence, the neural network also provides an expected error variance of the reconstructed field (per pixel and per time instance).</p>

      <p id="d1e108">In this updated version DINCAE 2.0, the code was rewritten in Julia and a new type of skip connection has been implemented which showed superior performance with respect to the previous version. The method has also been extended to handle multivariate data (an example will be shown with sea surface temperature, chlorophyll concentration and wind fields). The improvement of this network is demonstrated for the Adriatic Sea.</p>

      <p id="d1e111">Convolutional networks usually work with gridded data as input. This is, however, a limitation for some data types used in oceanography and in Earth sciences in general, where observations are often irregularly sampled. The first layer of the neural network and the cost function have been modified so that unstructured data can also be used as inputs to obtain gridded fields as output. To demonstrate this, the neural network is applied to along-track altimetry data in the Mediterranean Sea. Results from a 20-year reconstruction are presented and validated. Hyperparameters are determined using Bayesian optimization and minimizing the error relative to a development dataset.</p>
  </abstract>
    </article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d1e123">Ocean data are generally sparse and inhomogeneously distributed. The data coverage often contains large gaps in space and time. This is in particular the case with in situ observations. Satellite remote sensing only measures the surface of the ocean but generally has better spatial coverage than in situ observations. However, on average about 75 % of the ocean surface is still covered by clouds that block sensors in the optical and infrared bands <xref ref-type="bibr" rid="bib1.bibx35" id="paren.1"/>. Given the sparsity of data, it is natural to combine data representing different parameters because, for example, mesoscale flow structures are often visible in all ocean tracers.</p>
      <p id="d1e129">Prior work on using multivariate data in connection with satellite data uses, for example,
empirical orthogonal functions (EOFs), which can be naturally extended to multivariate datasets as long as an appropriate norm is defined. For example, <xref ref-type="bibr" rid="bib1.bibx1" id="text.2"/> uses
sea surface temperature, chlorophyll and wind satellite fields with
data interpolating empirical orthogonal functions (DINEOF).
Multivariate EOFs have also been used to project surface observations to deeper layers
<xref ref-type="bibr" rid="bib1.bibx21" id="paren.3"/> or to derive nitrate maps in the Southern Ocean
<xref ref-type="bibr" rid="bib1.bibx16" id="paren.4"/>. In the latter case, EOFs linking salinity, potential temperature and nitrate concentrations are derived from model simulations.
<?xmltex \hack{\newpage}?></p>
      <p id="d1e142">As some observations can be measured at much higher spatial resolution via remote sensing (in particular the resolution of sea surface temperature is much higher than the resolution of sea surface salinity products), “multifractal fusion techniques” are used to improve remotely sensed surface salinity estimates using sea surface temperature. Data fusion is implemented as a locally weighted linear regression <xref ref-type="bibr" rid="bib1.bibx25 bib1.bibx26" id="paren.5"/>.
<xref ref-type="bibr" rid="bib1.bibx10" id="text.6"/> also used an earlier version of the DINCAE code to estimate sea surface chlorophyll using additional sea surface temperature observations.</p>
      <p id="d1e151">The structure of a neural network, and in particular its depth, is uncertain and to some degree dependent on the data set used. We also investigate the influence of the depth of the neural network in this work. It is known that neural networks become increasingly difficult to train as their depth increases because of the well-known vanishing gradient problem <xref ref-type="bibr" rid="bib1.bibx13" id="paren.7"/>: the derivative of the loss function relative to the weights of the first layers of a neural network tends to either decrease or increase exponentially with an increasing number of layers. This prevents effective optimization (training) of these layers using gradient-based optimization methods.</p>
      <p id="d1e158">Several methods have been proposed in the literature to mitigate such problems using alternative neural network architectures. In the context of the present manuscript, skip connections in the form of residual layers have been tested (similar to residual networks; <xref ref-type="bibr" rid="bib1.bibx11" id="altparen.8"/>). The derivative of the loss function relative to the weights in such layers remains (at least initially) closer to one, so there is a more direct relationship between the loss function and the weights and biases to be optimized. Deeper residual networks include shallower networks as a special case and thus, as per their construction, should perform at least as well as shallower networks.</p>
      <p id="d1e164">The gradient of a whole network is computed via back-propagation, which is essentially based on the repeated application of the chain rule for differentiation. The information from the observations is injected via the loss function and propagated backward in a way that is similar to the backward-in-time integration of the adjoint model in 4D-Var. Another interesting neural network architecture has been proposed in the form of the Inception network
<xref ref-type="bibr" rid="bib1.bibx31" id="paren.9"/>, where the outputs of intermediate layers, here in the form of a preliminary reconstruction, are used in the loss function (in addition to the output of the final layer). The result is that the information from the observations is injected not only at the final layer but also at the intermediate layer, which also contributes to reducing the vanishing gradient problem.</p>
      <p id="d1e170">While approaches based on empirical orthogonal functions and convolutional neural networks have been shown to be successful for gridded satellite data, it is difficult to apply similar concepts to non-gridded data as these methods typically require a stationary grid. Another objective of this paper is to show how convolutional neural networks can be used on non-gridded data. This approach is illustrated with altimetry observations.</p>
      <p id="d1e173">The objective of this manuscript is to highlight the improvement of DINCAE relative to the previously published version <xref ref-type="bibr" rid="bib1.bibx5" id="paren.10"/>. Section <xref ref-type="sec" rid="Ch1.S2"/> presents the updated structure of the neural network. The gridded and non-gridded observations used here are presented in Sect. <xref ref-type="sec" rid="Ch1.S3"/>. Details of the implementation are given in Sect. <xref ref-type="sec" rid="Ch1.S4"/>. The results and conclusions are presented in Sects. <xref ref-type="sec" rid="Ch1.S5"/> and <xref ref-type="sec" rid="Ch1.S6"/>.</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>The neural network architecture</title>
      <p id="d1e198">The DINCAE network <xref ref-type="bibr" rid="bib1.bibx5" id="paren.11"/> is a neural network composed of an encoder and decoder network. The encoder uses the original gappy satellite data (with additional metadata as explained later) and the decoder uses the output of the encoder to reconstruct the full data image (along with an error estimate). The encoder uses a series of convolutional layers followed by max pooling layers, reducing the resolution of the datasets. The decoder does essentially the reverse operation by using convolutional and interpolation layers.  This is the general structure of a convolutional autoencoder and the classical U-Net networks <xref ref-type="bibr" rid="bib1.bibx29" id="paren.12"/>. In the following section we will discuss the main components of the DINCAE neural network used for the different test cases and emphasize the changes relative to the previous version.</p>
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Skip connections</title>
      <p id="d1e214">In an autoencoder, the inputs are compressed by forcing the data flow through a bottleneck, which ensures that the neural network must efficiently compress and decompress the information. However, in U-Net <xref ref-type="bibr" rid="bib1.bibx29" id="paren.13"/> and DINCAE, skip connections are implemented, allowing the information flow of the network to partially bypass the bottleneck to prevent the loss of small-scale details in the reconstruction. Skip connections can be realized either by concatenating tensors along the feature map dimension (as it is done in U-Net) or by summing the tensors. The cat skip connections at the step <inline-formula><mml:math id="M1" display="inline"><mml:mi>l</mml:mi></mml:math></inline-formula> in the autoencoder can be written as the following operation:
            <disp-formula id="Ch1.E1" content-type="numbered"><label>1</label><mml:math id="M2" display="block"><mml:mrow><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mi mathvariant="normal">cat</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M3" display="inline"><mml:mi mathvariant="normal">cat</mml:mi></mml:math></inline-formula> concatenates two 3D arrays along the dimension representing the feature channels. The function <inline-formula><mml:math id="M4" display="inline"><mml:mrow><mml:msup><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is a sequence of neural network layers applied to the array <inline-formula><mml:math id="M5" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> (produced by previous network layers). Sum skip connections are implemented as
            <disp-formula id="Ch1.E2" content-type="numbered"><label>2</label><mml:math id="M6" display="block"><mml:mrow><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d1e393">Clearly, the output of a cat skip connection has twice as many feature channels as the output of a sum skip connection. These skip connections are followed by a convolutional layer, which ensures that the number of output features is the same for both types of skip connection.
In fact, one can show that the sum skip connection (followed by a convolutional layer) is formally a special case of the cat skip connection: it is recovered when the convolution applies the same weights to both halves of the concatenated array.
However, sum skip connections can be advantageous because the weights and biases of the convolutional layers are more directly related to the output of the neural network, which helps to reduce the “vanishing gradient problem” <xref ref-type="bibr" rid="bib1.bibx11" id="paren.14"/>.</p>
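      <p>The relation between the two types of skip connection can be checked numerically. The following Python sketch is only an illustration under simplifying assumptions and is not taken from the DINCAE code: arrays are represented as lists of feature channels, a 1 × 1 convolution stands in for the convolutional layer that follows the skip connection, and all function and variable names are hypothetical.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Illustrative sketch only: an array is a list of feature channels and each
# channel is a flat list of pixel values. All names are hypothetical.


def cat_skip(fx, x):
    """Eq. (1): concatenate f(X) and X along the feature-channel dimension."""
    return fx + x


def sum_skip(fx, x):
    """Eq. (2): add f(X) and X element-wise (both must have the same shape)."""
    return [[a + b for a, b in zip(cf, cx)] for cf, cx in zip(fx, x)]


def conv1x1(weights, channels):
    """1x1 convolution: each output channel is a weighted sum of input channels."""
    npix = len(channels[0])
    return [
        [sum(w * ch[p] for w, ch in zip(row, channels)) for p in range(npix)]
        for row in weights
    ]


x = [[1.0, 2.0, 3.0], [0.5, -1.0, 4.0]]    # X^(l): 2 channels, 3 pixels
fx = [[0.1, 0.2, 0.3], [2.0, 0.0, -2.0]]   # f^(l)(X^(l)), same shape as X^(l)

w_sum = [[0.5, -1.0], [2.0, 0.25]]         # 2 output channels, 2 input channels
w_cat = [row + row for row in w_sum]       # same weights for both halves of cat

out_sum = conv1x1(w_sum, sum_skip(fx, x))  # sum skip, then convolution
out_cat = conv1x1(w_cat, cat_skip(fx, x))  # cat skip, then convolution
```
]]></preformat>
      <p>In this sketch, <monospace>out_sum</monospace> and <monospace>out_cat</monospace> agree up to rounding, because repeating the convolution weights for both halves of the concatenated array reproduces the sum skip connection. A cat skip connection with unconstrained weights is therefore at least as expressive, at the price of twice as many input weights.</p>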
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>Refinement step</title>
      <p id="d1e407">The whole neural network can be described as two functions of the input variables that provide the reconstruction <inline-formula><mml:math id="M7" display="inline"><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> and its expected error variance <inline-formula><mml:math id="M8" display="inline"><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> for every grid cell. The loss function is derived from the negative log-likelihood of a Gaussian with mean <inline-formula><mml:math id="M9" display="inline"><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> and variance <inline-formula><mml:math id="M10" display="inline"><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>:
            <disp-formula id="Ch1.E3" content-type="numbered"><label>3</label><mml:math id="M11" display="block"><mml:mtable class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mi>J</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:mfenced open="[" close=""><mml:mrow><mml:msup><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:mfenced></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mfenced open="" close="]"><mml:mrow><mml:mo>+</mml:mo><mml:mi>log⁡</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mi>log⁡</mml:mi><mml:mo>(</mml:mo><mml:msqrt><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">π</mml:mi></mml:mrow></mml:msqrt><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
          where the sum over the spatial indices <inline-formula><mml:math id="M12" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M13" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula> runs over all grid points with valid (i.e., non-masked) values and <inline-formula><mml:math id="M14" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> is the number of valid values. The first term on the right-hand side of the equation is the mean square error, with the error scaled by the estimated error standard deviation; the second term penalizes any overestimation of the error standard deviation; and the third term is constant and can be neglected as it does not influence the gradient of the cost function.
For a convolutional auto-encoder with refinement, the intermediate outputs <inline-formula><mml:math id="M15" display="inline"><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> and <inline-formula><mml:math id="M16" display="inline"><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> are concatenated with the inputs and passed through another auto-encoder with the same structure (except for the number of filters for the input layer, which has to accommodate the two additional fields corresponding to <inline-formula><mml:math id="M17" display="inline"><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> and <inline-formula><mml:math id="M18" display="inline"><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>). The weights of the first and second auto-encoder are not related. The final cost function with refinement <inline-formula><mml:math id="M19" display="inline"><mml:mrow><mml:msub><mml:mi>J</mml:mi><mml:mi>r</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is given by
            <disp-formula id="Ch1.E4" content-type="numbered"><label>4</label><mml:math id="M20" display="block"><mml:mrow><mml:msub><mml:mi>J</mml:mi><mml:mi>r</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mi mathvariant="italic">α</mml:mi><mml:mi>J</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="italic">α</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mi>J</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mo>′</mml:mo></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mover accent="true"><mml:mrow><mml:msup><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow><mml:mo mathvariant="normal" stretchy="true">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M21" display="inline"><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M22" display="inline"><mml:mrow><mml:msup><mml:msup><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>′</mml:mo></mml:msup><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> are the reconstruction and its expected error variance produced by the second auto-encoder. The weights <inline-formula><mml:math id="M23" display="inline"><mml:mi mathvariant="italic">α</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M24" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="italic">α</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> control how much importance is given to the intermediate output (relative to the final output).</p>
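      <p>To make the structure of Eqs. (<xref ref-type="disp-formula" rid="Ch1.E3"/>) and (<xref ref-type="disp-formula" rid="Ch1.E4"/>) concrete, the following Python sketch evaluates both cost functions on a toy example. It is a plain restatement of the two equations under simplifying assumptions (flattened pixel lists, a Boolean mask for valid observations) and not the DINCAE implementation. The function names, the toy values and the weights of 0.5 are hypothetical placeholders.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def gaussian_nll(y, y_hat, sigma2_hat, mask):
    """Eq. (3): Gaussian negative log-likelihood averaged over valid pixels.

    All arguments are flat sequences of equal length; mask[k] is True where
    the observation y[k] is valid.
    """
    total = 0.0
    n_valid = 0
    for obs, mean, var, valid in zip(y, y_hat, sigma2_hat, mask):
        if not valid:
            continue  # masked pixels do not contribute to the cost
        # 2 log(sqrt(2 pi)) is written as log(2 pi)
        total += (obs - mean) ** 2 / var + math.log(var) + math.log(2.0 * math.pi)
        n_valid += 1
    if n_valid == 0:
        raise ValueError("no valid observations")
    return total / (2 * n_valid)


def refined_cost(y, mask, first, second, alpha, alpha_prime):
    """Eq. (4): weighted sum of the costs of the intermediate and final outputs.

    first and second are (y_hat, sigma2_hat) pairs from the two auto-encoders.
    """
    j_first = gaussian_nll(y, first[0], first[1], mask)
    j_second = gaussian_nll(y, second[0], second[1], mask)
    return alpha * j_first + alpha_prime * j_second


# Toy data: three pixels, the last one masked (e.g. covered by clouds).
y = [1.0, 2.0, 0.0]
mask = [True, True, False]
first = ([1.5, 2.5, 7.0], [1.0, 1.0, 1.0])      # intermediate reconstruction
second = ([1.0, 2.0, 7.0], [0.25, 0.25, 1.0])   # refined reconstruction
# alpha and alpha_prime are arbitrary here, chosen only for illustration
cost = refined_cost(y, mask, first, second, alpha=0.5, alpha_prime=0.5)
```
]]></preformat>
      <p>In this toy example the refined output matches the valid observations and reports a smaller error variance, so its cost is lower than that of the intermediate output. The masked pixel has no influence on either term.</p>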
      <p id="d1e829">With a refinement step, the neural network becomes essentially twice as deep and the number of parameters (approximately) doubles. The increased depth would make it prone to the vanishing gradient problem. However, by including the intermediate results in the cost function, this problem is reduced. In fact, information from the observations is injected during back-propagation by the loss function. Due to the refinement step and the loss function, which also depends on the intermediate result, the information from the observation is injected at the last layer and at the middle layer of the combined neural network <xref ref-type="bibr" rid="bib1.bibx31" id="paren.15"/>. The relationship between the first layers of the neural network and the cost function is therefore more direct, which helps in the training of these first layers.</p>
      <p id="d1e835">The refinement step has been used in image in-painting for a computer vision application
<xref ref-type="bibr" rid="bib1.bibx17" id="paren.16"/> and has also been applied in oceanography to tide gauge data <xref ref-type="bibr" rid="bib1.bibx37" id="paren.17"/>. In the present work, only one refinement step is tested, but the code supports an arbitrary number of sequential refinement steps.</p>
</sec>
<sec id="Ch1.S2.SS3">
  <label>2.3</label><title>Multivariate reconstructions</title>
      <p id="d1e852">Auxiliary satellite data (with potentially missing data) can be provided for the reconstruction. The handling of missing data in these auxiliary data is identical to the way missing data are treated for the primary variable. For every auxiliary satellite data set, the average over time is first removed. The auxiliary data (divided by their corresponding error variance) and the inverse of the error variance are provided as input. Where data are missing, the corresponding input values are set to zero, representing an infinitely large error (as a consequence of the chosen scaling). Multiple time instances centered around a target time can be provided as input.</p>
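      <p>The scaling of the inputs can be sketched for the time series of a single pixel as follows. This Python example is an illustration and not the DINCAE code: it assumes that missing values are marked with <monospace>None</monospace>, that the time average is computed over the valid values only, and that the error variance is constant in time. The function names and numerical values are hypothetical.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def time_mean(series):
    """Mean over the valid (non-None) time steps of one pixel."""
    valid = [v for v in series if v is not None]
    if not valid:
        return 0.0  # pixel never observed; the value is irrelevant
    return sum(valid) / len(valid)


def encode_inputs(series, error_variance):
    """Return the two input fields for one pixel and all time steps.

    The first list is the anomaly divided by the error variance, the second
    is the inverse error variance. Both are zero where data are missing,
    which corresponds to an infinitely large error.
    """
    mean = time_mean(series)
    scaled, inv_var = [], []
    for value in series:
        if value is None:
            scaled.append(0.0)
            inv_var.append(0.0)
        else:
            scaled.append((value - mean) / error_variance)
            inv_var.append(1.0 / error_variance)
    return scaled, inv_var


# Hypothetical example: three time steps, the second one is missing.
scaled, inv_var = encode_inputs([20.0, None, 22.0], error_variance=0.25)
```
]]></preformat>
      <p>With this scaling a missing value needs no special flag: a zero inverse error variance tells the network that the corresponding input carries no information.</p>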
</sec>
<sec id="Ch1.S2.SS4">
  <label>2.4</label><title>Non-gridded input data</title>
      <p id="d1e864">Current satellite altimetry missions measure sea surface height along the ground track of the satellite. Satellite altimetry can measure through clouds, but the data are only available along a collection of tracks. To better handle such data sets, we extended DINCAE to accept unstructured data as input.</p>
      <p id="d1e867">The first layer in DINCAE is a convolutional layer, which typically requires a field discretized on a rectangular grid. The convolutional layer can be seen as the discretized version of the following integral:
            <disp-formula id="Ch1.E5" content-type="numbered"><label>5</label><mml:math id="M25" display="block"><mml:mrow><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∫</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="normal">Ω</mml:mi><mml:mi>w</mml:mi></mml:msub></mml:mrow></mml:munder><mml:mi>w</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi>x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi>y</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi>x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi>y</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mi mathvariant="normal">d</mml:mi><mml:msup><mml:mi>x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:msup><mml:mi>y</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M26" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula> is the input field, <inline-formula><mml:math id="M27" display="inline"><mml:mi>w</mml:mi></mml:math></inline-formula> are the weights in the convolution (also called convolution kernel), <inline-formula><mml:math id="M28" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="normal">Ω</mml:mi><mml:mi>w</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the support of the function <inline-formula><mml:math id="M29" display="inline"><mml:mi>w</mml:mi></mml:math></inline-formula> (i.e., the domain where <inline-formula><mml:math id="M30" display="inline"><mml:mi>w</mml:mi></mml:math></inline-formula> is different from zero) and <inline-formula><mml:math id="M31" display="inline"><mml:mi>g</mml:mi></mml:math></inline-formula> the output of the convolutional layer. To discretize the integral, the continuous function <inline-formula><mml:math id="M32" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula> is replaced by a sum of Dirac functions using the values <inline-formula><mml:math id="M33" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> defined on a regular grid:
            <disp-formula id="Ch1.Ex1"><mml:math id="M34" display="block"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mi mathvariant="italic">δ</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>-</mml:mo><mml:mi>i</mml:mi><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>-</mml:mo><mml:mi>j</mml:mi><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>y</mml:mi><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d1e1096">In this case, the continuous convolution becomes the standard discrete convolution as used in neural networks. The weights <inline-formula><mml:math id="M35" display="inline"><mml:mrow><mml:mi>w</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> only need to be known at the discrete locations defined by the underlying grid.</p>
      <p id="d1e1117">For data points which are not defined on a regular grid we essentially use a similar approach. The function <inline-formula><mml:math id="M36" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula> is again written as a sum of Dirac functions:
            <disp-formula id="Ch1.Ex2"><mml:math id="M37" display="block"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mi>k</mml:mi></mml:munder><mml:msub><mml:mi>f</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mi mathvariant="italic">δ</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where now <inline-formula><mml:math id="M38" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the values at the locations <inline-formula><mml:math id="M39" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, which can be arbitrary. In order to evaluate the integral (Eq. <xref ref-type="disp-formula" rid="Ch1.E5"/>), it is necessary to know the weights at the location <inline-formula><mml:math id="M40" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The weights are still discretized on a regular grid, but are interpolated bilinearly to the data location to evaluate the integral. In fact, instead of interpolating the weights <inline-formula><mml:math id="M41" display="inline"><mml:mi>w</mml:mi></mml:math></inline-formula>, one can also apply the adjoint of the linear interpolation to <inline-formula><mml:math id="M42" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> (which is mathematically equivalent). This has the benefit that the computation of the convolution can be implemented using the optimized functions in the neural network library.</p>
      <p id="d1e1263">For data defined on a regular grid, it has been verified numerically that this proposed approach and the traditional approach used to compute the convolution give the same results.</p>
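      <p>The equivalence between interpolating the weights and applying the adjoint of the interpolation to the data can be illustrated with a minimal sketch in plain Python (this is not the DINCAE code). It assumes a unit grid spacing, a <inline-formula><mml:math display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula> kernel that is zero outside its support and a kernel applied without flipping, as is common in neural network libraries. All function names are hypothetical.</p>
<preformat><![CDATA[
```python
import math


def kernel_at(W, p, q):
    """Kernel weight at integer offset (p, q); zero outside the 3x3 support."""
    if abs(p) > 1 or abs(q) > 1:
        return 0.0
    return W[p + 1][q + 1]


def bilinear_coefficients(x, y):
    """Four grid nodes surrounding (x, y) with their bilinear coefficients."""
    i, j = math.floor(x), math.floor(y)
    a, b = x - i, y - j
    return [
        ((i, j), (1 - a) * (1 - b)),
        ((i + 1, j), a * (1 - b)),
        ((i, j + 1), (1 - a) * b),
        ((i + 1, j + 1), a * b),
    ]


def conv_interpolated_weights(W, points, i0, j0):
    """Variant A: interpolate the kernel to every data location (x_k, y_k)."""
    total = 0.0
    for x, y, f in points:
        w = sum(c * kernel_at(W, p, q)
                for (p, q), c in bilinear_coefficients(x - i0, y - j0))
        total += w * f
    return total


def scatter_to_grid(points):
    """Variant B, step 1: adjoint of the interpolation (spread f_k to the grid)."""
    F = {}
    for x, y, f in points:
        for node, c in bilinear_coefficients(x, y):
            F[node] = F.get(node, 0.0) + c * f
    return F


def conv_gridded(W, F, i0, j0):
    """Variant B, step 2: ordinary discrete convolution on the gridded field."""
    return sum(kernel_at(W, p, q) * F.get((i0 + p, j0 + q), 0.0)
               for p in (-1, 0, 1) for q in (-1, 0, 1))


if __name__ == "__main__":
    W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    points = [(2.3, 1.7, 1.5), (3.0, 2.0, -2.0), (1.25, 2.5, 0.75)]
    print(conv_interpolated_weights(W, points, 2, 2))
    print(conv_gridded(W, scatter_to_grid(points), 2, 2))
```
]]></preformat>
      <p>Both variants give the same value up to rounding errors. In the second variant, only the scattering step depends on the data locations, so the convolution itself can be delegated to an optimized library routine.</p>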
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Data</title>
      <p id="d1e1275">The improvements are examined in two test cases. For multivariate gridded data, the approach is tested with sea surface temperature, chlorophyll and winds on the Adriatic Sea; for non-gridded data, altimetry observations of the whole Mediterranean Sea are used. As the altimetry observations do not resolve as many small scales as sea surface temperature, a larger domain was chosen for the altimetry test case.</p>
<sec id="Ch1.S3.SS1">
  <label>3.1</label><title>Gridded data (Adriatic Sea)</title>
      <p id="d1e1285">As the previous application <xref ref-type="bibr" rid="bib1.bibx5" id="paren.18"/> considered relatively low-resolution AVHRR data, we used more modern and higher-resolution satellite data in this application for the Adriatic Sea.
The datasets used include the following.</p>
      <p id="d1e1291"><list list-type="bullet">
            <list-item>

      <p id="d1e1296">Sea Surface Temperature (MODIS Terra Level 3 SST Thermal IR Daily 4km Nighttime v2014.0, <ext-link xlink:href="https://doi.org/10.5067/MODST-1D4N4" ext-link-type="DOI">10.5067/MODST-1D4N4</ext-link>, <xref ref-type="bibr" rid="bib1.bibx24" id="altparen.19"/>) made available by PO.DAAC (<uri>https://podaac.jpl.nasa.gov/</uri>, last access: 10 February 2022, JPL, NASA, USA).</p>
            </list-item>
            <list-item>

      <p id="d1e1311">Wind speed (Cross-Calibrated Multi-Platform, CCMP; gridded surface vector winds) made available by Remote Sensing Systems (<uri>http://www.remss.com/measurements/ccmp/</uri>, last access: 19 July 2019, <xref ref-type="bibr" rid="bib1.bibx34" id="altparen.20"/>). This dataset is described in
<xref ref-type="bibr" rid="bib1.bibx2" id="text.21"/>, <xref ref-type="bibr" rid="bib1.bibx19" id="text.22"/> and <xref ref-type="bibr" rid="bib1.bibx34" id="text.23"/>. It has a 6 h temporal resolution and a <inline-formula><mml:math id="M43" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:math></inline-formula><inline-formula><mml:math id="M44" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> spatial resolution. The wind fields are averaged into daily mean fields.</p>
            </list-item>
            <list-item>

      <p id="d1e1353">Chlorophyll <inline-formula><mml:math id="M45" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> from Ocean Biology Processing Group, <xref ref-type="bibr" rid="bib1.bibx22" id="text.24"/>,
<uri>https://oceancolor.gsfc.nasa.gov/data/overview/</uri> (last access: 18 September 2019) at a 4 km resolution and L3 processing level.</p>
            </list-item>
          </list></p>
      <p id="d1e1371">The datasets span the time period from 1 January 2003 to 31 December 2016. They are all interpolated (using bilinear interpolation) onto the common grid defined by the SST fields.</p>
      <p id="d1e1374">As ocean mixing reacts to the averaged effect of the wind speed (norm of the wind vector), we also smoothed the wind speed with a Laplacian filter using a time period of 2.2 d and a lag of 4 d (wind speed preceding SST). The optimal lag and time period were obtained by maximizing the correlation between the smoothed wind field and SST from the training data.</p>
</sec>
<sec id="Ch1.S3.SS2">
  <label>3.2</label><title>Non-gridded data (Mediterranean Sea)</title>
      <p id="d1e1385">Altimetry data from 1 January 1993 to 13 May 2019 covering 7<inline-formula><mml:math id="M46" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> W to 37<inline-formula><mml:math id="M47" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> E and from 30 to 46<inline-formula><mml:math id="M48" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> N from 22 satellite missions operating during this time frame are used. This domain essentially contains the Mediterranean Sea but also a small part of the Bay of Biscay and the Black Sea. In preliminary studies we found that including the data from adjacent seas can help neural networks better generalize and prevent overfitting (e.g., <xref ref-type="bibr" rid="bib1.bibx9" id="altparen.25"/>) as the neural network is confronted with a more diverse set of conditions.
The data (SEALEVEL_EUR_PHY_L3_MY_OBSERVATIONS_008<?xmltex \notforhtml{\newline}?>_061, accessed on 13 October 2020) are made available by the Copernicus Marine Environment Monitoring Service (CMEMS).</p>
      <p id="d1e1420">These data were split into the following fractions:
<list list-type="bullet"><list-item>
      <p id="d1e1425">70 % training data,</p></list-item><list-item>
      <p id="d1e1429">20 % development data,</p></list-item><list-item>
      <p id="d1e1433">10 % test data.</p></list-item></list></p>
      <p id="d1e1436">To reduce the correlation between the different datasets, satellite tracks are not split and belong entirely to one of these three datasets.</p>
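      <p>A track-wise split of this kind can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how tracks are assigned to the three datasets, so the seeded random shuffle, the application of the fractions to the number of tracks (rather than the number of measurements) and the function name <monospace>split_tracks</monospace> are all hypothetical.</p>
<preformat><![CDATA[
```python
import random


def split_tracks(track_ids, fractions=(0.7, 0.2, 0.1), seed=0):
    """Assign whole satellite tracks to training, development and test sets.

    Every track identifier ends up in exactly one set, so no track is
    shared between the sets.
    """
    ids = sorted(set(track_ids))
    random.Random(seed).shuffle(ids)  # reproducible shuffle (assumption)
    n_train = round(fractions[0] * len(ids))
    n_dev = round(fractions[1] * len(ids))
    return {
        "train": ids[:n_train],
        "dev": ids[n_train:n_train + n_dev],
        "test": ids[n_train + n_dev:],
    }


if __name__ == "__main__":
    sets = split_tracks(range(100))
    print({name: len(tracks) for name, tracks in sets.items()})
```
]]></preformat>
      <p>Splitting by track identifier rather than by individual measurement avoids placing neighboring, highly correlated measurements of the same track in different datasets.</p>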
      <p id="d1e1439">Some of the altimetry reconstruction experiments use gridded sea surface temperature satellite observations as an auxiliary dataset for multivariate reconstruction. We use the AVHRR_OI-NCEI-L4-GLOB-v2.0 dataset <xref ref-type="bibr" rid="bib1.bibx27 bib1.bibx23" id="paren.26"/> because it is a single consistent dataset covering the full time period of the altimetry data and because it approximately matches the altimetry dataset in terms of resolved spatial scales.</p>
</sec>
</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Implementation</title>
      <p id="d1e1454">The Python code was first ported from TensorFlow 1.12 to 1.15, reducing the training time from 4.5 to 3.5 h using a GeForce GTX 1080 GPU and an Intel Core i7-7700 CPU. We also considered porting DINCAE to TensorFlow 2. The TensorFlow 2 programming interface is, however, quite different from previous versions. As our group gained familiarity with the Julia programming language <xref ref-type="bibr" rid="bib1.bibx7" id="paren.27"/>, we decided to rewrite DINCAE in Julia. Porting DINCAE to Julia with the package <italic>Knet</italic> <xref ref-type="bibr" rid="bib1.bibx36" id="paren.28"/> cut down the runtime from 3.5 to 1.9 h (thanks to more efficient data transformation) using the AVHRR dataset described in <xref ref-type="bibr" rid="bib1.bibx5" id="text.29"/>, and using a concatenation skip connection in both cases.</p>
      <p id="d1e1469">For the Adriatic test case, the input is a 3D array with the dimensions corresponding to the longitude, latitude and the different parameters. The input parameters for a univariate reconstruction are three time instances of temperature scaled by the inverse of the error variance (previous, current and next day), the corresponding inverse of the error variance, the longitude and latitude of every grid cell and the sine and cosine of the day of the year multiplied by <inline-formula><mml:math id="M49" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">π</mml:mi><mml:mo>/</mml:mo><mml:mn mathvariant="normal">365.25</mml:mn></mml:mrow></mml:math></inline-formula> as in <xref ref-type="bibr" rid="bib1.bibx5" id="text.30"/>. The assembled array has a size of <inline-formula><mml:math id="M50" display="inline"><mml:mrow><mml:mn mathvariant="normal">168</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">144</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">10</mml:mn></mml:mrow></mml:math></inline-formula> elements (for a single training sample). The input is processed by the encoder, which is composed of five convolutional layers (with 16, 30, 58, 110 and 209 output filters) with a kernel size of <inline-formula><mml:math id="M51" display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula> and a rectified linear unit (RELU) activation function, each followed by a max pooling layer with a size of <inline-formula><mml:math id="M52" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula>.
The RELU activation function is commonly used in neural networks and is defined by <inline-formula><mml:math id="M53" display="inline"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mo>max⁡</mml:mo><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
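      <p>The ten input parameters and the two elementary functions mentioned above can be summarized in a short Python sketch. The ordering and the names of the channels are assumptions made for illustration; only their number and meaning follow from the description above.</p>
<preformat><![CDATA[
```python
import math

# Ten input channels of the univariate case (ordering is an assumption)
CHANNELS = [
    "temperature / error variance (previous day)",
    "1 / error variance (previous day)",
    "temperature / error variance (current day)",
    "1 / error variance (current day)",
    "temperature / error variance (next day)",
    "1 / error variance (next day)",
    "longitude",
    "latitude",
    "sin(2*pi*day/365.25)",
    "cos(2*pi*day/365.25)",
]


def seasonal_features(day_of_year):
    """Sine and cosine of the day of the year multiplied by 2*pi/365.25."""
    phase = 2 * math.pi * day_of_year / 365.25
    return math.sin(phase), math.cos(phase)


def relu(x):
    """Rectified linear unit f(x) = max(x, 0)."""
    return max(x, 0.0)


if __name__ == "__main__":
    print(len(CHANNELS))
    print(seasonal_features(180))
    print(relu(-1.5), relu(2.0))
```
]]></preformat>
      <p>Encoding the day of the year with a sine and cosine pair keeps the end of December numerically close to the beginning of January, which a single linear day counter would not.</p>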
      <p id="d1e1558">The output of the encoder is transformed back to a full image by the decoder, which mirrors the structure of the encoder. The decoder is composed of five upsampling layers (nearest-neighbor interpolation or bilinear interpolation), each followed by a convolutional layer with the equivalent number of output filters from the encoder (except for the final layer, which has only two outputs related to the reconstruction and its error variance). The final layer produces a 3D array <inline-formula><mml:math id="M54" display="inline"><mml:mrow><mml:msubsup><mml:mi>T</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">out</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula> (size <inline-formula><mml:math id="M55" display="inline"><mml:mrow><mml:mn mathvariant="normal">168</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">144</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula>) from which the reconstruction <inline-formula><mml:math id="M56" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> and its error variance <inline-formula><mml:math id="M57" display="inline"><mml:mrow><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup></mml:mrow></mml:math></inline-formula> are computed by

              <disp-formula specific-use="align" content-type="numbered"><mml:math id="M58" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E6"><mml:mtd><mml:mtext>6</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mo>max⁡</mml:mo><mml:mo>(</mml:mo><mml:mi>exp⁡</mml:mi><mml:mo>(</mml:mo><mml:mo>min⁡</mml:mo><mml:mo>(</mml:mo><mml:msubsup><mml:mi>T</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">out</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mi mathvariant="italic">γ</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="italic">μ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E7"><mml:mtd><mml:mtext>7</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:msubsup><mml:mi>T</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi 
mathvariant="normal">out</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msubsup><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="italic">σ</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

          where the parameters <inline-formula><mml:math id="M59" display="inline"><mml:mi mathvariant="italic">γ</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M60" display="inline"><mml:mi mathvariant="italic">μ</mml:mi></mml:math></inline-formula> are set to <inline-formula><mml:math id="M61" display="inline"><mml:mn mathvariant="normal">10</mml:mn></mml:math></inline-formula> and <inline-formula><mml:math id="M62" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, respectively. The minimum and maximum functions (<inline-formula><mml:math id="M63" display="inline"><mml:mo lspace="0mm">min⁡</mml:mo></mml:math></inline-formula> and <inline-formula><mml:math id="M64" display="inline"><mml:mo>max⁡</mml:mo></mml:math></inline-formula>) are introduced here to prevent division by a value close to zero or the exponentiation of a too-large value. This stabilizes the neural network during the initial phase of the training as the weights and biases are randomly initialized.</p>
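      <p>Equations (6) and (7) translate directly into code for a single grid cell. The following Python sketch uses the values of the two parameters given above; the function name <monospace>decode_output</monospace> and the scalar (rather than array) formulation are assumptions made for illustration.</p>
<preformat><![CDATA[
```python
import math

GAMMA = 10.0  # upper bound of the argument of the exponential
MU = 1e-3     # lower bound of the inverse error variance


def decode_output(t1, t2, gamma=GAMMA, mu=MU):
    """Reconstruction and error variance from the two output channels.

    t1 and t2 stand for T_ij1 and T_ij2 of the final layer at one grid cell.
    """
    inv_variance = max(math.exp(min(t1, gamma)), mu)
    sigma2 = 1.0 / inv_variance  # Eq. (6)
    y_hat = t2 * sigma2          # Eq. (7)
    return y_hat, sigma2


if __name__ == "__main__":
    for t1 in (-1000.0, 0.0, 1000.0):
        print(t1, decode_output(t1, 2.0))
```
]]></preformat>
      <p>With these bounds, the error variance stays between <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>exp⁡</mml:mi><mml:mo>(</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="italic">γ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mi mathvariant="italic">μ</mml:mi></mml:mrow></mml:math></inline-formula>, whatever value the final layer produces.</p>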
      <p id="d1e1818">In <xref ref-type="bibr" rid="bib1.bibx5" id="text.31"/>, after the convolutional layers, the model included two fully connected layers (with dropout). These are no longer used, as such layers require that the input matrix for training has exactly the same size as the input matrix of the reconstruction (inference), which makes this architecture difficult to apply to large input arrays (which would arise for global or basin-wide sea surface temperature fields, for example).
The benefit of replacing dense layers by convolutional layers is further discussed in <xref ref-type="bibr" rid="bib1.bibx18" id="text.32"/>.</p>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T1" specific-use="star"><?xmltex \currentcnt{1}?><label>Table 1</label><caption><p id="d1e1831">Layers of the neural network for the gridded datasets. Note that every convolution is followed by a RELU activation function.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="left"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Number</oasis:entry>
         <oasis:entry colname="col2">Type</oasis:entry>
         <oasis:entry colname="col3">Output size</oasis:entry>
         <oasis:entry colname="col4">Parameters of the layer</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">1</oasis:entry>
         <oasis:entry colname="col2">input</oasis:entry>
         <oasis:entry colname="col3">168 <inline-formula><mml:math id="M65" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 144 <inline-formula><mml:math id="M66" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 10</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">2</oasis:entry>
         <oasis:entry colname="col2">conv. 2d</oasis:entry>
         <oasis:entry colname="col3">168 <inline-formula><mml:math id="M67" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 144 <inline-formula><mml:math id="M68" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 16</oasis:entry>
         <oasis:entry colname="col4">n. filters <inline-formula><mml:math id="M69" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 16, kernel size <inline-formula><mml:math id="M70" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (3,3)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">3</oasis:entry>
         <oasis:entry colname="col2">pooling 2d</oasis:entry>
         <oasis:entry colname="col3">84 <inline-formula><mml:math id="M71" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 72 <inline-formula><mml:math id="M72" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 16</oasis:entry>
         <oasis:entry colname="col4">pool size <inline-formula><mml:math id="M73" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (2,2), strides <inline-formula><mml:math id="M74" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (2,2)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">4</oasis:entry>
         <oasis:entry colname="col2">conv. 2d</oasis:entry>
         <oasis:entry colname="col3">84 <inline-formula><mml:math id="M75" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 72 <inline-formula><mml:math id="M76" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 30</oasis:entry>
         <oasis:entry colname="col4">n. filters <inline-formula><mml:math id="M77" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 30, kernel size <inline-formula><mml:math id="M78" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (3,3)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">5</oasis:entry>
         <oasis:entry colname="col2">pooling 2d</oasis:entry>
         <oasis:entry colname="col3">42 <inline-formula><mml:math id="M79" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 36 <inline-formula><mml:math id="M80" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 30</oasis:entry>
         <oasis:entry colname="col4">pool size <inline-formula><mml:math id="M81" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (2,2), strides <inline-formula><mml:math id="M82" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (2,2)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">6</oasis:entry>
         <oasis:entry colname="col2">conv. 2d</oasis:entry>
         <oasis:entry colname="col3">42 <inline-formula><mml:math id="M83" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 36 <inline-formula><mml:math id="M84" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 58</oasis:entry>
         <oasis:entry colname="col4">n. filters <inline-formula><mml:math id="M85" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 58, kernel size <inline-formula><mml:math id="M86" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (3,3)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">7</oasis:entry>
         <oasis:entry colname="col2">pooling 2d</oasis:entry>
         <oasis:entry colname="col3">21 <inline-formula><mml:math id="M87" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 18 <inline-formula><mml:math id="M88" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 58</oasis:entry>
         <oasis:entry colname="col4">pool size <inline-formula><mml:math id="M89" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (2,2), strides <inline-formula><mml:math id="M90" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (2,2)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">8</oasis:entry>
         <oasis:entry colname="col2">conv. 2d</oasis:entry>
         <oasis:entry colname="col3">21 <inline-formula><mml:math id="M91" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 18 <inline-formula><mml:math id="M92" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 110</oasis:entry>
         <oasis:entry colname="col4">n. filters <inline-formula><mml:math id="M93" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 110, kernel size <inline-formula><mml:math id="M94" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (3,3)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">9</oasis:entry>
         <oasis:entry colname="col2">pooling 2d</oasis:entry>
         <oasis:entry colname="col3">11 <inline-formula><mml:math id="M95" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 9 <inline-formula><mml:math id="M96" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 110</oasis:entry>
         <oasis:entry colname="col4">pool size <inline-formula><mml:math id="M97" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (2,2), strides <inline-formula><mml:math id="M98" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (2,2)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">10</oasis:entry>
         <oasis:entry colname="col2">conv. 2d</oasis:entry>
         <oasis:entry colname="col3">11 <inline-formula><mml:math id="M99" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 9 <inline-formula><mml:math id="M100" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 209</oasis:entry>
         <oasis:entry colname="col4">n. filters <inline-formula><mml:math id="M101" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 209, kernel size <inline-formula><mml:math id="M102" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (3,3)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">11</oasis:entry>
         <oasis:entry colname="col2">pooling 2d</oasis:entry>
         <oasis:entry colname="col3">6 <inline-formula><mml:math id="M103" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 5 <inline-formula><mml:math id="M104" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 209</oasis:entry>
         <oasis:entry colname="col4">pool size <inline-formula><mml:math id="M105" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (2,2), strides <inline-formula><mml:math id="M106" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (2,2)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">12</oasis:entry>
         <oasis:entry colname="col2">interpolation</oasis:entry>
         <oasis:entry colname="col3">11 <inline-formula><mml:math id="M107" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 9 <inline-formula><mml:math id="M108" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 209</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">13</oasis:entry>
         <oasis:entry colname="col2">conv. 2d</oasis:entry>
         <oasis:entry colname="col3">11 <inline-formula><mml:math id="M109" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 9 <inline-formula><mml:math id="M110" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 110</oasis:entry>
         <oasis:entry colname="col4">n. filters <inline-formula><mml:math id="M111" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 110, kernel size <inline-formula><mml:math id="M112" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (3,3)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">14</oasis:entry>
         <oasis:entry colname="col2">sum output of 13 and 9</oasis:entry>
         <oasis:entry colname="col3">11 <inline-formula><mml:math id="M113" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 9 <inline-formula><mml:math id="M114" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 110</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">15</oasis:entry>
         <oasis:entry colname="col2">interpolation</oasis:entry>
         <oasis:entry colname="col3">21 <inline-formula><mml:math id="M115" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 18 <inline-formula><mml:math id="M116" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 110</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">16</oasis:entry>
         <oasis:entry colname="col2">conv. 2d</oasis:entry>
         <oasis:entry colname="col3">21 <inline-formula><mml:math id="M117" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 18 <inline-formula><mml:math id="M118" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 58</oasis:entry>
         <oasis:entry colname="col4">n. filters <inline-formula><mml:math id="M119" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 58, kernel size <inline-formula><mml:math id="M120" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (3,3)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">17</oasis:entry>
         <oasis:entry colname="col2">sum output of 16 and 7</oasis:entry>
         <oasis:entry colname="col3">21 <inline-formula><mml:math id="M121" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 18 <inline-formula><mml:math id="M122" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 58</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">18</oasis:entry>
         <oasis:entry colname="col2">interpolation</oasis:entry>
         <oasis:entry colname="col3">42 <inline-formula><mml:math id="M123" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 36 <inline-formula><mml:math id="M124" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 58</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">19</oasis:entry>
         <oasis:entry colname="col2">conv. 2d</oasis:entry>
         <oasis:entry colname="col3">42 <inline-formula><mml:math id="M125" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 36 <inline-formula><mml:math id="M126" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 30</oasis:entry>
         <oasis:entry colname="col4">n. filters <inline-formula><mml:math id="M127" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 30, kernel size <inline-formula><mml:math id="M128" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (3,3)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">20</oasis:entry>
         <oasis:entry colname="col2">sum output of 19 and 5</oasis:entry>
         <oasis:entry colname="col3">42 <inline-formula><mml:math id="M129" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 36 <inline-formula><mml:math id="M130" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 30</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">21</oasis:entry>
         <oasis:entry colname="col2">interpolation</oasis:entry>
         <oasis:entry colname="col3">84 <inline-formula><mml:math id="M131" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 72 <inline-formula><mml:math id="M132" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 30</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">22</oasis:entry>
         <oasis:entry colname="col2">conv. 2d</oasis:entry>
         <oasis:entry colname="col3">84 <inline-formula><mml:math id="M133" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 72 <inline-formula><mml:math id="M134" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 16</oasis:entry>
         <oasis:entry colname="col4">n. filters <inline-formula><mml:math id="M135" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 16, kernel size <inline-formula><mml:math id="M136" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (3,3)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">23</oasis:entry>
         <oasis:entry colname="col2">sum output of 22 and 3</oasis:entry>
         <oasis:entry colname="col3">84 <inline-formula><mml:math id="M137" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 72 <inline-formula><mml:math id="M138" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 16</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">24</oasis:entry>
         <oasis:entry colname="col2">interpolation</oasis:entry>
         <oasis:entry colname="col3">168 <inline-formula><mml:math id="M139" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 144 <inline-formula><mml:math id="M140" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 16</oasis:entry>
         <oasis:entry colname="col4"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">25</oasis:entry>
         <oasis:entry colname="col2">conv. 2d</oasis:entry>
         <oasis:entry colname="col3">168 <inline-formula><mml:math id="M141" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 144 <inline-formula><mml:math id="M142" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 2</oasis:entry>
         <oasis:entry colname="col4">n. filters <inline-formula><mml:math id="M143" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 2, kernel size <inline-formula><mml:math id="M144" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (3,3)</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <?xmltex \floatpos{t}?><fig id="Ch1.F1"><?xmltex \currentcnt{1}?><?xmltex \def\figurename{Figure}?><label>Figure 1</label><caption><p id="d1e2800">General structure of DINCAE with 2D convolution (conv), max pooling (pool) and interpolation layers (interp). All 2D convolutions are followed by a ReLU activation function.</p></caption>
        <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f01.png"/>

      </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F2"><?xmltex \currentcnt{2}?><?xmltex \def\figurename{Figure}?><label>Figure 2</label><caption><p id="d1e2811">DINCAE with a refinement step, consisting essentially of two sequential autoencoders coupled such that the second autoencoder uses the output of the first and the input data.</p></caption>
        <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f02.png"/>

      </fig>

      <p id="d1e2820">The altimetry data were analyzed on a 0.25<inline-formula><mml:math id="M145" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> resolution grid covering the area from 7<inline-formula><mml:math id="M146" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> W to 37<inline-formula><mml:math id="M147" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> E and 30 to 46<inline-formula><mml:math id="M148" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> N. For this neural network implementation, the input size is 177 <inline-formula><mml:math id="M149" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 69. The resolution is progressively reduced to 89 <inline-formula><mml:math id="M150" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 35, 45 <inline-formula><mml:math id="M151" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 18 and 23 <inline-formula><mml:math id="M152" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 9 by convolutional layers followed by max pooling layers with 32, 64 and 96 convolutional filters, respectively. Skip connections are implemented from the second convolutional layer onwards.</p>
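      <p>The sequence of grid sizes quoted above is consistent with halving each dimension and rounding up at every pooling step. The following Python sketch reproduces the sequence under that assumption; the function names are hypothetical and are not taken from the DINCAE source code.</p>
      <preformat><![CDATA[
```python
def pooled_size(n, factor=2):
    """Size of one dimension after a pooling step, rounded up.

    Rounding up (i.e. padding odd sizes) is an assumption that matches
    the sizes quoted in the text.
    """
    return -(-n // factor)  # ceiling division using integers only


def encoder_sizes(shape, n_levels):
    """List of grid shapes from the input down to the coarsest level."""
    sizes = [shape]
    for _ in range(n_levels):
        shape = tuple(pooled_size(n) for n in shape)
        sizes.append(shape)
    return sizes


sizes = encoder_sizes((177, 69), 3)
print(sizes)  # [(177, 69), (89, 35), (45, 18), (23, 9)]
```
]]></preformat>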
      <p id="d1e2888">During training, Gaussian noise with a standard deviation <inline-formula><mml:math id="M153" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">pos</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is added to the position of the measurements, and every track of the current date has a certain probability <inline-formula><mml:math id="M154" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="normal">drop</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> of being withheld from the input of the neural network. The loss function is computed only on tracks from the current date. This helps the neural network to learn how to spread the information spatially. The neural network is optimized for 1000 epochs and the intermediate results are saved every 25 epochs.</p>
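      <p>The data augmentation described above can be illustrated with a minimal Python sketch. The data layout (a dictionary of tracks holding longitude, latitude and sea-level anomaly tuples), the function name and the numerical values are assumptions made for illustration only. The sketch also assumes that the loss targets keep their unperturbed positions and covers only the tracks of the current date; it is not the DINCAE implementation.</p>
      <preformat><![CDATA[
```python
import random


def augment_current_tracks(tracks, sigma_pos, p_drop, rng):
    """Build network inputs and loss targets from the tracks of the current date.

    tracks: dict mapping a track identifier to a list of (lon, lat, sla) tuples.
    sigma_pos: standard deviation of the position noise (assumed to be in the
        same unit as lon and lat).
    p_drop: probability that a whole track is withheld from the input.
    rng: instance of random.Random, passed in for reproducibility.
    """
    inputs = []
    for points in tracks.values():
        if rng.random() < p_drop:
            continue  # the whole track is withheld from the input
        for lon, lat, sla in points:
            inputs.append((lon + rng.gauss(0.0, sigma_pos),
                           lat + rng.gauss(0.0, sigma_pos),
                           sla))
    # The loss is evaluated on all tracks of the current date, including the
    # withheld ones, at their unperturbed positions (an assumption).
    targets = [point for points in tracks.values() for point in points]
    return inputs, targets


# Hypothetical values, for illustration only
tracks = {"track_a": [(10.0, 35.0, 0.05), (10.1, 35.2, 0.04)],
          "track_b": [(20.0, 40.0, -0.02)]}
inputs, targets = augment_current_tracks(
    tracks, sigma_pos=0.1, p_drop=0.3, rng=random.Random(42))
```
]]></preformat>
      <p>Because withheld tracks still contribute to the loss, the network is penalized at locations for which it received no input on the current date, which is how it is encouraged to spread information spatially.</p>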
      <p id="d1e2914">The altimetry test case illustrates the results for a non-gridded dataset.
Sea surface altimetry is usually gridded with a method like optimal interpolation or variational analysis. The latter can also be seen as a special case of optimal interpolation. For the autoencoder, the following fields are used as inputs:</p>
      <p id="d1e2917"><list list-type="bullet">
          <list-item>

      <p id="d1e2922">longitude and latitude of the measurement;</p>
          </list-item>
          <list-item>

      <p id="d1e2928">sine and cosine of the day of the year of the measurement multiplied by <inline-formula><mml:math id="M155" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">π</mml:mi><mml:mo>/</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mi>y</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M156" display="inline"><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mi>y</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is 365.25 (the length of a year in days);</p>
          </list-item>
          <list-item>

      <p id="d1e2962">all data within a given centered time window of length <inline-formula><mml:math id="M157" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mi mathvariant="normal">win</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. For instance, if the time window length <inline-formula><mml:math id="M158" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mi mathvariant="normal">win</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is 9 d, the data from the current day are used as well as the tracks from the 4 previous days and the 4 following days.</p>
          </list-item>
        </list></p>
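      <p>The time-related inputs listed above can be sketched as follows. The function names are hypothetical, and the window function assumes an odd window length expressed in days.</p>
      <preformat><![CDATA[
```python
import math

T_YEAR = 365.25  # length of a year in days, as in the text


def seasonal_inputs(day_of_year):
    """Sine and cosine of the day of the year scaled by 2*pi/T_YEAR."""
    phase = 2.0 * math.pi * day_of_year / T_YEAR
    return math.sin(phase), math.cos(phase)


def window_days(current_day, dt_win):
    """Days of a centered window of odd length dt_win around current_day."""
    half = (dt_win - 1) // 2
    return list(range(current_day - half, current_day + half + 1))


print(window_days(100, 9))  # [96, 97, 98, 99, 100, 101, 102, 103, 104]
```
]]></preformat>
      <p>Using both the sine and the cosine gives a continuous representation of the seasonal cycle, without a jump between the last day of one year and the first day of the next.</p>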
      <p id="d1e2993">As in <xref ref-type="bibr" rid="bib1.bibx5" id="text.33"/>, the observations are not used directly as input. Instead, the inputs are the observations divided by their respective error variance, together with the inverse of the error variance. Due to this scaling, missing data result in a zero input value, as they correspond to data points with an “infinitely” large error.</p>
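      <p>A minimal sketch of this scaling for a single data point is given below. Representing a missing value by <monospace>None</monospace> and the function name are assumptions made for illustration.</p>
      <preformat><![CDATA[
```python
def scaled_inputs(value, error_variance):
    """Return the pair (value / error_variance, 1 / error_variance).

    A missing value (None) is treated as an observation with an infinitely
    large error variance, so both inputs are zero.
    """
    if value is None:
        return 0.0, 0.0
    return value / error_variance, 1.0 / error_variance


print(scaled_inputs(0.5, 0.25))   # (2.0, 4.0)
print(scaled_inputs(None, 0.25))  # (0.0, 0.0)
```
]]></preformat>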
      <p id="d1e2999">The training is done using mini-batches of size <inline-formula><mml:math id="M159" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi mathvariant="normal">batch</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. The weights and biases in the neural network are updated using the gradient of the loss function evaluated on a batch of <inline-formula><mml:math id="M160" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi mathvariant="normal">batch</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> time instances (chosen at random). Evaluating the gradient using a subset of the training data chosen at random introduces some stochastic fluctuations, allowing the optimization procedure to escape local minima.</p>
      <p id="d1e3024">All numerical experiments used the Adam optimizer <xref ref-type="bibr" rid="bib1.bibx15" id="paren.34"/> with the standard parameters for the exponential decay rate of the first moment, <inline-formula><mml:math id="M161" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.9</mml:mn></mml:mrow></mml:math></inline-formula>, and of the second moment, <inline-formula><mml:math id="M162" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.999</mml:mn></mml:mrow></mml:math></inline-formula>, and a regularization parameter <inline-formula><mml:math id="M163" display="inline"><mml:mrow><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">8</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>.
The learning rate <inline-formula><mml:math id="M164" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is computed for every <inline-formula><mml:math id="M165" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula>-th epoch as follows:
          <disp-formula id="Ch1.Ex3"><mml:math id="M166" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mspace linebreak="nobreak" width="0.25em"/><mml:msup><mml:mn mathvariant="normal">2</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">γ</mml:mi><mml:mi mathvariant="normal">decay</mml:mi></mml:msub><mml:mi>n</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
        where <inline-formula><mml:math id="M167" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> is the initial learning rate and <inline-formula><mml:math id="M168" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">γ</mml:mi><mml:mi mathvariant="normal">decay</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> controls the exponential decay of the learning rate: every <inline-formula><mml:math id="M169" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:msub><mml:mi mathvariant="italic">γ</mml:mi><mml:mi mathvariant="normal">decay</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> epochs, the learning rate is halved. If <inline-formula><mml:math id="M170" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">γ</mml:mi><mml:mi mathvariant="normal">decay</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is zero, then the learning rate is kept constant.</p>
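      <p>The learning rate schedule can be written as a short Python function. The values of the initial learning rate and of the decay parameter used below are hypothetical and serve only to illustrate the halving behavior.</p>
      <preformat><![CDATA[
```python
def learning_rate(n, alpha0, gamma_decay):
    """Learning rate for the n-th epoch: alpha0 * 2**(-gamma_decay * n)."""
    return alpha0 * 2.0 ** (-gamma_decay * n)


alpha0 = 1e-3       # hypothetical initial learning rate
gamma_decay = 0.01  # hypothetical value: the rate is halved every 100 epochs

for n in (0, 100, 200):
    print(n, learning_rate(n, alpha0, gamma_decay))
```
]]></preformat>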
      <p id="d1e3179">The batch size includes 32 time instances (all hyper-parameters are determined via Bayesian optimization as described further on). The learning rate for the Adam optimizer is 0.00058. The L2 regularization on the weights has been set to a <inline-formula><mml:math id="M171" display="inline"><mml:mi mathvariant="italic">β</mml:mi></mml:math></inline-formula> value of <inline-formula><mml:math id="M172" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>.
The upsampling method can either be nearest neighbor or bilinear interpolation. In our tests, nearest neighbor provided the lowest RMS error relative to the development dataset. The absolute value of the gradients is clipped to 5 in order to stabilize the training. Satellite tracks from <inline-formula><mml:math id="M173" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mi mathvariant="normal">win</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">27</mml:mn></mml:mrow></mml:math></inline-formula> days (centered around the target time) are used to derive gridded altimetry data.</p>
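      <p>The gradient clipping mentioned above is read here as an element-wise operation (as opposed to clipping the norm of the gradient), which is an interpretation of the text. A minimal sketch with a hypothetical function name:</p>
      <preformat><![CDATA[
```python
def clip_gradient(gradient, limit=5.0):
    """Limit every gradient component to the interval [-limit, limit]."""
    return [max(-limit, min(limit, g)) for g in gradient]


print(clip_gradient([-12.0, 0.3, 7.5]))  # [-5.0, 0.3, 5.0]
```
]]></preformat>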
      <p id="d1e3221">The hyper-parameters of the neural network mentioned previously have been determined by Bayesian optimization <xref ref-type="bibr" rid="bib1.bibx20 bib1.bibx14 bib1.bibx30" id="paren.35"/> by minimizing the RMS error relative to the development dataset. The “expected improvement acquisition function” <xref ref-type="bibr" rid="bib1.bibx20" id="paren.36"/> is used to determine which sequence of parameter values is to be evaluated.
Bayesian optimization as implemented by the Python package <italic>scikit-optimize</italic> <xref ref-type="bibr" rid="bib1.bibx12" id="paren.37"/> was used in these tests.</p>
</sec>
<sec id="Ch1.S5">
  <label>5</label><title>Results and discussion</title>
<sec id="Ch1.S5.SS1">
  <label>5.1</label><title>Gridded data</title>
      <p id="d1e3252">The new type of skip connection was first tested with the AVHRR test case from the Ligurian Sea <xref ref-type="bibr" rid="bib1.bibx5" id="paren.38"/>. The previous best result (in terms of RMS error) was 0.3835 <inline-formula><mml:math id="M174" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C using the cat skip connection. With the new approach this RMS error is reduced to 0.3604 <inline-formula><mml:math id="M175" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C. The new type of skip connection makes the neural network more similar to residual networks, which have been shown to be highly successful for image recognition tasks and allow for training deep networks more easily <xref ref-type="bibr" rid="bib1.bibx11" id="paren.39"/>.</p>
      <p id="d1e3279">In Table <xref ref-type="table" rid="Ch1.T2"/> we present the test case for the Adriatic Sea with the cat and sum skip connections, with and without the refinement step, and with different auxiliary parameters for the multivariate reconstruction. Besides the RMS error, this table also includes the 10 % and 90 % percentiles of the absolute value of the difference between the reconstructed data and the cross-validation data to provide a typical range of the error. In general, using chlorophyll <inline-formula><mml:math id="M176" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> (or the wind fields) together with SST improved the results only marginally. As in the Ligurian Sea test case, using the sum skip connection instead of the cat skip connection gave more consistent improvements (in particular in conjunction with the additional refinement step). The following analysis uses the result with the lowest RMS error, namely the reconstruction with sum skip connections and refinement and with the considered variables (chlorophyll <inline-formula><mml:math id="M177" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula>, the wind speed, zonal wind component and meridional wind component) as auxiliary parameters.</p>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T2" specific-use="star"><?xmltex \currentcnt{2}?><label>Table 2</label><caption><p id="d1e3301">RMS errors (in <inline-formula><mml:math id="M178" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C) relative to the test dataset for different configurations (chlor_a: chlorophyll <inline-formula><mml:math id="M179" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula>, wind_speed: the wind speed, uwnd: zonal wind component, vwnd: meridional wind component). The two numbers in parentheses correspond to the 10 % and 90 % percentiles of the absolute value of the error.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Auxiliary parameters</oasis:entry>
         <oasis:entry colname="col2">cat skip connections</oasis:entry>
         <oasis:entry colname="col3">sum skip connections</oasis:entry>
         <oasis:entry colname="col4">sum skip connections and refinement</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">none</oasis:entry>
         <oasis:entry colname="col2">0.66 (0.06–1.02)</oasis:entry>
         <oasis:entry colname="col3">0.60 (0.05–0.93)</oasis:entry>
         <oasis:entry colname="col4">0.55 (0.04–0.84)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">chlor_a</oasis:entry>
         <oasis:entry colname="col2">0.64 (0.06–1.00)</oasis:entry>
         <oasis:entry colname="col3">0.59 (0.05–0.92)</oasis:entry>
         <oasis:entry colname="col4">0.54 (0.04–0.82)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">chlor_a, wind_speed</oasis:entry>
         <oasis:entry colname="col2">0.65 (0.06–1.00)</oasis:entry>
         <oasis:entry colname="col3">0.58 (0.05–0.90)</oasis:entry>
         <oasis:entry colname="col4">0.54 (0.04–0.82)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">chlor_a, wind_speed, uwnd, vwnd</oasis:entry>
         <oasis:entry colname="col2">0.66 (0.06–1.03)</oasis:entry>
         <oasis:entry colname="col3">0.57 (0.05–0.88)</oasis:entry>
         <oasis:entry colname="col4">0.54 (0.05–0.82)</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d1e3416">When reconstructing sea surface temperature time series, it is often the case that for some days only very few data points are available. Figure <xref ref-type="fig" rid="Ch1.F3"/> illustrates such a case, where a quite clear image was almost entirely masked as missing. DINCAE essentially produces an average SST (still using previous and next time frames) for the time of the year and a realistic spatial distribution of the expected error. The few pixels that are available have a relatively low error as expected, and the overall error structure looks quite realistic as the expected error increases significantly in the coastal areas (where the variability is higher and where the original satellite data are expected to be noisier). The reconstructed image matches the large-scale patterns of the original image relatively well, but as expected some small-scale structures are not reconstructed by the neural network. For this particular image, the RMS error (between the reconstructed data and the withheld data for cross-validation) is 0.43 <inline-formula><mml:math id="M180" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C and the 10 % and 90 % percentiles of the absolute value of the error are 0.05 and 0.63 <inline-formula><mml:math id="M181" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C, respectively.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F3"><?xmltex \currentcnt{3}?><?xmltex \def\figurename{Figure}?><label>Figure 3</label><caption><p id="d1e3441"><bold>(a)</bold> The original MODIS SST; <bold>(b)</bold> MODIS SST with additional clouds for cross-validation; <bold>(c)</bold> the DINCAE reconstruction using the data from panel <bold>(b)</bold>; <bold>(d)</bold> the expected error standard deviation of the DINCAE reconstruction. All panel values are in <inline-formula><mml:math id="M182" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f03.png"/>

        </fig>

      <p id="d1e3473">The detection of cloud pixels in the MODIS dataset is generally good, but Fig. <xref ref-type="fig" rid="Ch1.F4"/> (for 17 September 2003) shows an example where some pixels were characterized as valid sea points while they are probably (at least partially) obscured by clouds, resulting in an unrealistically low sea surface temperature. For most analysis techniques derived from optimal interpolation, outliers like undetected clouds typically degrade the analyzed field in the vicinity of the spurious observations. The outlier also produced an artifact in the output of DINCAE, but it is interesting to note that in this case, the artifact did not spread spatially and the associated expected error has some elevated values indicating a potential issue at the location.
The RMS error for this time instance is 0.45 <inline-formula><mml:math id="M183" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C, similar to the previously shown image. The typical range of the absolute value of the error is 0.04–0.72 <inline-formula><mml:math id="M184" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C (10 % and 90 % percentiles).</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F4"><?xmltex \currentcnt{4}?><?xmltex \def\figurename{Figure}?><label>Figure 4</label><caption><p id="d1e3498"><bold>(a)</bold> The original MODIS SST; <bold>(b)</bold> MODIS SST with additional clouds for cross-validation; <bold>(c)</bold> the DINCAE reconstruction using the data from panel <bold>(b)</bold>; <bold>(d)</bold>  the expected error standard deviation of the DINCAE reconstruction. All panel values are in <inline-formula><mml:math id="M185" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f04.png"/>

        </fig>

      <p id="d1e3530">A problem with techniques like optimal interpolation, variational analysis and to some degree also DINEOF is that the reconstruction smooths out some small-scale features present in the initial data. For optimal interpolation and variational analysis, this smoothing is explicitly induced by using a specific correlation length. In EOF-based methods, this is related to the truncation of the EOF series. In DINCAE, the input data are also compressed by a series of convolution and max pooling layers, and some smoothing is also expected, as in Fig. <xref ref-type="fig" rid="Ch1.F4"/>. Figure <xref ref-type="fig" rid="Ch1.F5"/> shows an example where the initial data have almost no clouds and only a few clouds are added for validation. The reconstructed field retains the filament and other small-scale structures.
For this image, the RMS error (between the reconstructed data and the withheld data for cross-validation) is 0.55 <inline-formula><mml:math id="M186" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C and its typical range (as defined earlier) is 0.05 to 0.77 <inline-formula><mml:math id="M187" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C. The degree of smoothing can be quantified by the RMS difference of the reconstructed data and the input data (which is not an independent validation metric). This RMS difference is 0.15 <inline-formula><mml:math id="M188" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C, which is relatively low given the typical variability in sea surface temperature in this region.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F5"><?xmltex \currentcnt{5}?><?xmltex \def\figurename{Figure}?><label>Figure 5</label><caption><p id="d1e3567"><bold>(a)</bold> The original MODIS SST; <bold>(b)</bold> MODIS SST with additional clouds for cross-validation; <bold>(c)</bold> the DINCAE reconstruction using the data from panel <bold>(b)</bold>; <bold>(d)</bold>  the expected error standard deviation of the DINCAE reconstruction. All panel values are in <inline-formula><mml:math id="M189" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f05.png"/>

        </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F6"><?xmltex \currentcnt{6}?><?xmltex \def\figurename{Figure}?><label>Figure 6</label><caption><p id="d1e3601">Mean square error skill score of the multivariate case (considering all variables) with an additional refinement step, relative to the monovariate reconstruction corresponding to DINCAE 1.0.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f06.png"/>

        </fig>

      <p id="d1e3610">To assess the improvement spatially, the mean square error skill score <inline-formula><mml:math id="M190" display="inline"><mml:mi>S</mml:mi></mml:math></inline-formula> is computed for every grid cell (Fig. <xref ref-type="fig" rid="Ch1.F6"/>):
            <disp-formula id="Ch1.E8" content-type="numbered"><label>8</label><mml:math id="M191" display="block"><mml:mrow><mml:mi>S</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi mathvariant="normal">MSE</mml:mi><mml:mrow><mml:msub><mml:mi mathvariant="normal">MSE</mml:mi><mml:mi mathvariant="normal">ref</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M192" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="normal">MSE</mml:mi><mml:mi mathvariant="normal">ref</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the mean square error of the monovariate reconstruction corresponding to DINCAE 1.0 relative to the validation dataset (per grid cell and averaging over time) and <inline-formula><mml:math id="M193" display="inline"><mml:mi mathvariant="normal">MSE</mml:mi></mml:math></inline-formula> is the mean square error of the multivariate case (considering all variables) and with an additional refinement step.
The improvement is spatially quite consistent. The mean square error is mostly reduced in the northern and central parts of the Adriatic. Only on some isolated grid cells is a degradation observed. The skill score reflects the combined improvement due to the three changes implemented in this version: updated skip connections, refinement step and multivariate reconstruction.</p>
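      <p>As an illustration of Eq. (8), the following Python sketch computes the skill score for a single grid cell. The temperature values are made up for the example, and representing missing validation data by <monospace>None</monospace> is an assumption. A positive score means that the new reconstruction has a lower mean square error than the reference.</p>
      <preformat><![CDATA[
```python
def mean_square_error(reconstruction, validation):
    """MSE over time for one grid cell, skipping times without validation data."""
    squared = [(r - v) ** 2
               for r, v in zip(reconstruction, validation)
               if v is not None]
    return sum(squared) / len(squared)


def skill_score(mse, mse_ref):
    """Mean square error skill score S = 1 - MSE / MSE_ref."""
    return 1.0 - mse / mse_ref


# Made-up time series (in degrees Celsius) for a single grid cell
validation = [20.0, 21.0, None, 19.0]
reference = [21.0, 20.0, 20.0, 20.0]      # e.g. monovariate reconstruction
multivariate = [20.5, 20.5, 20.0, 19.5]   # e.g. multivariate with refinement

mse_ref = mean_square_error(reference, validation)
mse = mean_square_error(multivariate, validation)
score = skill_score(mse, mse_ref)
print(score)  # 0.75
```
]]></preformat>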

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T3" specific-use="star"><?xmltex \currentcnt{3}?><label>Table 3</label><caption><p id="d1e3669">Maximum standard deviation in three selected areas.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Obs. SD</oasis:entry>
         <oasis:entry colname="col3">DIVAnd SD</oasis:entry>
         <oasis:entry colname="col4">DINCAE SD</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">East Alboran Gyre</oasis:entry>
         <oasis:entry colname="col2">0.136</oasis:entry>
         <oasis:entry colname="col3">0.123</oasis:entry>
         <oasis:entry colname="col4">0.141</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Regions of the Alboran current</oasis:entry>
         <oasis:entry colname="col2">0.125</oasis:entry>
         <oasis:entry colname="col3">0.112</oasis:entry>
         <oasis:entry colname="col4">0.121</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Ierapetra Anticyclone</oasis:entry>
         <oasis:entry colname="col2">0.153</oasis:entry>
         <oasis:entry colname="col3">0.138</oasis:entry>
         <oasis:entry colname="col4">0.151</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <?xmltex \floatpos{p}?><fig id="Ch1.F7" specific-use="star"><?xmltex \currentcnt{7}?><?xmltex \def\figurename{Figure}?><label>Figure 7</label><caption><p id="d1e3754"><bold>(a)</bold> SLA reconstructed by DIVAnd, <bold>(b)</bold> expected error standard deviation by DIVAnd (with adjustment), <bold>(c)</bold> data used during training (partial) and <bold>(d)</bold> independent data for validation withheld during analysis. All panel values are in meters.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f07.png"/>

        </fig>

      <?xmltex \floatpos{p}?><fig id="Ch1.F8" specific-use="star"><?xmltex \currentcnt{8}?><?xmltex \def\figurename{Figure}?><label>Figure 8</label><caption><p id="d1e3777"><bold>(a)</bold> SLA reconstructed by DINCAE, <bold>(b)</bold> expected error standard deviation by DINCAE, <bold>(c)</bold> data used during training (partial) and <bold>(d)</bold> independent data for validation withheld during analysis. All panel values are in meters.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f08.png"/>

        </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F9" specific-use="star"><?xmltex \currentcnt{9}?><?xmltex \def\figurename{Figure}?><label>Figure 9</label><caption><p id="d1e3800"><bold>(a)</bold> Altimetry observation from the test data versus the reconstructed values from DIVAnd (using only the training data). <bold>(b)</bold> Expected standard deviation of the reconstruction (before and after adjustment) relative to the actual standard deviation of the reconstructed misfit.  The <inline-formula><mml:math id="M194" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M195" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula> axes of both plots are expressed in meters.</p></caption>
          <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f09.png"/>

        </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F10" specific-use="star"><?xmltex \currentcnt{10}?><?xmltex \def\figurename{Figure}?><label>Figure 10</label><caption><p id="d1e3830"><bold>(a)</bold> Altimetry observation from the test data versus the reconstructed values from DINCAE with SST as auxiliary parameter (using only the training data). <bold>(b)</bold> Expected standard deviation of the reconstruction (before and after adjustment) relative to the actual standard deviation of the reconstructed misfit. The <inline-formula><mml:math id="M196" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M197" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula> axes of both plots are expressed in meters.</p></caption>
          <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f10.png"/>

        </fig>

</sec>
<sec id="Ch1.S5.SS2">
  <label>5.2</label><title>Non-gridded data</title>
      <p id="d1e3866">The altimetry data are first gridded with the tool DIVAnd <xref ref-type="bibr" rid="bib1.bibx4" id="paren.40"/>. The main parameters here are the spatial correlation length (in km), the temporal correlation scale (in days), the error variance of the observations (normalized by the background error variance) and the duration of the time window <inline-formula><mml:math id="M198" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mi mathvariant="normal">win</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> determining which observations are used to compute the reconstruction at the center of the time window.</p>
      <p id="d1e3885">All parameters of DIVAnd are also optimized using Bayesian optimization with the expected improvement as the acquisition function, minimizing the RMS error relative to the development dataset.</p>
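      <p>As a rough illustration of the expected-improvement criterion mentioned above, the following Python sketch evaluates it for a Gaussian surrogate prediction when a quantity such as the RMS error is minimized. The function name and its arguments are hypothetical and chosen for illustration only; this is not the implementation used to obtain the results of this paper.</p>
      <preformat><![CDATA[
```python
import math


def expected_improvement(mu, sigma, best):
    """Expected improvement for a minimization problem (illustrative sketch).

    mu, sigma: mean and standard deviation predicted by a surrogate model
               for a candidate parameter set (assumed Gaussian)
    best:      lowest value of the objective observed so far
    """
    if sigma <= 0.0:
        # Without predictive uncertainty, the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # Closed-form expectation of max(best - f, 0) for f ~ N(mu, sigma^2)
    return (best - mu) * cdf + sigma * pdf
```
]]></preformat>
      <p>In such a scheme, the candidate with the largest expected improvement would be evaluated next, which balances candidates with a low predicted error against candidates with a large predictive uncertainty.</p>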

      <?xmltex \floatpos{t}?><fig id="Ch1.F11"><?xmltex \currentcnt{11}?><?xmltex \def\figurename{Figure}?><label>Figure 11</label><caption><p id="d1e3890">Standard deviation of the sea-level anomaly for the DIVAnd method and DINCAE (including SST as auxiliary parameter).</p></caption>
          <?xmltex \igopts{width=227.622047pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/15/2183/2022/gmd-15-2183-2022-f11.png"/>

        </fig>

      <p id="d1e3900">The best DIVAnd result is obtained with a horizontal correlation length of 74.8 km, a temporal correlation length of 5.5 d, a time window of 13 d and a normalized error variance of the observations of 20.5. An example reconstruction for the date 7 June 2017 is illustrated in Fig. <xref ref-type="fig" rid="Ch1.F7"/>. The parameters are determined by Bayesian optimization, minimizing the error relative to the development dataset.
The RMS error of the analysis for these parameters is 3.61 cm relative to the independent test dataset (Fig. <xref ref-type="fig" rid="Ch1.F9"/>).</p>
      <p id="d1e3907">The best-performing neural network had an RMS error of 3.58 cm, which is only slightly better than the result of DIVAnd (3.61 cm). When using the Mediterranean sea surface temperature as a co-variable, we obtained an RMS error relative to the test dataset of 3.47 cm, resulting in a clearer advantage of the neural network approach.
The left panels of Figs. <xref ref-type="fig" rid="Ch1.F9"/> and <xref ref-type="fig" rid="Ch1.F10"/> show on the <inline-formula><mml:math id="M199" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M200" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula> axes the observed altimetry (withheld during the analysis) and the corresponding reconstructed altimetry, respectively. The range of altimetry values from  <inline-formula><mml:math id="M201" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>20 to 30 cm was divided into 51 bins of 1 cm. The colors indicate the number of data points within each bin. For both reconstruction methods, the results scatter around the dashed line, which corresponds to the case where the reconstructed data correspond exactly to the observed altimetry. The eddy field of the DINCAE dataset is also quite similar to the one obtained from DIVAnd, but the anomalies of the structure are more pronounced in the DINCAE reconstruction (Fig. <xref ref-type="fig" rid="Ch1.F8"/>).</p>
      <p id="d1e3938">DINCAE and DIVAnd provide a field with the estimated expected error.
For DIVAnd we used the “clever poor man's method” as described in <xref ref-type="bibr" rid="bib1.bibx6" id="text.41"/>, and the background error variance is estimated by fitting an empirical covariance based on random pairs of points binned by their distance <xref ref-type="bibr" rid="bib1.bibx32 bib1.bibx33" id="paren.42"/>. The estimated error variance is later adjusted by a factor to account for uncertainties in estimating the background error variance.</p>
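      <p>The binning of random pairs of points by their distance can be sketched in Python as follows. This is a simplified illustration under stated assumptions (Cartesian coordinates, anomalies with the mean already removed, hypothetical function and parameter names); the subsequent fit of an analytical covariance model to the bin averages is not shown, and the code is not the DIVAnd implementation.</p>
      <preformat><![CDATA[
```python
import math
import random


def empirical_covariance(x, y, anomaly, npairs, bin_width, nbins, seed=0):
    """Average product of anomalies over random pairs, binned by distance.

    x, y:      coordinates of the observations (assumed Cartesian)
    anomaly:   observed values with the mean removed
    npairs:    number of random pairs to draw
    bin_width: width of a distance bin (same unit as x and y)
    Returns one value per distance bin (None for empty bins).
    """
    rng = random.Random(seed)
    sums = [0.0] * nbins
    counts = [0] * nbins
    n = len(anomaly)
    for _ in range(npairs):
        i, j = rng.randrange(n), rng.randrange(n)
        distance = math.hypot(x[i] - x[j], y[i] - y[j])
        k = int(distance / bin_width)
        if k < nbins:
            sums[k] += anomaly[i] * anomaly[j]
            counts[k] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```
]]></preformat>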
      <p id="d1e3947">We made 10 categories of pixels based on the expected standard deviation error, evenly distributed between the 10 % and 90 % percentiles of the expected standard deviation error. For every category, we computed the actual RMS relative to the test dataset. Ideally this should correspond to the estimated expected error of the reconstruction (including the observational error). A global adjustment factor is also applied so that the average RMS error matches the mean expected error standard deviation, which is represented in the right panels of Figs. <xref ref-type="fig" rid="Ch1.F9"/> and <xref ref-type="fig" rid="Ch1.F10"/>. This adjustment factor is also applied to Fig. <xref ref-type="fig" rid="Ch1.F7"/>b.
The main advantage of DINCAE relative to DIVAnd is the improved estimate of the error variance of the results.</p>
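      <p>The validation of the expected error described above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the adjustment factor is the ratio of the overall RMS misfit to the mean expected standard deviation, which is one possible reading of the procedure; it is an illustration rather than the code used for the figures.</p>
      <preformat><![CDATA[
```python
import math


def percentile(sorted_vals, q):
    """Percentile (q in [0, 100]) of a sorted list, linear interpolation."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def rms(values):
    """Root mean square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def reliability_bins(expected_std, misfit, nbins=10):
    """Pairs (mean expected std, actual RMS misfit), one per non-empty category.

    Categories are evenly spaced between the 10 % and 90 % percentiles
    of the expected standard deviation.
    """
    s = sorted(expected_std)
    lo, hi = percentile(s, 10), percentile(s, 90)
    width = (hi - lo) / nbins
    groups = [[] for _ in range(nbins)]
    for e, m in zip(expected_std, misfit):
        if lo <= e <= hi and width > 0:
            k = min(int((e - lo) / width), nbins - 1)
            groups[k].append((e, m))
    return [
        (sum(e for e, _ in g) / len(g), rms([m for _, m in g]))
        for g in groups if g
    ]


def adjustment_factor(expected_std, misfit):
    """Global factor scaling the mean expected std to the overall RMS misfit."""
    return rms(misfit) / (sum(expected_std) / len(expected_std))
```
]]></preformat>
      <p>For a reliable error estimate, the two values of every pair returned by the hypothetical <italic>reliability_bins</italic> function would be close to each other once the expected standard deviation has been multiplied by the adjustment factor.</p>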
      <p id="d1e3956">In summary, the accuracy of the DINCAE reconstruction is slightly better than the accuracy of the DIVAnd analysis. However, the main improvement of the DINCAE approach here is that the expected error variance of the analysis is much more reliable than the expected error variance of DIVAnd.</p>
      <p id="d1e3959">Figure <xref ref-type="fig" rid="Ch1.F11"/> shows the standard deviation of the sea-level anomaly computed over the whole time period. From this figure, three areas in particular stand out corresponding (from west to east) to the East Alboran Gyre, regions of the Algerian Current and the Ierapetra Anticyclone (annotated with the black rectangle in Fig. <xref ref-type="fig" rid="Ch1.F11"/>). The maximum standard deviation (related to the surface transport variability) for these three areas is shown in Table <xref ref-type="table" rid="Ch1.T3"/>. The standard deviation is also computed from the satellite altimetry data considering all satellite observations falling within a given grid cell (excluding coastal grid cells with fewer than 10 observations). The standard deviation of the DINCAE reconstruction is in all three regions higher than the standard deviation for DIVAnd despite the DINCAE reconstruction having a lower RMS error than DIVAnd. In addition, the standard deviation of DINCAE is in general closer to the observed standard deviation.</p>
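      <p>The computation of the observed standard deviation per grid cell can be illustrated by the following Python sketch. The function name, the regular grid defined by an origin and a resolution, and the use of the population standard deviation are assumptions made for this example only.</p>
      <preformat><![CDATA[
```python
import math
from collections import defaultdict


def cell_std(lon, lat, sla, lon0, lat0, dlon, dlat, min_obs=10):
    """Standard deviation of along-track values falling in each grid cell.

    lon, lat, sla: positions and sea-level anomalies of the observations
    lon0, lat0:    south-west corner of the (assumed regular) grid
    dlon, dlat:    grid resolution
    Cells with fewer than min_obs observations are left out.
    """
    cells = defaultdict(list)
    for x, y, v in zip(lon, lat, sla):
        i = math.floor((x - lon0) / dlon)
        j = math.floor((y - lat0) / dlat)
        cells[(i, j)].append(v)

    result = {}
    for key, vals in cells.items():
        if len(vals) < min_obs:
            continue  # too few observations for a meaningful estimate
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        result[key] = math.sqrt(var)
    return result
```
]]></preformat>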
</sec>
</sec>
<sec id="Ch1.S6">
  <label>6</label><title>Conclusions</title>
      <p id="d1e3977">In this paper, we discussed improvements of the previously described DINCAE method. The code has been extended to handle multivariate reconstructions, which were also described in <xref ref-type="bibr" rid="bib1.bibx10" id="text.43"/>. We also found that using multiple variables can improve the reconstruction, but the largest improvement was obtained by changing the structure of the neural network by using a newly implemented type of skip connection and a refinement pass. Interestingly, this type bears some similarities to the hierarchical multigrid method for solving partial differential equations.
The handling of different types of satellite data was also improved. While most ocean satellite observations are gridded data (like sea surface temperature, ocean color, sea surface salinity and sea ice concentration), some parameters can only be inferred by nadir-looking satellites along tracks. For such non-gridded datasets, the first input layer is extended to handle input data at arbitrary locations. We were able to show that the DINCAE method applied to altimetry data produces better reconstructions, but the main advantage is the significantly improved estimate of the expected reconstruction error variance. In this case, the DINCAE method compares favorably to the DIVAnd method (which is similar to optimal interpolation) in terms of reliability of the expected error variance, accuracy of the reconstruction relative to the test dataset and realism of the temporal standard deviation of the reconstruction assessed from the standard deviation of the observations.</p>
</sec>

      
      </body>
    <back><notes notes-type="codedataavailability"><title>Code and data availability</title>

      <p id="d1e3987">The source code is released as open source under the terms of the GNU General Public License v3 (or, at your option, any later version) and is available at the address <uri>https://github.com/gher-ulg/DINCAE.jl</uri> (last access: 9 March 2022) (<ext-link xlink:href="https://doi.org/10.5281/zenodo.6342276" ext-link-type="DOI">10.5281/zenodo.6342276</ext-link>, <xref ref-type="bibr" rid="bib1.bibx3" id="altparen.44"/>). The sea surface temperature (MODIS Terra Level 3 SST Thermal IR Daily 4km Nighttime v2014.0, <ext-link xlink:href="https://doi.org/10.5067/MODST-1D4N4" ext-link-type="DOI">10.5067/MODST-1D4N4</ext-link>, <xref ref-type="bibr" rid="bib1.bibx24" id="altparen.45"/>) is available via PO.DAAC (<uri>https://podaac.jpl.nasa.gov/</uri>, last access: 10 February 2022, JPL, NASA, USA), and wind speed (Cross-Calibrated Multi-Platform, CCMP, gridded surface vector winds) is available from Remote Sensing Systems (<uri>http://www.remss.com/measurements/ccmp/</uri>, last access: 19 July 2019).
Chlorophyll <inline-formula><mml:math id="M202" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> from Ocean Biology Processing Group, NASA, can be accessed at
<uri>https://oceancolor.gsfc.nasa.gov/data/overview/</uri> (last access: 18 September 2019).
Altimetry data (dataset SEALEVEL_EUR_PHY_L3_MY_008_061) are made available by the <xref ref-type="bibr" rid="bib1.bibx8" id="text.46"/> (<ext-link xlink:href="https://doi.org/10.48670/moi-00139" ext-link-type="DOI">10.48670/moi-00139</ext-link>).
The L4 gridded SST over the Mediterranean is the
NOAA Optimum Interpolation <inline-formula><mml:math id="M203" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:math></inline-formula> Degree Daily Sea Surface Temperature (OISST) Analysis, Version 2 available at
<uri>https://www.ncei.noaa.gov/metadata/geoportal/rest/metadata/item/gov.noaa.ncdc:C00844/html</uri> (last access: 15 July 2020) (<ext-link xlink:href="https://doi.org/10.7289/V5SQ8XB5" ext-link-type="DOI">10.7289/V5SQ8XB5</ext-link>, <xref ref-type="bibr" rid="bib1.bibx28" id="altparen.47"/>).</p>
  </notes><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d1e4053">AB designed and implemented the neural network. AB, AAA, CT and JMB contributed to the planning and discussions and to the writing of the manuscript.</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d1e4059">The contact author has declared that neither they nor their co-authors have any competing interests.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d1e4065">Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.</p>
  </notes><ack><title>Acknowledgements</title><p id="d1e4071">The F.R.S.-FNRS (Fonds de la Recherche Scientifique de Belgique) is acknowledged for funding the position of Alexander Barth. This research was partly performed with funding from the Belgian Science Policy Office (BELSPO) STEREO III program in the framework of the MULTI-SYNC project (contract SR/00/359).
Computational resources have been provided in part by the Consortium des Équipements de Calcul Intensif (CÉCI), funded by the F.R.S.-FNRS under Grant No. 2.5020.11 and by the Walloon Region.
The authors also wish to thank the Julia community and in particular Deniz Yuret from Koç University (Istanbul, Turkey) for the <italic>Knet.jl</italic> package and Tim Besard (Julia Computing, Massachusetts, United States) for the <italic>CUDA.jl</italic> package, as well as the developers of the Python library <italic>scikit-optimize</italic> (<uri>https://github.com/scikit-optimize/scikit-optimize/graphs/contributors</uri>, last access: 10 February 2022).
We thank the reviewers for their careful reading of the manuscript and their constructive and insightful comments.</p></ack><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d1e4088">This research has been supported by the Fonds De La Recherche Scientifique – FNRS (grant no. 4768341) and by the Belgian Science Policy Office (BELSPO) STEREO III program in the framework of the MULTI-SYNC project (contract SR/00/359).</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d1e4094">This paper was edited by Le Yu and reviewed by two anonymous referees.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><?xmltex \def\ref@label{{Alvera-Azc\'{a}rate et~al.(2007)Alvera-Azc\'{a}rate, Barth, Beckers,
and Weisberg}}?><label>Alvera-Azcárate et al.(2007)Alvera-Azcárate, Barth, Beckers,
and Weisberg</label><?label Alvera06d?><mixed-citation>Alvera-Azcárate, A., Barth, A., Beckers, J.-M., and Weisberg, R. H.:
Multivariate reconstruction of missing data in sea surface temperature,
chlorophyll and wind satellite field, J. Geophys. Res., 112,
C03008, <ext-link xlink:href="https://doi.org/10.1029/2006JC003660" ext-link-type="DOI">10.1029/2006JC003660</ext-link>,
2007.</mixed-citation></ref>
      <ref id="bib1.bibx2"><?xmltex \def\ref@label{{Atlas et~al.(2011)Atlas, Hoffman, Ardizzone, Leidner, Jusem, Smith,
and Gombos}}?><label>Atlas et al.(2011)Atlas, Hoffman, Ardizzone, Leidner, Jusem, Smith,
and Gombos</label><?label Atlas11?><mixed-citation>Atlas, R., Hoffman, R. N., Ardizzone, J., Leidner, S. M., Jusem, J. C., Smith,
D. K., and Gombos, D.: A cross-calibrated, multiplatform ocean surface wind
velocity product for meteorological and oceanographic applications, B. Am. Meteorol. Soc., 92, 157–174,
<ext-link xlink:href="https://doi.org/10.1175/2010BAMS2946.1" ext-link-type="DOI">10.1175/2010BAMS2946.1</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx3"><?xmltex \def\ref@label{Barth(2022)}?><label>Barth(2022)</label><?label barth?><mixed-citation>Barth, A.: gher-ulg/DINCAE.jl: v2.0.0 (v2.0.0), Zenodo [code], <ext-link xlink:href="https://doi.org/10.5281/zenodo.6342276" ext-link-type="DOI">10.5281/zenodo.6342276</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx4"><?xmltex \def\ref@label{{Barth et~al.(2014)Barth, Beckers, Troupin, Alvera-Azc\'{a}rate, and
Vandenbulcke}}?><label>Barth et al.(2014)Barth, Beckers, Troupin, Alvera-Azcárate, and
Vandenbulcke</label><?label Barth14divand?><mixed-citation>Barth, A., Beckers, J.-M., Troupin, C., Alvera-Azcárate, A., and Vandenbulcke, L.: divand-1.0: <inline-formula><mml:math id="M204" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula>-dimensional variational data analysis for ocean observations, Geosci. Model Dev., 7, 225–241, <ext-link xlink:href="https://doi.org/10.5194/gmd-7-225-2014" ext-link-type="DOI">10.5194/gmd-7-225-2014</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx5"><?xmltex \def\ref@label{{Barth et~al.(2020)Barth, Alvera-Azc\'{a}rate, Licer, and
Beckers}}?><label>Barth et al.(2020)Barth, Alvera-Azcárate, Licer, and
Beckers</label><?label Barth2020?><mixed-citation>Barth, A., Alvera-Azcárate, A., Licer, M., and Beckers, J.-M.: DINCAE 1.0: a convolutional neural network with error estimates to reconstruct sea surface temperature satellite observations, Geosci. Model Dev., 13, 1609–1622, <ext-link xlink:href="https://doi.org/10.5194/gmd-13-1609-2020" ext-link-type="DOI">10.5194/gmd-13-1609-2020</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx6"><?xmltex \def\ref@label{{Beckers et~al.(2014)Beckers, Barth, Troupin, and
Alvera-Azc\'{a}rate}}?><label>Beckers et al.(2014)Beckers, Barth, Troupin, and
Alvera-Azcárate</label><?label Beckers14?><mixed-citation>Beckers, J.-M., Barth, A., Troupin, C., and Alvera-Azcárate, A.:
Approximate and efficient methods to assess error fields in spatial gridding
with data interpolating variational analysis (DIVA), J. Atmos.
Ocean. Technol., 31, 515–530, <ext-link xlink:href="https://doi.org/10.1175/JTECH-D-13-00130.1" ext-link-type="DOI">10.1175/JTECH-D-13-00130.1</ext-link>,
2014.</mixed-citation></ref>
      <ref id="bib1.bibx7"><?xmltex \def\ref@label{{Bezanson et~al.(2017)Bezanson, Edelman, Karpinski, and
Shah}}?><label>Bezanson et al.(2017)Bezanson, Edelman, Karpinski, and
Shah</label><?label bezanson2017julia?><mixed-citation>Bezanson, J., Edelman, A., Karpinski, S., and Shah, V. B.: Julia: A fresh
approach to numerical computing, SIAM Review, 59, 65–98,
<ext-link xlink:href="https://doi.org/10.1137/141000671" ext-link-type="DOI">10.1137/141000671</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx8"><?xmltex \def\ref@label{Copernicus Marine Environment Monitoring Service(2020)}?><label>Copernicus Marine Environment Monitoring Service(2020)</label><?label copernicus?><mixed-citation>Copernicus Marine Environment Monitoring Service, Collecte Localisation Satellites: European Seas along-track L3 sea surface heights reprocessed (1993–ongoing) tailored for data assimilation, Mercator Ocean International [data set], <ext-link xlink:href="https://doi.org/10.48670/moi-00139" ext-link-type="DOI">10.48670/moi-00139</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx9"><?xmltex \def\ref@label{{Gong et~al.(2019)Gong, Zhong, and Hu}}?><label>Gong et al.(2019)Gong, Zhong, and Hu</label><?label Gong2019?><mixed-citation>Gong, Z., Zhong, P., and Hu, W.: Diversity in Machine Learning, IEEE Access, 7,
64323–64350, <ext-link xlink:href="https://doi.org/10.1109/ACCESS.2019.2917620" ext-link-type="DOI">10.1109/ACCESS.2019.2917620</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx10"><?xmltex \def\ref@label{{Han et~al.(2020)Han, He, Liu, and Perrie}}?><label>Han et al.(2020)Han, He, Liu, and Perrie</label><?label han2020?><mixed-citation>Han, Z., He, Y., Liu, G., and Perrie, W.: Application of DINCAE to Reconstruct
the Gaps in Chlorophyll-a Satellite Observations in the South China Sea and
West Philippine Sea, Remote Sensing, 12, 480, <ext-link xlink:href="https://doi.org/10.3390/rs12030480" ext-link-type="DOI">10.3390/rs12030480</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx11"><?xmltex \def\ref@label{{He et~al.(2016)He, Zhang, Ren, and Sun}}?><label>He et al.(2016)He, Zhang, Ren, and Sun</label><?label he2016deep?><mixed-citation>He, K., Zhang, X., Ren, S., and Sun, J.: Deep Residual Learning for Image
Recognition, in: 2016 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR),  770–778, <ext-link xlink:href="https://doi.org/10.1109/CVPR.2016.90" ext-link-type="DOI">10.1109/CVPR.2016.90</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx12"><?xmltex \def\ref@label{{Head et~al.(2020)Head, Kumar, Nahrstaedt, Louppe, and
Shcherbatyi}}?><label>Head et al.(2020)Head, Kumar, Nahrstaedt, Louppe, and
Shcherbatyi</label><?label head_tim_2020?><mixed-citation>Head, T., Kumar, M., Nahrstaedt, H., Louppe, G., and Shcherbatyi, I.:
scikit-optimize/scikit-optimize, Zenodo [code], <ext-link xlink:href="https://doi.org/10.5281/zenodo.4014775" ext-link-type="DOI">10.5281/zenodo.4014775</ext-link>, v0.8.1, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx13"><?xmltex \def\ref@label{{Hochreiter(1998)}}?><label>Hochreiter(1998)</label><?label Hochreiter98?><mixed-citation>Hochreiter, S.: The Vanishing Gradient Problem During Learning Recurrent Neural
Nets and Problem Solutions, Int. J. Uncertain. Fuzz., 6, 107–116, <ext-link xlink:href="https://doi.org/10.1142/S0218488598000094" ext-link-type="DOI">10.1142/S0218488598000094</ext-link>,
1998.</mixed-citation></ref>
      <ref id="bib1.bibx14"><?xmltex \def\ref@label{{Jones et~al.(1998)Jones, Schonlau, and Welch}}?><label>Jones et al.(1998)Jones, Schonlau, and Welch</label><?label Jones98?><mixed-citation>Jones, D. R., Schonlau, M., and Welch, W.: Efficient Global Optimization of
Expensive Black-Box Functions, J. Global Optim., 13, 455–492,
<ext-link xlink:href="https://doi.org/10.1023/A:1008306431147" ext-link-type="DOI">10.1023/A:1008306431147</ext-link>, 1998.</mixed-citation></ref>
      <ref id="bib1.bibx15"><?xmltex \def\ref@label{{Kingma and Ba(2014)}}?><label>Kingma and Ba(2014)</label><?label Kingma14?><mixed-citation>Kingma, D. P. and Ba, J.: Adam: A Method for Stochastic Optimization, CoRR,
abs/1412.6980,  arXiv [preprint], <ext-link xlink:href="https://arxiv.org/abs/1412.6980">arXiv:1412.6980</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx16"><?xmltex \def\ref@label{{Liang et~al.(2018)Liang, Mazloff, Rosso, Fang, and Yu}}?><label>Liang et al.(2018)Liang, Mazloff, Rosso, Fang, and Yu</label><?label Liang2018?><mixed-citation>Liang, Y.-C., Mazloff, M. R., Rosso, I., Fang, S.-W., and Yu, J.-Y.: A
Multivariate Empirical Orthogonal Function Method to Construct Nitrate Maps
in the Southern Ocean, J. Atmos. Ocean. Technol., 35,
1505–1519, <ext-link xlink:href="https://doi.org/10.1175/JTECH-D-18-0018.1" ext-link-type="DOI">10.1175/JTECH-D-18-0018.1</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx17"><?xmltex \def\ref@label{{Liu et~al.(2019)Liu, Jiang, Xiao, and Yang}}?><label>Liu et al.(2019)Liu, Jiang, Xiao, and Yang</label><?label Liu2019?><mixed-citation>Liu, H., Jiang, B., Xiao, Y., and Yang, C.: Coherent Semantic Attention for
Image Inpainting, in: 2019 IEEE/CVF International Conference on Computer
Vision (ICCV), 4169–4178, <ext-link xlink:href="https://doi.org/10.1109/ICCV.2019.00427" ext-link-type="DOI">10.1109/ICCV.2019.00427</ext-link>,
2019.</mixed-citation></ref>
      <ref id="bib1.bibx18"><?xmltex \def\ref@label{{Long et~al.(2015)Long, Shelhamer, and Darrell}}?><label>Long et al.(2015)Long, Shelhamer, and Darrell</label><?label long2015fully?><mixed-citation>Long, J., Shelhamer, E., and Darrell, T.: Fully convolutional networks for
semantic segmentation, in: 2015 IEEE Conference on Computer Vision and
Pattern Recognition (CVPR),  3431–3440, <ext-link xlink:href="https://doi.org/10.1109/CVPR.2015.7298965" ext-link-type="DOI">10.1109/CVPR.2015.7298965</ext-link>,
2015.</mixed-citation></ref>
      <ref id="bib1.bibx19"><?xmltex \def\ref@label{{Mears et~al.(2019)Mears, Scott, Wentz, Ricciardulli, Leidner,
Hoffman, and Atlas}}?><label>Mears et al.(2019)Mears, Scott, Wentz, Ricciardulli, Leidner,
Hoffman, and Atlas</label><?label Mears2019?><mixed-citation>Mears, C. A., Scott, J., Wentz, F. J., Ricciardulli, L., Leidner, S. M.,
Hoffman, R., and Atlas, R.: A Near Real Time Version of the Cross Calibrated
Multiplatform (CCMP) Ocean Surface Wind Velocity Data Set, J. Geophys. Res.-Oceans,  124, 6997–7010, <ext-link xlink:href="https://doi.org/10.1029/2019JC015367" ext-link-type="DOI">10.1029/2019JC015367</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx20"><?xmltex \def\ref@label{{Mockus et~al.(1978)Mockus, Tiesis, and Zilinskas}}?><label>Mockus et al.(1978)Mockus, Tiesis, and Zilinskas</label><?label Mockus1978?><mixed-citation>
Mockus, J., Tiesis, V., and Zilinskas, A.: The Application of Bayesian Methods
for Seeking the Extremum, Towards Global Optimization, 2,   117–129, 1978.</mixed-citation></ref>
      <ref id="bib1.bibx21"><?xmltex \def\ref@label{{Nardelli and Santoleri(2005)}}?><label>Nardelli and Santoleri(2005)</label><?label BuongiornoNardelli05?><mixed-citation>Nardelli, B. B. and Santoleri, R.: Methods for the Reconstruction of Vertical
Profiles from Surface Data: Multivariate Analyses, Residual GEM, and Variable
Temporal Signals in the North Pacific Ocean, J. Atmos.
Ocean. Tech., 22, 1762–1781, <ext-link xlink:href="https://doi.org/10.1175/JTECH1792.1" ext-link-type="DOI">10.1175/JTECH1792.1</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx22"><?xmltex \def\ref@label{NASA Goddard Space Flight Center et al.(2018)}?><label>NASA Goddard Space Flight Center et al.(2018)</label><?label nasa?><mixed-citation>NASA Goddard Space Flight Center, Ocean Ecology Laboratory, and Ocean Biology Processing Group: Sea-viewing Wide Field-of-view Sensor (SeaWiFS) Ocean Color Data, NASA OB.DAAC [data set], <ext-link xlink:href="https://doi.org/10.5067/ORBVIEW-2/SEAWIFS/L2/OC/2018" ext-link-type="DOI">10.5067/ORBVIEW-2/SEAWIFS/L2/OC/2018</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx23"><?xmltex \def\ref@label{{{National Centers for Environmental Information}(2016)}}?><label>National Centers for Environmental Information(2016)</label><?label AVHRR_OI?><mixed-citation>National Centers for Environmental Information: Daily L4 Optimally
Interpolated SST (OISST) In situ and AVHRR Analysis. Ver. 2.0. PO.DAAC, CA,
USA [data set], <ext-link xlink:href="https://doi.org/10.5067/GHAAO-4BC02" ext-link-type="DOI">10.5067/GHAAO-4BC02</ext-link>,   2016.</mixed-citation></ref>
      <ref id="bib1.bibx24"><?xmltex \def\ref@label{OBPG(2015)}?><label>OBPG(2015)</label><?label OBPG?><mixed-citation>OBPG:   MODIS Terra Level 3 SST Thermal IR Daily 4km Nighttime v2014.0, Ver. 2014.0, PO.DAAC, CA, USA [data set], <ext-link xlink:href="https://doi.org/10.5067/MODST-1D4N4" ext-link-type="DOI">10.5067/MODST-1D4N4</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx25"><?xmltex \def\ref@label{{Olmedo et~al.(2018)Olmedo, Taupier-Letage, Turiel, and
Alvera-Azc\'{a}rate}}?><label>Olmedo et al.(2018)Olmedo, Taupier-Letage, Turiel, and
Alvera-Azcárate</label><?label olmedo2018?><mixed-citation>Olmedo, E., Taupier-Letage, I., Turiel, A., and Alvera-Azcárate, A.:
Improving SMOS Sea Surface Salinity in the Western Mediterranean Sea through
multivariate and multifractal analysis, Remote Sensing, 10, 485,
<ext-link xlink:href="https://doi.org/10.3390/rs10030485" ext-link-type="DOI">10.3390/rs10030485</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx26"><?xmltex \def\ref@label{{Olmedo et~al.(2021)Olmedo, Gonz\'{a}lez-Haro, Hoareau, Umbert,
Gonz\'{a}lez-Gambau, Mart\'{\i}nez, Gabarr\'{o}, and Turiel}}?><label>Olmedo et al.(2021)Olmedo, González-Haro, Hoareau, Umbert,
González-Gambau, Martínez, Gabarró, and Turiel</label><?label Olmedo21?><mixed-citation>Olmedo, E., González-Haro, C., Hoareau, N., Umbert, M., González-Gambau, V., Martínez, J., Gabarró, C., and Turiel, A.: Nine years of SMOS sea surface salinity global maps at the Barcelona Expert Center, Earth Syst. Sci. Data, 13, 857–888, <ext-link xlink:href="https://doi.org/10.5194/essd-13-857-2021" ext-link-type="DOI">10.5194/essd-13-857-2021</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx27"><?xmltex \def\ref@label{{Reynolds et~al.(2007)Reynolds, Smith, Liu, Chelton, Casey, and
Schlax}}?><label>Reynolds et al.(2007)Reynolds, Smith, Liu, Chelton, Casey, and
Schlax</label><?label Reynolds07?><mixed-citation>Reynolds, R. W., Smith, T. M., Liu, C., Chelton, D. B., Casey, K. S., and
Schlax, M. G.: Daily High-resolution Blended Analyses for sea surface
temperature, J. Climate, 20, 5473–5496,
<ext-link xlink:href="https://doi.org/10.1175/2007JCLI1824.1" ext-link-type="DOI">10.1175/2007JCLI1824.1</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bibx28"><?xmltex \def\ref@label{Reynolds et al.(2008)}?><label>Reynolds et al.(2008)</label><?label Reynolds?><mixed-citation>Reynolds, R. W.,  Banzon, V. F., and NOAA CDR Program: NOAA Optimum Interpolation 1/4 Degree Daily Sea Surface Temperature (OISST) Analysis, Version 2,  NOAA National Centers for Environmental Information [data set], <ext-link xlink:href="https://doi.org/10.7289/V5SQ8XB5" ext-link-type="DOI">10.7289/V5SQ8XB5</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx29"><?xmltex \def\ref@label{{Ronneberger et~al.(2015)Ronneberger, Fischer, and
Brox}}?><label>Ronneberger et al.(2015)Ronneberger, Fischer, and
Brox</label><?label Ronneberger2015?><mixed-citation>Ronneberger, O., Fischer, P., and Brox, T.: U-Net: Convolutional Networks for
Biomedical Image Segmentation, in: Medical Image Computing and
Computer-Assisted Intervention – MICCAI 2015, edited by: Navab, N.,
Hornegger, J., Wells, W. M., and Frangi, A. F.,  234–241, Springer
International Publishing, Cham, <ext-link xlink:href="https://doi.org/10.1007/978-3-319-24574-4_28" ext-link-type="DOI">10.1007/978-3-319-24574-4_28</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx30"><?xmltex \def\ref@label{{Snoek et~al.(2012)Snoek, Larochelle, and Adams}}?><label>Snoek et al.(2012)Snoek, Larochelle, and Adams</label><?label Snoek2012?><mixed-citation>
Snoek, J., Larochelle, H., and Adams, R. P.: Practical Bayesian optimization of
machine learning algorithms, in: Advances in Neural Information Processing
Systems,  2951–2959, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx31"><?xmltex \def\ref@label{{Szegedy et~al.(2015)Szegedy, Liu, Jia, Sermanet, Reed, Anguelov,
Erhan, Vanhoucke, and Rabinovich}}?><label>Szegedy et al.(2015)Szegedy, Liu, Jia, Sermanet, Reed, Anguelov,
Erhan, Vanhoucke, and Rabinovich</label><?label Szegedy2015?><mixed-citation>Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D.,
Vanhoucke, V., and Rabinovich, A.: Going Deeper with Convolutions, in:
Computer Vision and Pattern Recognition (CVPR),
arXiv [preprint], <ext-link xlink:href="https://arxiv.org/abs/1409.4842">arXiv:1409.4842</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx32"><?xmltex \def\ref@label{{Thiebaux(1976)}}?><label>Thiebaux(1976)</label><?label THIE76?><mixed-citation>Thiebaux, J.: Anisotropic correlation functions for Objective Analysis,
Mon. Weather Rev., 104, 994–1002,
<ext-link xlink:href="https://doi.org/10.1175/1520-0493(1976)104&lt;0994:ACFFOA&gt;2.0.CO;2" ext-link-type="DOI">10.1175/1520-0493(1976)104&lt;0994:ACFFOA&gt;2.0.CO;2</ext-link>, 1976.</mixed-citation></ref>
      <ref id="bib1.bibx33"><?xmltex \def\ref@label{{Troupin et~al.(2012)Troupin, Barth, Sirjacobs, Ouberdous, Brankart,
Brasseur, Rixen, Alvera-Azc\'{a}rate, Belounis, Capet, Lenartz, Toussaint,
and Beckers}}?><label>Troupin et al.(2012)Troupin, Barth, Sirjacobs, Ouberdous, Brankart,
Brasseur, Rixen, Alvera-Azcárate, Belounis, Capet, Lenartz, Toussaint,
and Beckers</label><?label Troupin12diva?><mixed-citation>Troupin, C., Barth, A., Sirjacobs, D., Ouberdous, M., Brankart, J.-M.,
Brasseur, P., Rixen, M., Alvera-Azcárate, A., Belounis, M., Capet, A.,
Lenartz, F., Toussaint, M.-E., and Beckers, J.-M.: Generation of analysis
and consistent error fields using the Data Interpolating Variational Analysis
(DIVA), Ocean Modell., 52–53, 90–101,
<ext-link xlink:href="https://doi.org/10.1016/j.ocemod.2012.05.002" ext-link-type="DOI">10.1016/j.ocemod.2012.05.002</ext-link>,
2012.</mixed-citation></ref>
      <ref id="bib1.bibx34"><?xmltex \def\ref@label{{Wentz et~al.(2019)Wentz, Scott, Hoffman, Leidner, Atlas, and
Ardizzone}}?><label>Wentz et al.(2019)Wentz, Scott, Hoffman, Leidner, Atlas, and
Ardizzone</label><?label WentzCCMP?><mixed-citation>Wentz, F., Scott, J., Hoffman, R., Leidner, M., Atlas, R., and Ardizzone, J.:
Remote Sensing Systems Cross-Calibrated Multi-Platform (CCMP) 6-hourly ocean
vector wind analysis product on 0.25 deg grid, Version 2.0, Remote Sensing
Systems [data set], Santa Rosa, CA, <uri>http://www.remss.com/measurements/ccmp</uri>, last access: 19 July 2019.</mixed-citation></ref>
      <ref id="bib1.bibx35"><?xmltex \def\ref@label{{Wylie et~al.(2005)Wylie, Jackson, Menzel, and Bates}}?><label>Wylie et al.(2005)Wylie, Jackson, Menzel, and Bates</label><?label Wylie2005?><mixed-citation>Wylie, D., Jackson, D. L., Menzel, W. P., and Bates, J. J.: Trends in Global
Cloud Cover in Two Decades of HIRS Observations, J. Climate, 18,
3021–3031, <ext-link xlink:href="https://doi.org/10.1175/JCLI3461.1" ext-link-type="DOI">10.1175/JCLI3461.1</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx36"><?xmltex \def\ref@label{{Yuret(2016)}}?><label>Yuret(2016)</label><?label Yuret2016?><mixed-citation>Yuret, D.: Knet: beginning deep learning with 100 lines of Julia, 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, 2016.
 </mixed-citation></ref><?xmltex \hack{\newpage}?>
      <ref id="bib1.bibx37"><?xmltex \def\ref@label{{Zhang et~al.(2020)Zhang, Stanev, and Grayek}}?><label>Zhang et al.(2020)Zhang, Stanev, and Grayek</label><?label Zhang20?><mixed-citation>Zhang, Z., Stanev, E. V., and Grayek, S.: Reconstruction of the Basin-Wide
Sea-Level Variability in the North Sea Using Coastal Data and Generative
Adversarial Networks, J. Geophys. Res.-Oceans, 125,
e2020JC016402, <ext-link xlink:href="https://doi.org/10.1029/2020JC016402" ext-link-type="DOI">10.1029/2020JC016402</ext-link>,
2020.</mixed-citation></ref>

  </ref-list></back>
</article>
