<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">GMD</journal-id><journal-title-group>
    <journal-title>Geoscientific Model Development</journal-title>
    <abbrev-journal-title abbrev-type="publisher">GMD</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Geosci. Model Dev.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1991-9603</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/gmd-13-4253-2020</article-id><title-group><article-title>ML-SWAN-v1: a hybrid machine learning framework for the concentration prediction and
discovery of transport pathways of surface water nutrients</article-title><alt-title>ML-SWAN-v1</alt-title>
      </title-group><?xmltex \runningtitle{ML-SWAN-v1}?><?xmltex \runningauthor{B.~Wang et al.}?>
      <contrib-group>
        <contrib contrib-type="author" corresp="no" rid="aff1 aff2">
          <name><surname>Wang</surname><given-names>Benya</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-4165-6228</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2 aff3">
          <name><surname>Hipsey</surname><given-names>Matthew R.</given-names></name>
          
        <ext-link>https://orcid.org/0000-0001-8386-4354</ext-link></contrib>
        <contrib contrib-type="author" corresp="yes" rid="aff1 aff2">
          <name><surname>Oldham</surname><given-names>Carolyn</given-names></name>
          <email>carolyn.oldham@uwa.edu.au</email>
        </contrib>
        <aff id="aff1"><label>1</label><institution>Department of Civil, Mining and Environmental Engineering, The
University of Western Australia, <?xmltex \hack{\break}?>35 Stirling Highway, Crawley 6009,
Australia</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>Co-operative Research Centre for Water Sensitive Cities, Clayton,
Australia</institution>
        </aff>
        <aff id="aff3"><label>3</label><institution>UWA School of Agriculture and Environment, The University of Western
Australia, <?xmltex \hack{\break}?>35 Stirling Highway, Crawley 6009, Australia</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Carolyn Oldham (carolyn.oldham@uwa.edu.au)</corresp></author-notes><pub-date><day>15</day><month>September</month><year>2020</year></pub-date>
      
      <volume>13</volume>
      <issue>9</issue>
      <fpage>4253</fpage><lpage>4270</lpage>
      <history>
        <date date-type="received"><day>8</day><month>January</month><year>2020</year></date>
           <date date-type="accepted"><day>23</day><month>July</month><year>2020</year></date>
           <date date-type="rev-recd"><day>6</day><month>July</month><year>2020</year></date>
           <date date-type="rev-request"><day>6</day><month>April</month><year>2020</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2020 Benya Wang et al.</copyright-statement>
        <copyright-year>2020</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020.html">This article is available from https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020.html</self-uri><self-uri xlink:href="https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020.pdf">The full text article is available as a PDF file from https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020.pdf</self-uri>
      <abstract><title>Abstract</title>
    <p id="d1e115">Nutrient data from catchments discharging to receiving waters are monitored for catchment
management. However, nutrient data are often sparse in time and space and have non-linear
responses to environmental factors, making it difficult to systematically analyse long- and
short-term trends and undertake nutrient budgets. To address these challenges, we developed a
hybrid machine learning (ML) framework that first separated baseflow and quickflow from total
flow, generated data for missing nutrient species, and then utilised the pre-generated nutrient
data as additional variables in a final simulation of tributary water quality.  Hybrid random
forest (RF) and gradient boosting machine (GBM) models were employed and their performance
compared with a linear model, a multivariate weighted regression model, and stand-alone RF and GBM
models that did not pre-generate nutrient data. The six models were used to predict six different
nutrients discharged from two study sites in Western Australia: Ellen Brook (small and ephemeral)
and the Murray River (large and perennial). Our results showed that the hybrid RF and GBM models
had significantly higher accuracy and lower prediction uncertainty for almost all nutrient species
across the two sites. The pre-generated nutrient and hydrological data were highlighted as the
most important components of the hybrid model. The model results also indicated different
hydrological transport pathways for total nitrogen (TN) export from the two tributary catchments. We demonstrated that
the hybrid model provides a flexible method to combine data of varied resolution and quality and
is accurate for the prediction of responses of surface water nutrient concentrations to hydrologic
variability.</p>
  </abstract>
    </article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d1e127">Surface water nutrient concentrations have been significantly increased by human activities (Forio
et al., 2015) due to urbanisation, waste discharges and agricultural intensification (Liu et al.,
2012; Kaiser et al., 2013; Li et al., 2013). Increased nutrient concentrations and loads in streams
alter the biogeochemical functioning and biological community structure in receiving estuaries
(Jickells et al., 2014; Staehr et al., 2017), leading to an increased incidence of harmful algal blooms
(Domingues et al., 2011), anoxia and hypoxia (Li et al., 2016; Testa et al., 2017) and reduced
water availability (Heathwaite, 2010). Analysis of tributary water quality data over time is
therefore essential to compute incoming nutrient loads, support policy and plan remediation
measures.</p>
      <p id="d1e130">Water quality data, however, often have constraints that make it challenging to analyse long- and
short-term trends. Firstly, water quality data often have non-linear responses to environmental
factors and show high-order interaction effects between different environmental variables. Moreover,
nutrients can derive from different sources (point or non-point) in the landscape and are
transported to receiving waters through different water pathways subject to varied<?pagebreak page4254?> catchment
hydrological conditions and human intervention (Hirsch et al., 2010; Lloyd et al.,
2014). Additionally, tributary nutrient datasets are often sparse in both space and time, due to the
high cost of fieldwork and chemical analysis (Lamsal et al., 2006; Forio et al., 2015). Historical
and current water quality monitoring programmes often use low-frequency sampling regimes on a weekly
to monthly basis (Halliday et al., 2012). When monthly averaged concentrations are used, calculated
nutrient loads to receiving environments such as lakes or estuaries may be poorly estimated (Cozzi
and Giani, 2011), with high variability in the estimated loads (Jordan and Cassidy, 2011). It is
also common to have patchy availability of nutrient species data across a study area, and combining
datasets from different projects and analytical laboratories makes the analysis of long-term trends
fraught with uncertainty. For instance, total nitrogen (TN) and total phosphorus (TP) concentrations
within catchment outflows may have been monitored for decades, while dissolved organic nitrogen
(DON) and dissolved organic carbon (DOC) concentrations may have only been monitored recently, with
the increasing recognition of their ecological importance (Górniak et al., 2002; Petrone et al.,
2009; Erlandsson et al., 2011). Given the hydrochemical correlation between different nutrient
species and high analytical cost, there are benefits in extracting maximum information from all
available nutrient data, particularly relating to changes in water quality over time (Hirsch et al.,
2010).  In summary, while high-quality nutrient data from tributaries are typically required as
input to water quality modelling of receiving waters, the reliability and accuracy of the trend analysis
of tributary data are frequently restricted by data non-linearity, limited sample size and variable
nutrient availability.</p>
      <p id="d1e133">Various models for constructing tributary water quality data have been developed. For example,
linear models (LMs) and generalised linear models (GLMs) that use correlations between concentration
(C) and flow (Q) have long played a central role in stream water quality analysis (Cohn et al.,
1989; Chanat et al., 2002). Multivariate regression models have also been applied to analyse the
long-term trends (Li et al., 2007; Tao et al., 2010; Greening et al., 2014) and seasonal patterns
(Giblin et al., 2010; Chen et al., 2012) of surface water nutrients. One such model, the weighted
regression on time, discharge and season (WRTDS), was introduced by Hirsch et al. (2010) and has
been applied to a number of different water quality studies (Green et al., 2014; Zhang et al.,
2016a, b, c).</p>
      <p id="d1e136">Meanwhile, data-driven machine learning (ML) methods are increasingly being applied to quantify
relationships between soil, water and environmental landscape attributes (Lintern et al., 2018; Wang
et al., 2018; Guo et al., 2019). For instance, random forest (RF), a widely used ML method, was used
to model the spatial and seasonal variability of nitrate concentrations in streams
(Álvarez-Cabria et al., 2016).  Gradient boosting machines (GBMs) were used to quantify
relationships between land-use gradients and the structure and function of stream ecology (Clapcott
et al., 2012). In contrast to process-based conceptual models, ML methods simulate relationships
purely from the data (Maier et al., 2014) and have the ability to incorporate different types of
variables (e.g. numerical or categorised variables); this is particularly suitable for systems with
complex variable interactions and non-linear response functions (Povak et al., 2014).</p>
      <p id="d1e140">While both process-based and ML models can manage non-linear interactions and be used to explore
long-term trends, they both have difficulty in fully extracting important hydrochemical information
embedded in nutrient data.  Hybrid methods have been proposed for flow forecasting, to enhance the
performance of ML models by first using intermediate models to generate additional variables, which
are then used for subsequent modelling. For instance, a neural network model was first applied to
reconstruct surface ocean partial pressure of carbon dioxide (<inline-formula><mml:math id="M1" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula><inline-formula><mml:math id="M2" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">CO</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>) climatology, which was
used as an input into another neural network to predict <inline-formula><mml:math id="M3" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula><inline-formula><mml:math id="M4" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">CO</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> anomalies with other features
(Denvil-Sommer et al., 2019). Similarly, Noori and Kalin (2016) used the soil and water assessment
tool (SWAT) to generate baseflow and stormflow, which were then used as inputs to an artificial
neural network (ANN) model to improve daily flow prediction. Both studies used hybrid models to
demonstrate that pre-generated variables provided additional information that was crucial to
achieving higher prediction accuracy, compared with stand-alone ANN models.</p>
      <p id="d1e177">Stream flow integrates water from multiple pathways resulting in a distribution of residence
times. Stream nutrients are the product of overlapping historical inputs and reaction rates, which
are spatially distributed and temporally weighted within the catchment (Abbott et al.,
2016). Therefore, it is beneficial to understand nutrient transport pathways from the source to
receiving waters, to analyse the long- and short-term trends of stream nutrient data; this knowledge
will improve management strategies to reduce nutrient transport (Tesoriero et al., 2009; Mellander
et al., 2012). In the analysis of the streamflow hydrograph, separating baseflow (the long-term
delayed flow from storage) and quickflow (the short-term response to a rainfall event) from total
flow is a well-established strategy to better understand transport pathways (Tesoriero et al.,
2009). To utilise all available nutrient data and assess the impact of different transport pathways
on stream nutrient concentrations, we developed a hybrid machine learning framework for surface
water nutrient concentrations (ML-SWAN) that first separated baseflow and quickflow from total flow and then built intermediate models to generate missing nutrient species within the total nutrient pool,
using relationships with baseflow, quickflow, rainfall and seasonal components. The generated
nutrient data were included as additional variables for a final ML prediction. RF and GBM were
employed and their performance compared in stand-alone mode and as a hybrid method.</p>
      <p id="d1e180">This study aimed to compare model performance for nutrient concentration
prediction, to generate accurate daily<?pagebreak page4255?> nutrient data, to assess the impacts
of different water transport pathways on surface water nutrient
concentrations and to present a feasible framework for the application of
the hybrid method for surface water nutrient prediction. It was hypothesised
that the hybrid RF and hybrid GBM, which used pre-generated daily nutrient
concentrations and the separated baseflow and quickflow as additional
auxiliary inputs, would take advantage of the complementary strengths of
hydrochemical and hydrological relationships to provide the most accurate
and reliable nutrient predictions. To test this hypothesis, the hybrid RF
and hybrid GBM were compared to a linear model, a multivariate weighted
regression model (WRTDS), and stand-alone RF and GBM models, for the
prediction of TN, TP, <inline-formula><mml:math id="M5" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, DOC, DON, and filterable reactive phosphorus
(FRP) concentrations, at two different sites under varied hydrological
conditions.</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>Model overview</title>
      <p id="d1e202">Our modelling goal in this study was to minimise the total loss, summed over all
observations, between the predicted and measured nutrient
concentrations:
          <disp-formula id="Ch1.E1" content-type="numbered"><label>1</label><mml:math id="M6" display="block"><mml:mrow><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mi>i</mml:mi></mml:munder><mml:mi>L</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
        where <inline-formula><mml:math id="M7" display="inline"><mml:mi>L</mml:mi></mml:math></inline-formula> is a loss function (e.g. squared error), <inline-formula><mml:math id="M8" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are measured values, <inline-formula><mml:math id="M9" display="inline"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are
relevant variables, <inline-formula><mml:math id="M10" display="inline"><mml:mi>F</mml:mi></mml:math></inline-formula> is any approximation model, and <inline-formula><mml:math id="M11" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> or <inline-formula><mml:math id="M12" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the
model-predicted value at <inline-formula><mml:math id="M13" display="inline"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. The different approximation models are described
in the following sections.</p>
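      <p>As a minimal sketch of the objective in Eq. (1), the following Python fragment sums a per-observation loss over all observations. The function names, the choice of squared error and the toy model are illustrative assumptions and are not taken from the ML-SWAN code.</p>
<preformat>
```python
def squared_error(y, y_hat):
    """Loss L for a single observation (squared error)."""
    return (y - y_hat) ** 2


def total_loss(y_obs, x_obs, model, loss=squared_error):
    """Sum of L(y_i, F(X_i)) over all observations i, as in Eq. (1)."""
    return sum(loss(y, model(x)) for y, x in zip(y_obs, x_obs))


# Hypothetical toy data: a model F that doubles its single input variable.
toy_model = lambda x: 2.0 * x
print(total_loss([2.0, 4.0, 7.0], [1.0, 2.0, 3.0], toy_model))  # 1.0
```
</preformat>
      <p>Any of the approximation models considered in this study could take the place of the toy model, provided it maps a set of input variables to a predicted concentration.</p>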
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Linear model and WRTDS model</title>
      <p id="d1e328">LMs are the most commonly used tool to describe concentration–discharge (<inline-formula><mml:math id="M14" display="inline"><mml:mi>C</mml:mi></mml:math></inline-formula>–<inline-formula><mml:math id="M15" display="inline"><mml:mi>Q</mml:mi></mml:math></inline-formula>)
relationships (Hirsch et al., 2010).  A log transformation is typically applied to <inline-formula><mml:math id="M16" display="inline"><mml:mi>C</mml:mi></mml:math></inline-formula>
and <inline-formula><mml:math id="M17" display="inline"><mml:mi>Q</mml:mi></mml:math></inline-formula> data (Crowder et al., 2007; Meybeck and Moatar, 2012; Herndon et al., 2015), with the linear
model then described as
            <disp-formula id="Ch1.E2" content-type="numbered"><label>2</label><mml:math id="M18" display="block"><mml:mrow><mml:mi>log⁡</mml:mi><mml:mfenced close=")" open="("><mml:mi>C</mml:mi></mml:mfenced><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>log⁡</mml:mi><mml:mo>(</mml:mo><mml:mi>Q</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M19" display="inline"><mml:mi>C</mml:mi></mml:math></inline-formula> is nutrient concentration and <inline-formula><mml:math id="M20" display="inline"><mml:mi>Q</mml:mi></mml:math></inline-formula> is total flow. In this study, the linear model was used
as a benchmark for other models. The fitted intercept <inline-formula><mml:math id="M21" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> can represent the base nutrient
concentration in a stream, while the slope <inline-formula><mml:math id="M22" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> can describe relationships between hydrological and
biogeochemical data. The WRTDS model was also used (Hirsch et al., 2010) and can be described as

                <disp-formula specific-use="align" content-type="numbered"><mml:math id="M23" display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi>log⁡</mml:mi><mml:mfenced close=")" open="("><mml:mi>C</mml:mi></mml:mfenced><mml:mo>=</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>log⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mi>Q</mml:mi></mml:mfenced><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mtext>JD</mml:mtext><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msub><mml:mi>sin⁡</mml:mi><mml:mfenced close=")" open="("><mml:mtext>JD</mml:mtext></mml:mfenced></mml:mrow></mml:mtd></mml:mtr><mml:mlabeledtr id="Ch1.E3"><mml:mtd><mml:mtext>3</mml:mtext></mml:mtd><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:mfenced close=")" open="("><mml:mtext>JD</mml:mtext></mml:mfenced><mml:mo>+</mml:mo><mml:mi mathvariant="italic">ε</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

            where JD is the Julian day and <inline-formula><mml:math id="M24" display="inline"><mml:mi mathvariant="italic">ε</mml:mi></mml:math></inline-formula> is unexplained variation. <inline-formula><mml:math id="M25" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mtext>JD</mml:mtext></mml:mrow></mml:math></inline-formula> is
used to represent the long-term trend from year to year, while <inline-formula><mml:math id="M26" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msub><mml:mi>sin⁡</mml:mi><mml:mo>(</mml:mo><mml:mtext>JD</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and
<inline-formula><mml:math id="M27" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:mo>(</mml:mo><mml:mtext>JD</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> are used to describe the seasonal variation in stream nutrient
concentrations. To calculate the Julian day for use in Eq. (3), the days since 1 January 1970 were first
calculated and then multiplied by <inline-formula><mml:math id="M28" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">π</mml:mi></mml:mrow></mml:math></inline-formula>. WRTDS advances the simpler linear model in two
aspects. Firstly, the additional components in the equation allow a consideration of seasonal and
long-term patterns and make the WRTDS model more able to describe stream nutrient concentrations
across the year. Secondly, unlike the linear model, whose parameters are constant in time, WRTDS
adjusts the parameters in a gradual manner throughout <inline-formula><mml:math id="M29" display="inline"><mml:mi>Q</mml:mi></mml:math></inline-formula>, JD space. This is accomplished by
applying a weighted regression for the estimation of <inline-formula><mml:math id="M30" display="inline"><mml:mrow><mml:mi>log⁡</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where the weights on each
observation are based on three distances between the observation (<inline-formula><mml:math id="M31" display="inline"><mml:mrow><mml:msub><mml:mi>Q</mml:mi><mml:mi>o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M32" display="inline"><mml:mrow><mml:msub><mml:mtext>JD</mml:mtext><mml:mi>o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>) and the
estimation point (<inline-formula><mml:math id="M33" display="inline"><mml:mrow><mml:msub><mml:mi>Q</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M34" display="inline"><mml:mrow><mml:msub><mml:mtext>JD</mml:mtext><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, which are (1) the time distance between <inline-formula><mml:math id="M35" display="inline"><mml:mrow><mml:msub><mml:mtext>JD</mml:mtext><mml:mi>o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>
and <inline-formula><mml:math id="M36" display="inline"><mml:mrow><mml:msub><mml:mtext>JD</mml:mtext><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, (2) the seasonal distance between the time of year at <inline-formula><mml:math id="M37" display="inline"><mml:mrow><mml:msub><mml:mtext>JD</mml:mtext><mml:mi>o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and the
time of year at <inline-formula><mml:math id="M38" display="inline"><mml:mrow><mml:msub><mml:mtext>JD</mml:mtext><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, and (3) the discharge distance between <inline-formula><mml:math id="M39" display="inline"><mml:mrow><mml:mi>log⁡</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>Q</mml:mi><mml:mi>o</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and
<inline-formula><mml:math id="M40" display="inline"><mml:mrow><mml:mi>log⁡</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>Q</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> (Hirsch et al., 2010; Green et al., 2014). Thus, <inline-formula><mml:math id="M41" display="inline"><mml:mrow><mml:mi>log⁡</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is considered to be locally
linearly related to <inline-formula><mml:math id="M42" display="inline"><mml:mrow><mml:mi>log⁡</mml:mi><mml:mo>(</mml:mo><mml:mi>Q</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, JD, <inline-formula><mml:math id="M43" display="inline"><mml:mrow><mml:mi>sin⁡</mml:mi><mml:mo>(</mml:mo><mml:mtext>JD</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M44" display="inline"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:mo>(</mml:mo><mml:mtext>JD</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
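      <p>The two models in this section can be sketched in Python as follows. The first function fits Eq. (2) by ordinary least squares. The second computes a WRTDS-style observation weight as the product of three distance kernels (time, season and discharge). The tricube kernel, the half-window widths and the use of decimal years for time are assumptions of this sketch and are not specified in the text above; all function and parameter names are hypothetical.</p>
<preformat>
```python
import math


def fit_loglog(q, c):
    """Ordinary least squares fit of log(C) = b0 + b1 * log(Q), as in Eq. (2)."""
    x = [math.log(v) for v in q]
    y = [math.log(v) for v in c]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope
    b0 = y_mean - b1 * x_mean    # intercept
    return b0, b1


def tricube(distance, half_width):
    """Assumed kernel: 1 at zero distance, 0 at or beyond half_width."""
    u = abs(distance) / half_width
    return (1.0 - u ** 3) ** 3 if 1.0 > u else 0.0


def wrtds_weight(t_obs, q_obs, t_est, q_est,
                 h_time=7.0, h_season=0.5, h_flow=2.0):
    """Weight of one observation relative to one estimation point.

    Times are in decimal years; the half-widths are illustrative defaults.
    """
    d_time = t_obs - t_est                          # (1) time distance
    frac = abs(d_time) % 1.0
    d_season = min(frac, 1.0 - frac)                # (2) seasonal distance
    d_flow = math.log(q_obs) - math.log(q_est)      # (3) discharge distance
    return (tricube(d_time, h_time)
            * tricube(d_season, h_season)
            * tricube(d_flow, h_flow))
```
</preformat>
      <p>With weights of this kind, a separate weighted regression of the form of Eq. (3) would be fitted for each estimation point, so that nearby observations in time, season and discharge dominate the local fit.</p>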
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>Random forest and gradient boosting machines</title>
      <p id="d1e799">RF and GBMs are ensemble models that combine multiple
base learners inside the model to improve the prediction performance (Ishwaran and Kogalur, 2010;
Singh et al., 2014). RF and GBM differ mainly in how the ensemble is constructed. In RF,
bootstrap aggregating is used to resample the original dataset with replacement. Each resampled
dataset therefore contains only part of the original data and is used to build an individual base
learner. In contrast, GBM iteratively generates a sequence of base learners, where each successive base
learner is built to predict the residuals of the preceding base learner (Friedman, 2001, 2002). The
probability with which data points are selected for the next training set is not equal
for all data points: it increases for data points that were
misestimated in the previous iteration, so that data points that are difficult to classify receive
higher selection probabilities than easily classified data points (Yang et al., 2010; Erdal and
Karakurt, 2013).</p>
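      <p>The contrast between the two ensemble strategies can be illustrated with a small Python sketch that uses single-split regression trees (stumps) on one numeric variable as base learners. It is a simplified illustration, not the implementation used in this study. The boosting function follows the residual-fitting description under squared-error loss and does not resample by selection probability; the shrinkage parameter rate and all function names are assumptions.</p>
<preformat>
```python
import random


def fit_stump(x, y):
    """Best single split on one numeric variable by summed squared error."""
    best = None
    for v in sorted(set(x))[1:]:
        left = [yi for xi, yi in zip(x, y) if v > xi]
        right = [yi for xi, yi in zip(x, y) if xi >= v]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        err = (sum((yi - ml) ** 2 for yi in left)
               + sum((yi - mr) ** 2 for yi in right))
        if best is None or best[0] > err:
            best = (err, v, ml, mr)
    if best is None:  # all x identical: fall back to the mean
        m = sum(y) / len(y)
        return lambda xi: m
    _, v, ml, mr = best
    return lambda xi: ml if v > xi else mr


def bagging(x, y, n_learners=25, seed=0):
    """RF-style ensemble: each stump is fitted to a bootstrap resample."""
    rng = random.Random(seed)
    n = len(x)
    stumps = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([x[i] for i in idx], [y[i] for i in idx]))
    return lambda xi: sum(s(xi) for s in stumps) / len(stumps)


def boosting(x, y, n_learners=25, rate=0.5):
    """GBM-style ensemble: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    resid = [yi - base for yi in y]
    stumps = []
    for _ in range(n_learners):
        s = fit_stump(x, resid)
        stumps.append(s)
        resid = [r - rate * s(xi) for r, xi in zip(resid, x)]
    return lambda xi: base + rate * sum(s(xi) for s in stumps)
```
</preformat>
      <p>In the bagging function the base learners are independent of one another and their predictions are averaged, whereas in the boosting function each base learner depends on the errors left by its predecessors and the predictions are summed.</p>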
      <p id="d1e802">For RF and GBM, the most commonly used base learner is a classification and regression tree
(CART). A CART model is built to split the dataset into different nodes (Breiman et al., 1984): <inline-formula><mml:math id="M45" display="inline"><mml:mrow><mml:mfenced open="{" close="}"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>a</mml:mi></mml:msubsup><mml:mo>&lt;</mml:mo><mml:mi>v</mml:mi></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M46" display="inline"><mml:mrow><mml:mfenced open="{" close="}"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mi>j</mml:mi><mml:mi>a</mml:mi></mml:msubsup><mml:mo>≥</mml:mo><mml:mi>v</mml:mi></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula> for numeric
variables or <inline-formula><mml:math id="M47" display="inline"><mml:mrow><mml:mfenced close="}" open="{"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>d</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mi>c</mml:mi></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula> and
<inline-formula><mml:math id="M48" display="inline"><mml:mrow><mml:mfenced close="}" open="{"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mi>j</mml:mi><mml:mi>d</mml:mi></mml:msubsup><mml:mo>≠</mml:mo><mml:mi>c</mml:mi></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula> for categorised variables, where <inline-formula><mml:math id="M49" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M50" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula> are the
sample indices, <inline-formula><mml:math id="M51" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> is a numerical variable, <inline-formula><mml:math id="M52" display="inline"><mml:mi>v</mml:mi></mml:math></inline-formula> is one of the values of <inline-formula><mml:math id="M53" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> variable, <inline-formula><mml:math id="M54" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> is a
categorised variable, and <inline-formula><mml:math id="M55" display="inline"><mml:mi>c</mml:mi></mml:math></inline-formula> is one of the values of <inline-formula><mml:math id="M56" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> variable. To split the dataset at <inline-formula><mml:math id="M57" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> or
<inline-formula><mml:math id="M58" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula>, the sum of squared errors of the two nodes is calculated for a regression task as
            <disp-formula id="Ch1.E4" content-type="numbered"><label>4</label><mml:math id="M59" display="block"><mml:mrow><mml:mtext>error</mml:mtext><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow><mml:mi>L</mml:mi></mml:munderover><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>L</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>+</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>r</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow><mml:mi>R</mml:mi></mml:munderover><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>r</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>R</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M60" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M61" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>r</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the observations in the two split nodes and <inline-formula><mml:math id="M62" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>L</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and
<inline-formula><mml:math id="M63" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>R</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the mean values of <inline-formula><mml:math id="M64" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula> in the respective nodes. The split is chosen<?pagebreak page4256?> among all candidate
variables and values to minimise this error. This splitting process is applied from the root to the
terminal node, which creates a tree structure for the model (Erdal and Karakurt, 2013). A CART can
be used both for classification and regression problems due to this tree structure (Coops et al.,
2011). However, a single CART can sometimes oversimplify variable interactions and may lead to low
prediction performance (McBratney et al., 2000; Cutler et al., 2007; Coopersmith et al., 2010). This
drawback can be overcome by the ensemble method that generates many resampled datasets and creates
various CARTs to achieve higher accuracy (Breiman, 2001) and more stable results when facing slight
variations in input data (Martínez-Rojas et al., 2015). A new input is evaluated
against all trees in the ensemble model, and each tree votes with the majority class or the
mean value of its terminal node. The class with the most votes is used for a
classification model, and the average of the predicted values from all trees is used for a regression model
(Singh et al., 2014; Belgiu and Drăguţ, 2016). Ensemble methods such as RF and GBM have been found
to significantly improve the prediction accuracy of CART (Ismail and Mutanga, 2010; Erdal and
Karakurt, 2013).</p>
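      <p>As an illustration of Eq. (4), the following minimal Python sketch searches the candidate values of a single numerical variable for the split with the lowest summed squared error. The function names, the toy data and the convention that samples with a value greater than or equal to the threshold go to the right node are assumptions made for this example, and the sketch is not the implementation used in this study.</p>
<preformat>
```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the node mean (one term of Eq. 4)."""
    if not values:
        return 0.0
    mean = fmean(values)
    return sum((y - mean) ** 2 for y in values)


def best_numeric_split(a, y):
    """Return (threshold, error) minimising Eq. (4) for one numerical variable.

    Assumption: samples with a >= threshold go to the right node,
    all others go to the left node.
    """
    best = None
    # Skip the smallest value: it would leave the left node empty.
    for v in sorted(set(a))[1:]:
        left = [yi for ai, yi in zip(a, y) if v > ai]
        right = [yi for ai, yi in zip(a, y) if ai >= v]
        error = sse(left) + sse(right)
        if best is None or best[1] > error:
            best = (v, error)
    return best


# Hypothetical toy data: a flow-like predictor and a nutrient-like response.
a = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]
threshold, error = best_numeric_split(a, y)
```
</preformat>
      <p>In a full CART, the same search would be repeated over every candidate variable and applied recursively to each resulting node.</p>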
      <p id="d1e1122">Compared to LM and WRTDS models, one drawback of RF and GBM, as with many
ML methods in general, is that no explicit equation
describes the model structure. However, GBM and RF do provide the
relative importance of each variable, which is based on the empirical
improvement in the loss function due to splits on that variable
in a tree (Povak et al., 2014;
Puissant et al., 2014). The improvement attributed to a variable is averaged
over all trees and used as the relative importance of that variable in the
final model. This relative importance serves as the key index for understanding
the model structure of RF and GBM
(Makler-Pick et al., 2011).</p>
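      <p>The averaging described above can be sketched as follows. The input layout (one list of variable and improvement pairs per tree), the variable names and the rescaling to a total of 100 are assumptions for this illustration; ML packages may record and scale importance differently.</p>
<preformat>
```python
from collections import defaultdict


def relative_importance(trees):
    """Average each variable's loss improvement over all trees.

    `trees` holds one entry per tree; each entry is a list of
    (variable_name, improvement) pairs, one pair per split in that tree.
    Assumption: results are rescaled so that the importances sum to 100.
    """
    totals = defaultdict(float)
    for splits in trees:
        for name, improvement in splits:
            totals[name] += improvement
    averaged = {name: total / len(trees) for name, total in totals.items()}
    scale = sum(averaged.values())
    return {name: 100.0 * value / scale for name, value in averaged.items()}


# Hypothetical improvements recorded for two small trees.
trees = [
    [("QF", 4.0), ("BF", 1.0)],
    [("QF", 2.0), ("JD", 1.0)],
]
importance = relative_importance(trees)
```
</preformat>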
</sec>
<sec id="Ch1.S2.SS3">
  <label>2.3</label><title>Baseflow separation</title>
      <p id="d1e1133">Total flow is commonly conceptualised as including baseflow and quickflow
components (Meshgi et al., 2015). Baseflow separation
techniques use the time-series record of streamflow to extract the baseflow
and quickflow signatures from the total flow. This can be done by using
graphical methods to identify the intersection between baseflow and the
rising and falling limbs of the quickflow response (Szilagyi
and Parlange, 1998) or by filter methods, which process the entire stream
hydrograph to derive a baseflow hydrograph (Furey and
Gupta, 2001). In this study, a three-pass filter method was applied
for baseflow separation; quickflow was first estimated as described
below (Lyne and Hollick, 1979; Nathan and McMahon,
1990), and baseflow was then calculated:
            <disp-formula id="Ch1.E5" content-type="numbered"><label>5</label><mml:math id="M65" display="block"><mml:mrow><mml:msub><mml:mtext>QF</mml:mtext><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mi mathvariant="italic">α</mml:mi><mml:msub><mml:mtext>QF</mml:mtext><mml:mrow><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi>Q</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>Q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>+</mml:mo><mml:mi mathvariant="italic">α</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M66" display="inline"><mml:mrow><mml:msub><mml:mtext>QF</mml:mtext><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the filtered quickflow for the <inline-formula><mml:math id="M67" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th sampling instant, <inline-formula><mml:math id="M68" display="inline"><mml:mrow><mml:msub><mml:mtext>QF</mml:mtext><mml:mrow><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>
is the filtered quickflow for the previous sampling instant, <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>Q</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the total flow for the <inline-formula><mml:math id="M69" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th sampling instant, and <inline-formula><mml:math id="M70" display="inline"><mml:mi mathvariant="italic">α</mml:mi></mml:math></inline-formula> is the filter parameter,
with a value of 0.925 for daily flow as recommended by Nathan and McMahon (1990). Baseflow is then
calculated as <inline-formula><mml:math id="M71" display="inline"><mml:mrow><mml:mtext>BF</mml:mtext><mml:mo>=</mml:mo><mml:mi>Q</mml:mi><mml:mo>-</mml:mo><mml:mtext>QF</mml:mtext></mml:mrow></mml:math></inline-formula>.</p>
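      <p>A minimal Python sketch of Eq. (5) and the three-pass procedure is given below for illustration. Several details are assumptions not stated in the text: quickflow is initialised at zero, each filtered value is clamped so that baseflow stays between zero and the input flow, and the passes run forward, backward and forward, with each later pass filtering the baseflow from the previous pass. The EcoHydRology routine used in this study may differ in these details.</p>
<preformat>
```python
def quickflow_pass(q, alpha=0.925):
    """One pass of Eq. (5) over the sequence q, returning filtered quickflow.

    Assumptions: quickflow starts at zero, and each value is clamped to
    the range [0, q_i] so baseflow is never negative or above the input flow.
    """
    qf = [0.0] * len(q)
    for i in range(1, len(q)):
        value = alpha * qf[i - 1] + (q[i] - q[i - 1]) * (1 + alpha) / 2
        qf[i] = min(max(value, 0.0), q[i])
    return qf


def separate_baseflow(q, alpha=0.925):
    """Three-pass separation (forward, backward, forward); returns (bf, qf)."""
    bf = list(q)
    for direction in (1, -1, 1):
        signal = bf[::direction]          # reverse the series on the backward pass
        qf = quickflow_pass(signal, alpha)
        bf = [s - f for s, f in zip(signal, qf)][::direction]
    quick = [total - base for total, base in zip(q, bf)]
    return bf, quick


# Hypothetical daily flow record with a single storm peak.
flow = [1.0, 1.0, 5.0, 3.0, 2.0, 1.5, 1.2, 1.0]
baseflow, quickflow = separate_baseflow(flow)
```
</preformat>
      <p>By construction, the two returned series sum to the total flow at every time step.</p>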
</sec>
<sec id="Ch1.S2.SS4">
  <label>2.4</label><title>Performance evaluation metrics</title>
      <p id="d1e1272">In this study, the root mean squared error (RMSE) and the Nash–Sutcliffe
model efficiency coefficient (MEF) were used to compare model performance.
The RMSE is a measure of overall error between the predicted and measured
data and returns an error value with the same units as the data, which is
given by the following equation:
            <disp-formula id="Ch1.E6" content-type="numbered"><label>6</label><mml:math id="M72" display="block"><mml:mrow><mml:mtext>RMSE</mml:mtext><mml:mo>=</mml:mo><mml:msqrt><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>∑</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow><mml:mi>n</mml:mi></mml:mfrac></mml:mstyle></mml:msqrt><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
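          where <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the measured and predicted values, respectively, and <inline-formula><mml:math id="M73" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> is the number of data samples. RMSE varies from 0 to <inline-formula><mml:math id="M74" display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mi mathvariant="normal">∞</mml:mi></mml:mrow></mml:math></inline-formula>,
and a perfect model would have an RMSE of 0. The MEF is a dimensionless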
“goodness-of-fit” measure which can vary from <inline-formula><mml:math id="M75" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mi mathvariant="normal">∞</mml:mi></mml:mrow></mml:math></inline-formula> to 1, with a
value of 1 indicating a perfect fit and 0 indicating that the mean of the
measured values performs as well as the model. The MEF can be calculated as
            <disp-formula id="Ch1.E7" content-type="numbered"><label>7</label><mml:math id="M76" display="block"><mml:mrow><mml:mtext>MEF</mml:mtext><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>∑</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow><mml:mrow><mml:mo>∑</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M77" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the mean of the measured values. Note that the
predicted and measured nutrient values were normalised to [0, 1] in this
study to compare model performance across different nutrient species.</p>
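      <p>For illustration, Eqs. (6) and (7) can be written as the short Python functions below. The function names and the example values are hypothetical, and the inputs are assumed to have been normalised beforehand.</p>
<preformat>
```python
from math import sqrt
from statistics import fmean


def rmse(measured, predicted):
    """Root mean squared error, Eq. (6)."""
    squared = [(y - p) ** 2 for y, p in zip(measured, predicted)]
    return sqrt(sum(squared) / len(squared))


def mef(measured, predicted):
    """Nash-Sutcliffe model efficiency, Eq. (7)."""
    mean = fmean(measured)
    residual = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    spread = sum((y - mean) ** 2 for y in measured)
    return 1.0 - residual / spread


# Hypothetical normalised concentrations.
measured = [0.0, 0.25, 0.5, 1.0]
predicted = [0.1, 0.25, 0.4, 1.0]
error = rmse(measured, predicted)
efficiency = mef(measured, predicted)
```
</preformat>
      <p>Predicting the mean of the measured values for every sample gives an MEF of 0, which is the reference point described above.</p>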
</sec>
<sec id="Ch1.S2.SS5">
  <label>2.5</label><title>Overview of modelling processes</title>
      <p id="d1e1435">The main aims of this research are to test the hybrid model, rebuild the historical nutrient data,
and explore the short- and long-term nutrient changes. The first step was to verify the model
performance. For this, the data were randomly divided into training and testing datasets in a ratio of <inline-formula><mml:math id="M78" display="inline"><mml:mrow><mml:mn mathvariant="normal">80</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">20</mml:mn></mml:mrow></mml:math></inline-formula>. Different models were built
and tuned on the training dataset (80 %) and tested on the testing dataset (20 %). To further
test model uncertainty and stability, the splitting and testing process was repeated 30 times for all models except WRTDS. After this, all data points, including the testing data, were used to rebuild the
historical nutrient data. Five-fold cross validation (CV) was performed on the training dataset to tune
the model parameters. Leave-one-out cross validation (LOOCV) was used in WRTDS to predict daily
nutrient concentrations; LOOCV is the default cross-validation method<?pagebreak page4257?> in the EGRET (Exploration and Graphics for RivEr Trends) package. In that
method, one data point is excluded at a time from the whole dataset, all other data points are
used to build the model, and the excluded point is used to test the model performance. This
process is repeated for all data points. The performance of all six methods (LM, WRTDS, RF, GBM,
hybrid RF and hybrid GBM) was evaluated on the testing dataset. WRTDS was run through the EGRET package (Hirsch and De Cicco, 2015) in R to produce
daily concentrations for six nutrient species (TP, TN, DON, DOC, <inline-formula><mml:math id="M79" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and FRP). The
default settings specified by the user guide (Hirsch and De Cicco, 2015) were used. RF and GBM
models were built through the H2O package in R.</p>
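      <p>The resampling scheme described above can be sketched with index lists, as in the following Python example. The seed, the number of observations and the way folds are assigned are assumptions for this illustration only; the study itself relied on R packages for model fitting and validation.</p>
<preformat>
```python
import random


def train_test_split(n, test_fraction=0.2, rng=None):
    """Randomly split the indices 0..n-1 into training and testing lists."""
    rng = rng or random.Random()
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (fit, validate) index lists for k-fold cross validation."""
    for fold in range(k):
        validate = indices[fold::k]
        held_out = set(validate)
        fit = [i for i in indices if i not in held_out]
        yield fit, validate


def leave_one_out(n):
    """Yield (fit, held_out_index) pairs, one per data point."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i


rng = random.Random(42)  # hypothetical seed, for reproducibility only
n_samples = 100          # hypothetical number of observations
repeats = [train_test_split(n_samples, rng=rng) for _ in range(30)]
train, test = repeats[0]
folds = list(k_fold(train, k=5))
```
</preformat>
      <p>Each of the 30 repeats would then be used to tune a model on the five folds of its training indices and to score it on the corresponding testing indices.</p>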

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T1" specific-use="star"><?xmltex \currentcnt{1}?><label>Table 1</label><caption><p id="d1e1477">Variable list and descriptions.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="5">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="left"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:colspec colnum="5" colname="col5" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Variable type</oasis:entry>
         <oasis:entry colname="col2">Variable name</oasis:entry>
         <oasis:entry colname="col3">Abbreviation</oasis:entry>
         <oasis:entry colname="col4">Unit</oasis:entry>
         <oasis:entry colname="col5">Data source (last access: 9 September 2020)</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Hydrological data</oasis:entry>
         <oasis:entry colname="col2">Total discharge</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M81" display="inline"><mml:mi>Q</mml:mi></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M82" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">m</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msup><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">s</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><uri>http://wir.water.wa.gov.au</uri></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Average total discharge in last <inline-formula><mml:math id="M83" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> days</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M84" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>Q</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M85" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">m</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msup><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">s</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Lagged average</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Quickflow</oasis:entry>
         <oasis:entry colname="col3">QF</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M86" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">m</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msup><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">s</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Equation (5)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Average quickflow in last <inline-formula><mml:math id="M87" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> days</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M88" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>QF</mml:mtext><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M89" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">m</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msup><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">s</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Lagged average</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Baseflow</oasis:entry>
         <oasis:entry colname="col3">BF</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M90" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">m</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msup><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">s</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Equation (5)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Average baseflow in last <inline-formula><mml:math id="M91" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> days</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M92" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>BF</mml:mtext><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M93" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">m</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msup><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">s</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Lagged average</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Seasonal components</oasis:entry>
         <oasis:entry colname="col2">Julian day</oasis:entry>
         <oasis:entry colname="col3"><italic>JD</italic></oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">Recorded</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Cos (Julian day)</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M94" display="inline"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:mo>(</mml:mo><mml:mtext>JD</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">Calculated</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Sin (Julian day)</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M95" display="inline"><mml:mrow><mml:mi>sin⁡</mml:mi><mml:mo>(</mml:mo><mml:mtext>JD</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">Calculated</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Meteorological data</oasis:entry>
         <oasis:entry colname="col2">Rainfall</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M96" display="inline"><mml:mi>P</mml:mi></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4">mm</oasis:entry>
         <oasis:entry colname="col5"><uri>http://www.bom.gov.au</uri></oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Cumulative rainfall in last <inline-formula><mml:math id="M97" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> days</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M98" display="inline"><mml:mrow><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mi>x</mml:mi></mml:munderover><mml:mi>P</mml:mi></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4">mm</oasis:entry>
         <oasis:entry colname="col5">Lagged sum</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Nutrient data</oasis:entry>
         <oasis:entry colname="col2">Total nitrogen</oasis:entry>
         <oasis:entry colname="col3">TN</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M99" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><uri>http://wir.water.wa.gov.au</uri></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Total phosphorus</oasis:entry>
         <oasis:entry colname="col3">TP</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M100" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><uri>http://wir.water.wa.gov.au</uri></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Dissolved organic carbon</oasis:entry>
         <oasis:entry colname="col3">DOC</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M101" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><uri>http://wir.water.wa.gov.au</uri></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Dissolved organic nitrogen</oasis:entry>
         <oasis:entry colname="col3">DON</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M102" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><uri>http://wir.water.wa.gov.au</uri></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Ammonia</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M103" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M104" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><uri>http://wir.water.wa.gov.au</uri></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Filterable reactive phosphorus</oasis:entry>
         <oasis:entry colname="col3">FRP</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M105" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><uri>http://wir.water.wa.gov.au</uri></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Generated dissolved organic nitrogen</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M106" display="inline"><mml:mrow><mml:msub><mml:mtext>DON</mml:mtext><mml:mtext>generated</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M107" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Generated by the intermediate model</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Generated total phosphorus</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M108" display="inline"><mml:mrow><mml:msub><mml:mtext>TP</mml:mtext><mml:mtext>generated</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M109" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Generated by the intermediate model</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Generated dissolved organic carbon</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M110" display="inline"><mml:mrow><mml:msub><mml:mtext>DOC</mml:mtext><mml:mtext>generated</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M111" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5">Generated by the intermediate model</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d1e2263">The overall process of ML-SWAN can be divided into three stages (Fig. 1). The first stage was
baseflow separation using the EcoHydRology package (Fuka et al., 2018). The generated baseflow,
quickflow, total flow and rainfall were further transformed into lagged data (the averaged values, or the cumulative sums for rainfall,
over the previous 3, 7 and 15 days) to capture any short-term impacts of different water pathways
and rainfall on stream nutrients. JD, <inline-formula><mml:math id="M112" display="inline"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:mo>(</mml:mo><mml:mtext>JD</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M113" display="inline"><mml:mrow><mml:mi>sin⁡</mml:mi><mml:mo>(</mml:mo><mml:mtext>JD</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> were
also calculated for RF and GBM to include seasonal and long-term impacts. A description of all the variables used is given in Table 1.</p>
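      <p>The feature construction of the first stage can be sketched as follows. Two points are assumptions, because the text does not specify them: the lag window excludes the current day, and JD is scaled to one annual cycle (multiplied by 2π/365.25) before the cosine and sine are taken. Function and feature names are hypothetical.</p>
<preformat>
```python
from math import cos, sin, pi
from statistics import fmean


def lagged_mean(series, i, window):
    """Mean of the `window` values before index i (None if too few exist).

    Assumption: the current day i is excluded from the window.
    """
    if window > i:
        return None
    return fmean(series[i - window:i])


def lagged_sum(series, i, window):
    """Sum of the `window` values before index i (None if too few exist)."""
    if window > i:
        return None
    return sum(series[i - window:i])


def seasonal_terms(julian_day, period=365.25):
    """cos and sin of the day of year, mapped onto one annual cycle.

    Assumption: JD is scaled by 2*pi/period before taking cos and sin.
    """
    angle = 2 * pi * julian_day / period
    return cos(angle), sin(angle)


# Hypothetical daily series, 20 days long.
flow = [float(day) for day in range(1, 21)]
rain = [0.0, 5.0] * 10
day = 15
features = {f"Q_mean_{x}": lagged_mean(flow, day, x) for x in (3, 7, 15)}
features.update({f"P_sum_{x}": lagged_sum(rain, day, x) for x in (3, 7, 15)})
features["cos_JD"], features["sin_JD"] = seasonal_terms(day + 1)
```
</preformat>
      <p>The same lagged means would be computed for quickflow and baseflow, following the variable list in Table 1.</p>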

      <?xmltex \floatpos{t}?><fig id="Ch1.F1" specific-use="star"><?xmltex \currentcnt{1}?><label>Figure 1</label><caption><p id="d1e2297">Overall modelling processes of ML-SWAN.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020-f01.png"/>

        </fig>

      <p id="d1e2306">The second stage of ML-SWAN was to build intermediate RF and GBM models that generated daily
nutrient concentrations. For the intermediate RF and GBM models, only lagged hydrological data
(including total flow, baseflow and quickflow), lagged rainfall and seasonal components on the
training dataset were used. Nutrients were not used as predictors in the intermediate models. Note
that, in this study, TP, TN, DOC and DON were selected to be generated in the second stage. If one
nutrient was the final target, daily data were generated for the other three nutrients.
For instance, daily TP, DOC and DON were generated as additional variables to predict TN. In
that case, the missing TP, DOC and DON were generated by the intermediate model for both the training
dataset and the testing dataset. Daily TN, TP, DOC and DON data were thus generated and used for the
final predictions. These nutrients were selected because they may derive from similar sources or are important components of the total nutrient load. For instance, DOC and DON may both
derive from dissolved organic matter (DOM) (Seitzinger et al., 2002; Bernal et al., 2005; Filep and Rékási,
2011). In the catchments studied here, DON can be a dominant component of TN (Nice et al., 2009;
Petrone, 2010; Bourke et al., 2015). The selection of DOC and DON for pre-generation may not
necessarily be appropriate for other catchments; the selection of nutrients for pre-generation
depends on data availability. Using different species of the same nutrient (N
or P) can generally improve model performance.</p>
      <p id="d1e2309">The third stage of ML-SWAN built an additional hybrid model on the training data, which comprised the
nutrient data generated by the intermediate models, lagged hydrological data, lagged rainfall data
and seasonal components. Note that at this stage, the only difference between the stand-alone ML and
hybrid ML methods was that the stand-alone ML methods did not use pre-generated daily nutrient data.</p>
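      <p>The data flow of the second and third stages can be sketched as below. To keep the example self-contained, a simple nearest-neighbour averaging learner stands in for RF and GBM; it is not one of the methods used in this study. The class, function and argument names are hypothetical, and the sketch assumes that generated (rather than measured) values of the helper nutrients are used for every row.</p>
<preformat>
```python
from statistics import fmean


class MeanOfNearest:
    """Stand-in learner (NOT RF or GBM): averages the k nearest training targets."""

    def __init__(self, k=2):
        self.k = k

    def fit(self, rows, targets):
        self.rows, self.targets = list(rows), list(targets)
        return self

    def predict(self, rows):
        out = []
        for row in rows:
            dist = [sum((a - b) ** 2 for a, b in zip(row, r)) for r in self.rows]
            nearest = sorted(range(len(dist)), key=dist.__getitem__)[: self.k]
            out.append(fmean([self.targets[i] for i in nearest]))
        return out


def hybrid_predict(base_rows, sampled, target, new_rows, make_model=MeanOfNearest):
    """Stages 2 and 3 for one target nutrient.

    base_rows: lagged flow, lagged rainfall and seasonal predictors per sample.
    sampled:   dict of nutrient name -> measured values aligned with base_rows.
    new_rows:  base predictors for the days to be predicted.
    """
    helpers = [name for name in sampled if name != target]
    # Stage 2: intermediate models use base predictors only, never nutrients.
    intermediate = {n: make_model().fit(base_rows, sampled[n]) for n in helpers}

    def augment(rows):
        generated = [intermediate[n].predict(rows) for n in helpers]
        return [list(r) + [g[i] for g in generated] for i, r in enumerate(rows)]

    # Stage 3: the hybrid model uses base predictors plus generated nutrients.
    final = make_model().fit(augment(base_rows), sampled[target])
    return final.predict(augment(new_rows))


# Hypothetical training samples with a single base predictor.
base = [[1.0], [2.0], [3.0], [4.0]]
sampled = {
    "TN": [1.0, 2.0, 3.0, 4.0],
    "TP": [0.1, 0.2, 0.3, 0.4],
    "DOC": [10.0, 20.0, 30.0, 40.0],
    "DON": [0.5, 1.0, 1.5, 2.0],
}
tn_pred = hybrid_predict(base, sampled, "TN", [[2.5]])
```
</preformat>
      <p>A stand-alone model in this sketch would correspond to fitting the final learner on the base predictors alone, without the augmentation step.</p>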
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Site overview</title>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T2" specific-use="star"><?xmltex \currentcnt{2}?><label>Table 2</label><caption><p id="d1e2324">Hydrological characteristics of the two tributaries.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="5">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="left"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:colspec colnum="5" colname="col5" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Site</oasis:entry>
         <oasis:entry colname="col2">Hydrological type</oasis:entry>
         <oasis:entry colname="col3">Annual flow (GL)</oasis:entry>
         <oasis:entry colname="col4">Area (<inline-formula><mml:math id="M114" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">km</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>)</oasis:entry>
         <oasis:entry colname="col5">Land use</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Ellen Brook</oasis:entry>
         <oasis:entry colname="col2">Ephemeral</oasis:entry>
         <oasis:entry colname="col3">26.7</oasis:entry>
         <oasis:entry colname="col4">716</oasis:entry>
         <oasis:entry colname="col5">Rural, agricultural and grazing</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Murray River</oasis:entry>
         <oasis:entry colname="col2">Perennial</oasis:entry>
         <oasis:entry colname="col3">360</oasis:entry>
         <oasis:entry colname="col4">7855</oasis:entry>
         <oasis:entry colname="col5">Agricultural and natural reserves</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d1e2414">To test the generalisability of the hybrid framework, two sites in Western Australia (Ellen Brook
and Murray River) were selected as study areas. Ellen Brook and Murray River are key tributaries of
the Swan–Canning Estuary and Peel–Harvey Estuary (Fig. 2), respectively, and have different hydrological
conditions. The Swan–Canning Estuary is located adjacent to the Perth metropolitan area, with an
area of approximately 40 <inline-formula><mml:math id="M115" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">km</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>. The Swan–Canning catchment comprises 30 sub-catchments, which drain
approximately 2090 <inline-formula><mml:math id="M116" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">km</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> (Kelsey et al., 2010). Ellen Brook is the largest sub-catchment in
the Swan–Canning catchment, comprising 34 % (716 <inline-formula><mml:math id="M117" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">km</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>) of the total catchment
area. Ellen Brook is an ephemeral river with no flow recorded during summer and the early autumn months
(Table 2). The dominant land use in Ellen Brook is agricultural and grazing land. Ellen Brook is one
of the highest contributors of TN and TP to the Swan–Canning Estuary (Swan River Trust,
2009). Bassendean sands and duplex Yanga (sand over clay) soils dominate the Ellen Brook
catchment. Bassendean sands have very low phosphorus retention indices (PRIs), while Yanga soils have
low PRIs in their upper horizon and become waterlogged in winter, promoting the release of retained
nutrients to the stream (Kelsey et al., 2010).</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F2"><?xmltex \currentcnt{2}?><label>Figure 2</label><caption><p id="d1e2452">The location of Ellen Brook and Murray River.</p></caption>
        <?xmltex \igopts{width=236.157874pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020-f02.png"/>

      </fig>

      <p id="d1e2462">The Peel–Harvey Estuary is located approximately 75 <inline-formula><mml:math id="M118" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">km</mml:mi></mml:mrow></mml:math></inline-formula> south of the Swan–Canning Estuary, and the Serpentine, Murray and Harvey Rivers drain into the estuary (Fig. 2). The total catchment
area of the estuary is approximately 11 930 <inline-formula><mml:math id="M119" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">km</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>. The Murray River catchment is dominated
by deep grey sands, loams, clay and peats (Ruibal-Conti et al., 2013), agricultural land use, and
natural reserves, and it contributes about 40 % of annual TN loads and 7 % of annual TP
loads to the estuary (Kelsey et al., 2011).</p>

<?xmltex \floatpos{p}?><table-wrap id="Ch1.T3"><?xmltex \currentcnt{3}?><label>Table 3</label><caption><p id="d1e2487">Nutrient sampling time and sample size in Ellen Brook and Murray
River.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="left"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Site</oasis:entry>
         <oasis:entry colname="col2">Nutrient</oasis:entry>
         <oasis:entry colname="col3">First measurement</oasis:entry>
         <oasis:entry colname="col4">Sample size</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Ellen Brook</oasis:entry>
         <oasis:entry colname="col2">TN</oasis:entry>
         <oasis:entry colname="col3">1990</oasis:entry>
         <oasis:entry colname="col4">1057</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">TP</oasis:entry>
         <oasis:entry colname="col3">1990</oasis:entry>
         <oasis:entry colname="col4">1022</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">DOC</oasis:entry>
         <oasis:entry colname="col3">1995</oasis:entry>
         <oasis:entry colname="col4">297</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">DON</oasis:entry>
         <oasis:entry colname="col3">2006</oasis:entry>
         <oasis:entry colname="col4">129</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">FRP</oasis:entry>
         <oasis:entry colname="col3">1990</oasis:entry>
         <oasis:entry colname="col4">404</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M120" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3">1990</oasis:entry>
         <oasis:entry colname="col4">356</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Murray River</oasis:entry>
         <oasis:entry colname="col2">TN</oasis:entry>
         <oasis:entry colname="col3">1983</oasis:entry>
         <oasis:entry colname="col4">1648</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">TP</oasis:entry>
         <oasis:entry colname="col3">1983</oasis:entry>
         <oasis:entry colname="col4">1662</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">DOC</oasis:entry>
         <oasis:entry colname="col3">2006</oasis:entry>
         <oasis:entry colname="col4">209</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">DON</oasis:entry>
         <oasis:entry colname="col3">2006</oasis:entry>
         <oasis:entry colname="col4">207</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">FRP</oasis:entry>
         <oasis:entry colname="col3">1990</oasis:entry>
         <oasis:entry colname="col4">300</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M121" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3">1983</oasis:entry>
         <oasis:entry colname="col4">570</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d1e2715">Both the Swan–Canning Estuary and the Peel–Harvey Estuary experience a Mediterranean climate with cool, wet
winters (June–August) and hot, dry summers (December–March). The long-term average annual rainfall
varies from 1300 <inline-formula><mml:math id="M122" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi></mml:mrow></mml:math></inline-formula> on the coast to 800 <inline-formula><mml:math id="M123" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi></mml:mrow></mml:math></inline-formula> in the south-east of the catchment area
(1975–2009, Bureau of Meteorology station), and about 90 % of the rain falls between April and
October. The sample size and first measurement year of the six nutrient species are listed for the two
study sites in Table 3. TN, TP, <inline-formula><mml:math id="M124" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and FRP have been monitored for decades, while DOC
and DON have only been measured in recent years, with limited sample size. Several historical
nutrient datasets were combined, but significant changes occurred in water sampling devices and
analytical instrumentation over the past decades. These changes can increase the complexity of
nutrient data.<?pagebreak page4258?> For instance, auto-samplers sampled at any time regardless of weather conditions (e.g.
during rainfall), while grab samples were typically collected under fine weather conditions due
to safety concerns.</p>
</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Results</title>
<sec id="Ch1.S4.SS1">
  <label>4.1</label><title>Comparison of prediction accuracy between six methods</title>
      <?pagebreak page4259?><p id="d1e2760">Overall, the scaled RMSE decreased in the order LM, WRTDS, stand-alone ML and hybrid ML for all nutrients
except <inline-formula><mml:math id="M125" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, and the same pattern was found for MEF in both Ellen Brook and Murray River
(Fig. 3). The linear model had the worst performance: the scaled RMSE was significantly higher and
MEF was significantly lower than the other models, for all six nutrients and across both
sites. WRTDS generally had higher RMSE and lower MEF than the stand-alone ML, although it achieved
similar results to stand-alone ML for FRP and <inline-formula><mml:math id="M126" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> at both sites. LOOCV was used in WRTDS, and only one set of results was generated, compared to 30 RMSE and MEF values for other
methods. This results in a shortened line for WRTDS in Fig. 3, instead of the interquartile ranges
(<inline-formula><mml:math id="M127" display="inline"><mml:mrow><mml:mtext>IQR</mml:mtext><mml:mo>=</mml:mo><mml:mn mathvariant="normal">75</mml:mn></mml:mrow></mml:math></inline-formula>th percentile <inline-formula><mml:math id="M128" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula> 25th percentile) presented for the other methods. LOOCV
can sometimes overestimate the model performance as only one sample is tested at a time; in
contrast, 20 % of the data were held out as independent testing data for the other five models. LOOCV can
also have a higher variance than other CV methods (Li, 2016). As such, the WRTDS results are not
directly comparable to the other methods.</p>
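      <p>To make the summary statistics in Fig. 3 concrete, the following Python sketch computes a scaled RMSE, a model efficiency (MEF) and the IQR of a set of repeated scores. It is a hedged illustration only: scaling the RMSE by the mean of the observations and using the Nash–Sutcliffe form of MEF are assumptions of this sketch, the function names are hypothetical, and the definitions given in the methods section take precedence.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math
import statistics


def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def scaled_rmse(obs, pred):
    """RMSE divided by the mean observation (assumed scaling)."""
    return rmse(obs, pred) / statistics.fmean(obs)


def mef(obs, pred):
    """Model efficiency in the Nash-Sutcliffe form (assumed definition).

    1 is a perfect fit; 0 means no better than predicting the observed mean.
    """
    mean_obs = statistics.fmean(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def iqr(values):
    """Interquartile range: 75th percentile minus 25th percentile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Made-up numbers for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
perfect = list(obs)
mean_only = [statistics.fmean(obs)] * len(obs)
print(mef(obs, perfect), mef(obs, mean_only), iqr([1, 2, 3, 4, 5]))
```
]]></preformat>
      <p>With 30 repeated train and test splits, <monospace>iqr</monospace> would be applied to the 30 RMSE or MEF values of a method; a method evaluated once, as for WRTDS with LOOCV, yields a single value and therefore no range.</p>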

      <?xmltex \floatpos{t}?><fig id="Ch1.F3" specific-use="star"><?xmltex \currentcnt{3}?><label>Figure 3</label><caption><p id="d1e2806">Model performance across six nutrients and the two sites: <bold>(a)</bold> RMSE
and <bold>(b)</bold> MEF for Ellen Brook; <bold>(c)</bold> RMSE and <bold>(d)</bold> MEF results for Murray River.</p></caption>
          <?xmltex \igopts{width=483.69685pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020-f03.png"/>

        </fig>

      <p id="d1e2827">Stand-alone ML achieved results that placed it between WRTDS and hybrid ML.  Stand-alone GBM
achieved the highest accuracy for <inline-formula><mml:math id="M129" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> prediction in Murray River. Hybrid RF and hybrid GBM
had the lowest RMSE and highest MEF for all nutrients except <inline-formula><mml:math id="M130" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, in Ellen Brook and
Murray River (Fig. 3). Compared to the stand-alone ML, the hybrid ML also had much lower prediction
uncertainty, in that its RMSE and MEF had narrower IQRs than those of the stand-alone ML, especially
for DON and FRP prediction in Ellen Brook and DOC prediction in Murray River. The use of
pre-generated daily nutrient data was the only difference between hybrid ML and stand-alone ML. This
means that the generated nutrients provided additional information for the hybrid model that allowed
more stable results. Interestingly, while the hybrid ML had better performance than the stand-alone
ML, there was no significant difference in performance between the hybrid RF and hybrid GBM, though
they showed differences between different nutrient species. For instance, hybrid RF achieved
slightly better performance for DOC in Ellen Brook, while hybrid GBM had lower RMSE for DOC in
Murray River. There was no significant performance difference between stand-alone RF and GBM.</p>
      <p id="d1e2853">In summary, the hybrid ML had the best performance amongst the six methods,
followed by stand-alone RF and GBM. WRTDS was better than the linear model
but could only achieve results similar to stand-alone RF and GBM for
<inline-formula><mml:math id="M131" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> prediction in Ellen Brook and for <inline-formula><mml:math id="M132" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and FRP prediction in
Murray River.</p>
</sec>
<sec id="Ch1.S4.SS2">
  <label>4.2</label><title>Generated daily TN in Ellen Brook</title>
      <p id="d1e2886">Model performance for six nutrients was compared in the previous section. For
conciseness, this section compares the six models only in their
ability to generate daily TN in Ellen Brook from 1 January 1989 to 16 July 2018
(Fig. 4). The daily TN in Murray River and daily TP at both sites were
also generated (see results in the Supplement). TN was selected
because TN is the most important and most frequently<?pagebreak page4260?> measured nutrient in
many places. This hybrid method can also be used for other nutrients. Note
that all data points (not just the 80 % training dataset) were used to
generate daily TN.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F4" specific-use="star"><?xmltex \currentcnt{4}?><label>Figure 4</label><caption><p id="d1e2891">Daily TN generated by the six models for Ellen Brook.</p></caption>
          <?xmltex \igopts{width=412.564961pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020-f04.png"/>

        </fig>

      <p id="d1e2900">The LM performed very poorly for TN prediction; low-concentration samples
(<inline-formula><mml:math id="M133" display="inline"><mml:mrow><mml:mtext>TN</mml:mtext><mml:mo>&lt;</mml:mo><mml:mn mathvariant="normal">1.9</mml:mn><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mrow></mml:math></inline-formula>) were all underestimated, and some extremely high
concentrations were incorrectly generated due to the high flow (Fig. 4a). There were some seasonal
patterns in the generated TN, which came from the flow data. LM only used total flow to predict
nutrient concentrations, while other important hydrological processes were ignored. Thus the
oversimplified LM had high errors in nutrient prediction (Fig. 4), and this method might be more
suitable for solutes that are not substantially bioactive (e.g. <inline-formula><mml:math id="M134" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">SiO</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M135" display="inline"><mml:mrow class="chem"><mml:msup><mml:mi mathvariant="normal">Ca</mml:mi><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>+</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>,
<inline-formula><mml:math id="M136" display="inline"><mml:mrow class="chem"><mml:msup><mml:mi mathvariant="normal">Mg</mml:mi><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>+</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M137" display="inline"><mml:mrow class="chem"><mml:msup><mml:mi mathvariant="normal">Cl</mml:mi><mml:mo>-</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>) (Stallard and Murphy, 2014). The WRTDS captured some seasonal
patterns of TN (from 2008 to 2018) but still had problems predicting TN between 1989 and 1996; some
extremely high values were generated, and <inline-formula><mml:math id="M138" display="inline"><mml:mrow><mml:mtext>TN</mml:mtext><mml:mo>&lt;</mml:mo><mml:mn mathvariant="normal">1.0</mml:mn><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mrow></mml:math></inline-formula> were
overestimated. Some high values (e.g. TN in 2008) were underestimated (Fig. 4b). Stand-alone ML
and hybrid ML generated similar daily TN data that differed in the detail. These models successfully
captured the low-concentration data and the seasonal pattern of TN. Unlike the WRTDS results, the
TN generated by stand-alone ML and hybrid ML had a more consistent seasonal pattern from 1989 to
2018. The RF and hybrid RF both underestimated a few high-concentration data
(<inline-formula><mml:math id="M139" display="inline"><mml:mrow><mml:mtext>TN</mml:mtext><mml:mo>&gt;</mml:mo><mml:mn mathvariant="normal">4.0</mml:mn><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mrow></mml:math></inline-formula>), compared to GBM and hybrid GBM, although hybrid RF still
showed better performance than RF. For instance, high-concentration data in 2007 and again from 2014
to 2017 were successfully predicted by hybrid RF but underestimated by RF. Compared to stand-alone
GBM, the hybrid GBM achieved lower errors for high-concentration data.</p>
      <p id="d1e3030">Apart from the better performance for high-concentration data, another difference between
stand-alone ML and hybrid ML was that the long-term trend in TN was consistent in stand-alone ML,
but this trend fluctuated in hybrid ML. For instance, hybrid GBM results fluctuated from 1989 to
1999 and then showed an increasing long-term trend from 2005 to 2018, in addition to the seasonal
fluctuation. The pre-generated nutrient data are the only difference between the stand-alone and hybrid
models. If there are long-term trends in nutrient concentrations (e.g. TN), similar trends should
also exist in the components of TN (either DON or dissolved inorganic nitrogen). The pre-generated
nutrients carry these trends into the hybrid model. This suggests that the generated nutrient data
could provide additional information that allowed the hybrid ML to capture long-term trends; this
information was not included in the seasonal components but existed in the generated nutrient data.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F5"><?xmltex \currentcnt{5}?><label>Figure 5</label><caption><p id="d1e3035">The distribution of the daily TN generated by the six models and that of the measured TN data in Ellen Brook.</p></caption>
          <?xmltex \igopts{width=236.157874pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020-f05.png"/>

        </fig>

      <p id="d1e3044">The distribution of the TN data generated by the six models was compared to the distribution of the
measured TN data (Fig. 5). Similar to the results shown in Fig. 4, hybrid GBM had the most similar
distribution to the measured TN data. Only a few low- and high-concentration data were incorrectly
predicted by the hybrid GBM. Hybrid RF also achieved a distribution similar to the measured data,
but more extreme-value data were underestimated compared to the hybrid GBM. Stand-alone GBM and RF
showed a similar distribution to the hybrid GBM and RF with less accuracy in the extreme
data. Overall, GBM (either stand-alone or hybrid) reproduced the measured distribution better than
RF. WRTDS generated some extremely high data and overestimated many low-concentration data, which
is also seen in Fig. 4b. The linear model incorrectly predicted most of the TN data. The results
in both Figs. 4 and 5 showed that hybrid GBM achieved the best simulated daily TN data, followed by
hybrid RF, stand-alone GBM and RF. WRTDS and LM generated large biases in TN prediction.</p>
      <p id="d1e3047">The hybrid ML models predicted most of the extreme concentrations (Figs. 4 and 5), and only a few
points were under-predicted. The limited number of extreme data and the model structure that tried
to balance the overall trend prediction with extreme data prediction can cause under-prediction. Higher weights could be assigned to extreme data during the model
training process to push the model towards higher predictions<?pagebreak page4261?> for extreme concentrations, but this may
reduce the accuracy of the overall trend prediction. In this study, our aim was to understand the
long-term nutrient trend. Therefore, we did not use this technique during the model training
process.</p>
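      <p>For readers who wish to experiment with the weighting of extreme data mentioned above, the following Python sketch shows one possible form. It was not used in this study and is purely illustrative: the threshold, the weighting factor and the function names are hypothetical choices, and a weighted RMSE is assumed as the quantity a training procedure would minimise.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def extreme_weights(obs, threshold, factor=5.0):
    """Weight `factor` for observations at or above `threshold`, else 1.

    Both the threshold and the factor are hypothetical tuning choices.
    """
    return [factor if o >= threshold else 1.0 for o in obs]


def weighted_rmse(obs, pred, weights):
    """RMSE in which each squared error is multiplied by its weight."""
    num = sum(w * (o - p) ** 2 for o, p, w in zip(obs, pred, weights))
    return math.sqrt(num / sum(weights))


# Made-up numbers: one extreme observation that is under-predicted.
obs = [1.0, 1.0, 1.0, 5.0]
pred = [1.0, 1.0, 1.0, 3.0]
plain = weighted_rmse(obs, pred, [1.0] * len(obs))
weighted = weighted_rmse(obs, pred, extreme_weights(obs, threshold=4.0))
print(plain, weighted)
```
]]></preformat>
      <p>Because the under-predicted extreme value counts more heavily in the weighted score, a model tuned against it would be pushed towards the extremes, at a possible cost to the fit of the remaining data.</p>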
</sec>
<sec id="Ch1.S4.SS3">
  <label>4.3</label><title>Comparison of variable importance in hybrid GBM for TN prediction</title>
      <p id="d1e3058">The daily data generated by the hybrid GBM showed a lower RMSE and better distribution than
stand-alone ML, WRTDS and LM (Figs. 4 and 5).  Compared to LM, WRTDS and simple CART models, one
drawback of RF and GBM, as well as many ML methods in general, is that there is no specific equation
in GBM or RF to directly demonstrate model structures. However, GBM and RF do provide the relative
importance of each variable, which is based on the empirical improvement in the loss function due to
the split on the specific variable in a tree (Povak et al., 2014; Puissant et al., 2014). The
improvement of a certain variable was averaged over all trees as the relative importance for the
final model. This relative importance serves as the key index to understanding the model structure
of RF and GBM (Makler-Pick et al., 2011).</p>
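      <p>The averaging described above can be sketched in a few lines of Python. This is an illustrative reading of the text rather than the code of any particular RF or GBM package: the input format (a list of trees, each a list of variable and loss-improvement pairs) and the function name are hypothetical, and the final scaling so that the most important variable equals 100 is an assumed convention.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def relative_importance(trees):
    """Average per-variable loss improvement over trees, scaled to max 100.

    trees: list of trees; each tree is a list of (variable, improvement)
    pairs, one pair per split (hypothetical input format).
    """
    totals = defaultdict(float)
    for splits in trees:
        for variable, improvement in splits:
            totals[variable] += improvement
    averaged = {v: t / len(trees) for v, t in totals.items()}
    top = max(averaged.values())
    return {v: 100.0 * a / top for v, a in averaged.items()}


# Made-up improvements for two small trees.
trees = [
    [("quickflow", 4.0), ("baseflow", 1.0)],
    [("quickflow", 2.0), ("rainfall", 1.5)],
]
importance = relative_importance(trees)
for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(name, round(score, 1))
```
]]></preformat>
      <p>A variable that is never selected for a split receives no entry, which corresponds to zero importance in this sketch.</p>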

      <?xmltex \floatpos{t}?><fig id="Ch1.F6" specific-use="star"><?xmltex \currentcnt{6}?><label>Figure 6</label><caption><p id="d1e3063">Variable importance in the hybrid GBM for TN prediction in <bold>(a)</bold>
Ellen Brook and <bold>(b)</bold> Murray River.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020-f06.png"/>

        </fig>

      <p id="d1e3078">The variable importance for TN prediction by hybrid GBM in Ellen Brook and Murray River is presented
in Fig. 6. The variable importance in the intermediate models is also included, and the length of
coloured sections represents the importance of those variables in the hybrid GBM or intermediate
GBM. The importance was scaled according to the most important variable. The generated DON and TP
ranked as the first two critical variables in Ellen Brook, while all three generated nutrients were
listed as the most important variables in Murray River. This suggests that the generated nutrients
do provide critical information to the model and improve model performance. The quickflow was most
important for the generated DON and TP, as well as the TN itself in Ellen Brook. The impacts of
quickflow decreased, and baseflow, seasonal<?pagebreak page4262?> components and rainfall data became more important for
TN prediction in Murray River. This difference in variable importance reflects different catchment
characteristics across the two sites and therefore different hydrological and hydrochemical
processes controlling TN concentrations. The total flow was not of high importance at either site,
which suggests that baseflow or quickflow had more impact on surface water TN. Moreover, TN
concentrations were affected by more variables in Murray River than in Ellen Brook.</p>
</sec>
</sec>
<sec id="Ch1.S5">
  <label>5</label><title>Discussion</title>
<sec id="Ch1.S5.SS1">
  <label>5.1</label><title>Different sources of TN in Ellen Brook and Murray River</title>
      <p id="d1e3098">Hydrological conditions, specific sub-catchment characteristics and the chemical properties of
nutrients can all impact surface water nutrient concentrations (Barron et al., 2009; Moatar et al.,
2016), nutrient partitioning (Ruibal-Conti et al., 2013) and nutrient transport (Burt and Pinay,
2005; Tesoriero et al., 2009). TN prediction in Murray River was impacted by more variables than in
Ellen Brook (Fig. 6), suggesting more complex relationships in Murray River.</p>
      <?pagebreak page4263?><p id="d1e3101">Quickflow is composed of runoff, interflow and direct precipitation (Brodie and Hostetler,
2005) and was shown to be
important for TN prediction in Ellen Brook. Direct precipitation, however, did not have a large
impact on TN (the green bars in Fig. 6); this suggests that runoff and interflow were important for
TN concentrations. Baseflow can account for (on average) 53 % of annual stream discharge in
Ellen Brook, but baseflow was not of high importance for TN prediction in this study. This may occur
due to low TN concentrations in the baseflow (Barron et al., 2009), large areas of low
nutrient-retaining sandy soils in the Ellen Brook catchment, and high nutrient transport efficiency in
quickflow and first flush. Mellander et al. (2012) quantified nutrient transport pathways in
agricultural catchments and found that quickflow was only 2 %–8 % of total flow, but it can
transport up to 50 % of TP. Gunaratne et al. (2017) found that the seasonal first flush was
only 30 % of runoff volume but contained 40 %–70 % of the nutrient load.</p>
      <p id="d1e3104">Note that the median TN in Ellen Brook (2.1 <inline-formula><mml:math id="M140" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>) is significantly higher than that in
Murray River (0.67 <inline-formula><mml:math id="M141" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mg</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">L</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>) which can be explained to some extent by the large area of
grazing lands in Ellen Brook. Previous investigations in south-eastern Australia (Adams et al.,
2014), New Zealand (Davies-Colley et al., 2004) and north-western Europe (Conroy et al., 2016) all
suggested that livestock can increase TN discharge to the receiving water bodies. Most of the
piggeries and poultry farms in the Swan–Canning catchment are located in Ellen Brook catchment
(Kelsey et al., 2010), which has the highest TN and TP discharge loads. Thus the large grazing
areas, piggeries and poultry farms and low nutrient-retaining sandy soils may explain the
importance of quickflow for TN prediction and high TN concentrations in Ellen Brook.</p>
      <p id="d1e3141">Baseflow is derived from groundwater discharge to streams and the slow drainage of water stored in
local wetlands (Kelsey et al., 2010). Baseflow is highlighted as an important variable for TN
prediction in Murray River. The Murray River catchment has large areas with high nutrient-retaining
soils (high PRI) (Kelsey et al., 2011) and relatively low TN concentrations, and it is likely that
groundwater makes significant contributions to TN in Murray River. Ruibal-Conti et
al. (2013) previously found that variability in
TN is strongly associated with variability in flows in Murray River. Our results extend this
finding, in that both baseflow and quickflow likely impact TN in the river.</p>
      <p id="d1e3145">It is noted that seasonal components including <inline-formula><mml:math id="M142" display="inline"><mml:mrow><mml:mi>sin⁡</mml:mi><mml:mo>(</mml:mo><mml:mtext>JD</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M143" display="inline"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:mo>(</mml:mo><mml:mtext>JD</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> showed
significantly higher importance in Murray River. This may be because seasonal information is captured
in other inputs in Ellen Brook (e.g. quickflow and baseflow). However, the main reason is the stronger
seasonal TN signals in Murray River compared to Ellen Brook. This finding is supported by the
generated daily TN data for Murray River (see results in Supplement S2). Natural reserves occupy
large areas of the Murray River catchment, and this may increase seasonal signals.  Additionally,
the lagged quickflow, baseflow and rainfall were generated (for the previous 3, 7 and
15 <inline-formula><mml:math id="M144" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">days</mml:mi></mml:mrow></mml:math></inline-formula>), but only the lagged 15 <inline-formula><mml:math id="M145" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">d</mml:mi></mml:mrow></mml:math></inline-formula> baseflow and quickflow were ranked as important
variables for both Ellen Brook and Murray River. This suggests a timescale of nutrient
transport in the sub-catchments and likely reflects soil permeability and geology; long
hydrochemical recessions from storm events may prolong their impact on the ecological status of
receiving rivers (Mellander et al., 2012).</p>
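      <p>As an illustration of the inputs discussed above, the following minimal Python sketch builds the two seasonal components and the lagged hydrological variables. It is not the code used in this study: the function names, the scaling of JD to one annual cycle (2π JD/365.25) and the use of a simple mean over the previous 3, 7 or 15 d are assumptions made for illustration only.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from datetime import date


def seasonal_components(day, period=365.25):
    """Return (sin, cos) of the day of year scaled to one annual cycle.

    Assumption: JD is taken as the day of year (1..366) and scaled by
    2*pi/period; the source only names sin(JD) and cos(JD).
    """
    jd = day.timetuple().tm_yday
    angle = 2.0 * math.pi * jd / period
    return math.sin(angle), math.cos(angle)


def lagged_mean(series, window):
    """Mean of the previous `window` values for each position.

    Assumption: the lagged variable is an average over the preceding days
    (the current day is excluded). Positions without enough history are None.
    """
    out = []
    for i in range(len(series)):
        if i < window:
            out.append(None)
        else:
            out.append(sum(series[i - window:i]) / window)
    return out


# Hypothetical daily baseflow values, used only to show the call pattern.
baseflow = [1.0, 2.0, 3.0, 4.0, 5.0]
lag3 = lagged_mean(baseflow, 3)
sin_jd, cos_jd = seasonal_components(date(2018, 7, 16))
```
]]></preformat>
      <p>The same helper could be called with windows of 3, 7 and 15 for quickflow, baseflow and rainfall to obtain one candidate input per variable and lag.</p>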

      <?xmltex \floatpos{p}?><fig id="Ch1.F7" specific-use="star"><?xmltex \currentcnt{7}?><label>Figure 7</label><caption><p id="d1e3194">Generated daily <inline-formula><mml:math id="M146" display="inline"><mml:mrow class="chem"><mml:mi mathvariant="normal">DON</mml:mi></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M147" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NO</mml:mi><mml:mtext mathvariant="italic">x</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M148" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> by the hybrid GBM for Ellen Brook.</p></caption>
          <?xmltex \igopts{width=483.69685pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020-f07.png"/>

        </fig>

      <p id="d1e3233">Six models were compared for nutrient predictions and the hybrid GBM model achieved the highest
accuracy (Figs. 3 and 5). The long-term changes in TN have been discussed in previous sections. To
understand the long-term changes in other nitrogen species across the year, the hybrid GBM was then
applied to generate daily DON, <inline-formula><mml:math id="M149" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M150" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NO</mml:mi><mml:mtext mathvariant="italic">x</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> in Ellen Brook from
1 January 1989 to 16 July 2018 (Fig. 7).  The generated DON has much higher concentrations than
<inline-formula><mml:math id="M151" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M152" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NO</mml:mi><mml:mtext mathvariant="italic">x</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>. This is consistent with previous investigations in this
study area, which found that DON was the dominant form of TN in both surface water and groundwater (Nice et al.,
2009; Petrone, 2010; Bourke et al., 2015). There are no clear long-term patterns in generated
<inline-formula><mml:math id="M153" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M154" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NO</mml:mi><mml:mtext mathvariant="italic">x</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>; however, an increasing long-term trend in generated DON
can be found from 2006 to 2018. There is also an increasing trend in TN from 2005 (Fig. 4),
suggesting DON was the main reason for the increasing TN concentrations.  DON is often assumed to be
relatively slow to react, but depending on the source of DON, it can turn over rapidly, thereby
constituting an active contributor to the eutrophication of surface waters (Petrone et al., 2009).</p>
</sec>
<sec id="Ch1.S5.SS2">
  <label>5.2</label><title>Can we improve our understanding of historical nutrient conditions using contemporary data?</title>
      <p id="d1e3311">The generated nutrient data provided additional information to enhance the hybrid model performance
(Figs. 3 and 5). To assess the individual impact of each generated nutrient, we ran<?pagebreak page4264?> a simple test that
sequentially added generated TP, DOC and DON data to the base GBM (only seasonal components and
lagged hydrological data) and evaluated RMSE and MEF for TN prediction.  This process was repeated
30 times and the results are presented in Fig. 8.</p>
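      <p>The test described above rests on two error measures and on a summary of the repeated runs. The sketch below shows one way these could be computed in Python. It is illustrative only: the function names are hypothetical, MEF is assumed to take the Nash–Sutcliffe form (one minus the ratio of the squared error to the squared deviation of the observations from their mean), and the repeated runs are assumed to be summarised by their median and interquartile range.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mef(observed, predicted):
    """Model efficiency, assumed here to be the Nash-Sutcliffe form."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def summarise_runs(scores):
    """Median and interquartile range of a score over repeated runs."""
    q1, _, q3 = statistics.quantiles(scores, n=4)
    return statistics.median(scores), q3 - q1


# Hypothetical RMSE values from five repeated runs of one input configuration.
median_rmse, iqr_rmse = summarise_runs([1.0, 2.0, 3.0, 4.0, 5.0])
```
]]></preformat>
      <p>Applying <monospace>summarise_runs</monospace> to the scores from each input configuration gives the quantities that a box plot of the repeated runs would display.</p>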

      <?xmltex \floatpos{p}?><fig id="Ch1.F8" specific-use="star"><?xmltex \currentcnt{8}?><label>Figure 8</label><caption><p id="d1e3316">Model performance for TN prediction across different input variables for Ellen Brook.</p></caption>
          <?xmltex \igopts{width=426.791339pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/13/4253/2020/gmd-13-4253-2020-f08.png"/>

        </fig>

      <p id="d1e3325">The RMSE significantly decreased when generated TP was added as an additional variable. DOC and DON
only have 297 and 129 data points, respectively, and were only measured in recent years, while TP has more
than 1000 data points and has been measured since 1990 (Table 3). However, DOC and DON could still
improve model performance (Fig. 8), and the generated DON was ranked as the most important variable
across both sites (Fig. 6). The median RMSE decreased slightly when both generated DOC and DON were
added. Moreover, the generated DOC and DON also reduced the model uncertainty: the IQRs became narrower than those of the models without the generated nutrients.</p>
      <p id="d1e3329">Our results suggest that the recent DON and DOC data improved understanding of historical TN. It is
not uncommon to have a similar data structure when several datasets are combined or new
measurements are added to a project.  While there were no DON data prior to 2006 in Ellen Brook,
daily DON can be generated back to 1990 with the help of generated TN, DOC and TP data; DON had the
highest MEF among the six nutrients (Fig. 3). This hybrid method provides a feasible process to
fully utilise all available nutrient data to accurately fill gaps in either historical or recent
nutrient datasets.</p>
</sec>
<sec id="Ch1.S5.SS3">
  <label>5.3</label><title>A comprehensive comparison of six models</title>
      <?pagebreak page4266?><p id="d1e3340">Monitoring, modelling and forecasting water quality inputs are essential to support the management
of the quality of receiving waters while responding to current anthropogenic stressors
(Holguin-Gonzalez et al., 2013; Schnoor, 2014). The performances of six models were comprehensively
compared, in an exploration of historical and contemporary nutrient data across two study sites. LM
had the highest error, while stand-alone RF and GBM had similar errors. This agrees with previous
findings by Erdal and Karakurt (2013) that RF and GBM models achieved similar correlation
coefficients (<inline-formula><mml:math id="M155" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>) for streamflow forecasting. Ismail and Mutanga (2010) also reported that RF and GBM
increased the <inline-formula><mml:math id="M156" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> of a single CART by 10.01 % and 9.59 %, respectively.</p>
      <p id="d1e3357">The performance of WRTDS, as well as many conceptual models, is often reliant on a prescribed set of
input information, which can account for variance in nutrient concentrations but may miss some
important processes for certain rivers (e.g. baseflow in this study). This can compromise the
performance of WRTDS for nutrient prediction. Moreover, hydrological and chemical processes within
the systems are typically ignored by many conceptual models, which may exclude important
hydrochemical information. By contrast, some complex conceptual models may include these
hydrochemical processes but are often constrained by insufficient nutrient data to calibrate and
validate the models. Some simplifications may be made to account for lack of data, but the
simplifications may often weaken model performance. The hybrid framework presented in this study has
overcome the challenge caused by data paucity by building intermediate models to generate missing
nutrient data and then using this additional hydrochemical information to improve final model
performance.</p>
      <p id="d1e3360">The hybrid models developed in this study were able to take advantage of the complementary strengths
of both hydrochemical (additionally generated nutrient data) and hydrological (lagged data)
information. This was particularly the case for the prediction of high nutrient concentrations, where
the hybrid models were shown to outperform the stand-alone RF and GBM, in terms of accuracy,
reliability and value distribution. Improved accuracy in the hybrid model was achieved by using
intermediate models, although these intermediate models may also have a relatively high error
(similar to stand-alone RF and GBM). However, provided that the gain in model performance outweighs the
error introduced by the intermediate models, the approach remains worthwhile. Similar results were also found by Hunter
et al. (2018), who compared a hybrid process-driven and ANN model with the stand-alone ANN model
and the process-driven model. In their study, the hybrid also achieved the best performance followed
by stand-alone ANN. The process-driven benchmark model had a significantly lower accuracy than the other
two models.</p>
      <p id="d1e3363">A limitation of the hybrid modelling approach, however, is that it requires the time and expertise
to develop intermediate models for generating additional nutrient data. Prior knowledge also plays
an important role in identifying the variables for pre-generation. Some statistical methods (e.g.
the correlation test, simple linear model) can be helpful to identify these variables if there is no
clear theoretical or conceptual understanding on which to base the selection of the important variables.</p>
      <p id="d1e3367">In this study, we tested the generalised performance of the hybrid model across six nutrient species
and two tributaries. We also note that nutrients may not always be the critical variables targeted
for pre-generation; the pre-generated DOC was ranked as having low importance for Ellen Brook and
produced only a slight improvement in the performance of the hybrid model for <inline-formula><mml:math id="M157" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">NH</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>.</p>
</sec>
<sec id="Ch1.S5.SS4">
  <label>5.4</label><title>The application of ML methods for hydrological modelling</title>
      <p id="d1e3389">There were constraints in the nutrient datasets in this research, and similar constraints commonly
exist in other study areas. Many nutrient datasets contain important information, but sometimes it
can be challenging to directly combine or utilise them. ML methods provide a feasible approach to
preprocess these datasets or combine them. In this study, the concentrations of missing nutrient
species were first predicted by the intermediate ML method and then used as inputs for another ML
method for final predictions. The pre-generation of missing data and pre-modelling hydrological
analysis were critical components of the hybrid model and allowed the identification of the impact of
different hydrological transport pathways for TN export from the two tributary catchments. The
hybrid ML methods were further applied to generate nutrient data for eight tributaries, and the
generated data have since been used as inputs to an estuary prediction model, which simulates and
forecasts nutrient concentrations in the previous and next 5 d in the Swan–Canning Estuary
(Huang et al., 2019). The modelling methods and strategies developed in the work presented here can
be easily applied to other study areas. Overall, ML methods provide a flexible and feasible solution
to explore the underlying relationships, reconstruct spatial and temporal datasets, and combine
different models.</p>
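      <p>The two-stage procedure described above can be sketched in a few lines of Python. This is a simplified illustration rather than the model code of this study: a k-nearest-neighbour averager stands in for the GBM so that the example needs no external libraries, and all function and variable names are hypothetical. Missing observations are represented by <monospace>None</monospace>.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training rows closest to `query`.

    Stand-in learner (assumption): the study used tree ensembles, not k-NN.
    """
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row, query))

    order = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i]))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def hybrid_predict(features, sparse_nutrient, target, k=3):
    """Generate a sparsely observed nutrient, then use it as an extra input."""
    # Stage 1: intermediate model trained only on days with an observation,
    # then applied to every day to generate a complete series.
    seen = [i for i, v in enumerate(sparse_nutrient) if v is not None]
    gen_x = [features[i] for i in seen]
    gen_y = [sparse_nutrient[i] for i in seen]
    generated = [knn_predict(gen_x, gen_y, f, k) for f in features]

    # Stage 2: final model trained on the original features plus the
    # generated series, using only days where the target was observed.
    augmented = [tuple(f) + (g,) for f, g in zip(features, generated)]
    seen_t = [i for i, v in enumerate(target) if v is not None]
    fin_x = [augmented[i] for i in seen_t]
    fin_y = [target[i] for i in seen_t]
    predicted = [knn_predict(fin_x, fin_y, a, k) for a in augmented]
    return generated, predicted


# Synthetic data (not from the study): one feature per day, a nutrient
# observed only on the last five days, and a target observed on even days.
features = [(float(i),) for i in range(10)]
don = [None] * 5 + [2.0 * i for i in range(5, 10)]
tn = [3.0 * i if i % 2 == 0 else None for i in range(10)]
generated, predicted = hybrid_predict(features, don, tn, k=1)
```
]]></preformat>
      <p>Swapping <monospace>knn_predict</monospace> for any other regressor leaves the structure unchanged, which is the point of the sketch: the intermediate model fills the gaps, and the final model treats the filled series as one more input.</p>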
</sec>
</sec>
<sec id="Ch1.S6" sec-type="conclusions">
  <label>6</label><title>Summary and conclusion</title>
      <p id="d1e3401">A hybrid machine learning model was developed, and its performance tested on six nutrients and two
estuary tributaries and compared with alternative modelling approaches. The hybrid ML model
exhibited higher prediction accuracy and lower prediction uncertainty than stand-alone ML, WRTDS and LM for almost all nutrients. The pre-generation of missing data and pre-modelling hydrological
analysis were critical components of the hybrid model and allowed the identification of the impact of
different hydrological transport pathways for TN export from the two tributary catchments. The
results of this study demonstrate the advantages of using hybrid models for high temporal resolution
nutrient prediction; the results also demonstrate the use of the hybrid model for re-analysis of
historical data in the light of contemporary<?pagebreak page4267?> data. Modelling strategies for different modelling
targets and dataset structures have also been discussed. The modelling framework presented here can
aid others to fully use all available nutrient data to generate accurate nutrient predictions.</p>
</sec>

      
      </body>
    <back><notes notes-type="codedataavailability"><title>Code and data availability</title>

      <p id="d1e3408">The data and the data sources used in this study are cited and explained in
the text. The current version of the model is available from the project
website: <uri>https://github.com/benyawang-uwa/daily-nutrient-prediction</uri> (last access: 9 September 2020) under the MIT
licence. The exact version of the model used to produce the results presented in
this paper is archived on Zenodo (<ext-link xlink:href="https://doi.org/10.5281/zenodo.3739611" ext-link-type="DOI">10.5281/zenodo.3739611</ext-link>, Wang, 2020).</p>
  </notes><app-group>
        <supplementary-material position="anchor"><p id="d1e3417">The supplement related to this article is available online at: <inline-supplementary-material xlink:href="https://doi.org/10.5194/gmd-13-4253-2020-supplement" xlink:title="pdf">https://doi.org/10.5194/gmd-13-4253-2020-supplement</inline-supplementary-material>.</p></supplementary-material>
        </app-group><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d1e3426">BW, MRH and CO contributed to the
development of the methodology and designed the experiments, and BW
carried them out. BW developed the model code and performed the
simulations. BW prepared the paper with contributions from all
coauthors.</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d1e3432">The authors declare that they have no conflict of interest.</p>
  </notes><ack><title>Acknowledgements</title><p id="d1e3438">The authors acknowledge Peisheng Huang and Brendan Busch for providing the historical nutrient data.</p></ack><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d1e3443">Benya Wang was supported by a postgraduate scholarship provided by the CRC for Water Sensitive Cities. Matthew R. Hipsey received funding support from the Australian Research Council (project LP150100451).</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d1e3449">This paper was edited by Thomas Poulet and reviewed by Thu Huong Thi Hoang and one anonymous referee.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bib1"><label>1</label><?label 1?><mixed-citation>Abbott, B. W., Baranov, V., Mendoza-Lera, C., Nikolakopoulou, M., Harjung, A., Kolbe,
T., Balasubramanian, M. N., Vaessen, T. N., Ciocca, F., Campeau, A., Wallin, M. B., Romeijn, P.,
Antonelli, M., Gonçalves, J., Datry, T., Laverman, A. M., de Dreuzy, J. R., Hannah, D. M.,
Krause, S., Oldham, C., and Pinay, G.: Using multi-tracer inference to move beyond
single-catchment ecohydrology, Earth-Science Rev., 160, 19–42,
<ext-link xlink:href="https://doi.org/10.1016/j.earscirev.2016.06.014" ext-link-type="DOI">10.1016/j.earscirev.2016.06.014</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib2"><label>2</label><?label 2?><mixed-citation>Adams, R., Arafat, Y., Eate, V., Grace, M. R., Saffarpour, S., Weatherley, A. J., and
Western, A. W.: A catchment study of sources and sinks of nutrients and sediments in south-east
Australia, J. Hydrol., 515, 166–179, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2014.04.034" ext-link-type="DOI">10.1016/j.jhydrol.2014.04.034</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib3"><label>3</label><?label 3?><mixed-citation>Álvarez-Cabria, M., Barquín, J., and Peñas, F. J.: Modelling the spatial
and seasonal variability of water quality for entire river networks: Relationships with natural
and anthropogenic factors, Sci. Total Environ., 545–546, 152–162,
<ext-link xlink:href="https://doi.org/10.1016/j.scitotenv.2015.12.109" ext-link-type="DOI">10.1016/j.scitotenv.2015.12.109</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib4"><label>4</label><?label 4?><mixed-citation>Barron, O., Donn, M., Furby, S., Chia, J., and Johnstone, C.: Groundwater contribution to
nutrient export from the Ellen Brook catchment, available at:
<uri>http://www.clw.csiro.au/publications/waterforahealthycountry/2009/wfhc-groundwater-Ellen-Brook-catchment.pdf</uri> (last access: 9 September 2020), 2009.</mixed-citation></ref>
      <ref id="bib1.bib5"><label>5</label><?label 5?><mixed-citation>Belgiu, M. and Drăguţ, L.: Random forest in remote sensing: A review of applications
and future directions, ISPRS J. Photogramm. Remote Sens., 114, 24–31,
<ext-link xlink:href="https://doi.org/10.1016/j.isprsjprs.2016.01.011" ext-link-type="DOI">10.1016/j.isprsjprs.2016.01.011</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib6"><label>6</label><?label 6?><mixed-citation>Bernal, S., Butturini, A., and Sabater, F.: Seasonal variations of dissolved nitrogen and
DOC : DON ratios in an intermittent Mediterranean stream, Biogeochemistry, 75, 351–372,
<ext-link xlink:href="https://doi.org/10.1007/s10533-005-1246-7" ext-link-type="DOI">10.1007/s10533-005-1246-7</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bib7"><label>7</label><?label 7?><mixed-citation>Bourke, S., Hammond, M., and Clohessy, S.: Perth Shallow Groundwater Systems Investigation:
North Lake, available at:
<uri>https://www.water.wa.gov.au/__data/assets/pdf_file/0016/7432/108960.pdf</uri> (last access: 9 September 2020), 2015.</mixed-citation></ref>
      <ref id="bib1.bib8"><label>8</label><?label 8?><mixed-citation> Breiman, L.: Random forests, Mach. Learn., 45, 5–32, 2001.</mixed-citation></ref>
      <ref id="bib1.bib9"><label>9</label><?label 9?><mixed-citation> Breiman, L., Friedman, J., Stone, C. J., and Olshen, R. A.: Classification and regression
trees, CRC Press, Boca Raton, 1984.</mixed-citation></ref>
      <ref id="bib1.bib10"><label>10</label><?label 1?><mixed-citation>
Brodie, R. and Hostetler, S.: A review of techniques for analysing baseflow from stream hydrographs, in: Proceedings of the NZHS-IAHNZSSS 2005 Conference, Auckland, New Zealand, 2005.</mixed-citation></ref>
      <ref id="bib1.bib11"><label>11</label><?label 10?><mixed-citation>Burt, T. P. and Pinay, G.: Linking hydrology and biogeochemistry, Prog.  Phys. Geogr.,
3, 297–316, <ext-link xlink:href="https://doi.org/10.1067/mva.2002.123763" ext-link-type="DOI">10.1067/mva.2002.123763</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bib12"><label>12</label><?label 11?><mixed-citation>Chanat, J. G., Rice, K. C., and Hornberger, G. M.: Consistency of patterns in
concentration-discharge plots, Water Resour. Res., 38, 10–22, <ext-link xlink:href="https://doi.org/10.1029/2001WR000971" ext-link-type="DOI">10.1029/2001WR000971</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bib13"><label>13</label><?label 12?><mixed-citation>Chen, Y., Liu, R., Sun, C., Zhang, P., Feng, C., and Shen, Z.: Spatial and temporal
variations in nitrogen and phosphorous nutrients in the Yangtze River Estuary, Mar. Pollut. Bull.,
64, 2083–2089, <ext-link xlink:href="https://doi.org/10.1016/j.marpolbul.2012.07.020" ext-link-type="DOI">10.1016/j.marpolbul.2012.07.020</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib14"><label>14</label><?label 13?><mixed-citation>Clapcott, J. E., Collier, K. J., Death, R. G., Goodwin, E. O., Harding, J.  S., Kelly,
D., Leathwick, J. R., and Young, R. G.: Quantifying relationships between land-use gradients and
structural and functional indicators of stream ecological integrity, Freshw. Biol., 57, 74–90,
<ext-link xlink:href="https://doi.org/10.1111/j.1365-2427.2011.02696.x" ext-link-type="DOI">10.1111/j.1365-2427.2011.02696.x</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib15"><label>15</label><?label 14?><mixed-citation>Cohn, T. A., Delong, L. L., Gilroy, E. J., Hirsch, R. M., and Wells, D. K.: Estimating
constituent loads, Water Resour. Res., 25, 937–942, <ext-link xlink:href="https://doi.org/10.1029/WR025i005p00937" ext-link-type="DOI">10.1029/WR025i005p00937</ext-link>, 1989.</mixed-citation></ref>
      <ref id="bib1.bib16"><label>16</label><?label 15?><mixed-citation>Conroy, E., Turner, J. N., Rymszewicz, A., O'Sullivan, J. J., Bruen, M., Lawler, D.,
Lally, H., and Kelly-Quinn, M.: The impact of cattle access on ecological water quality in
streams: Examples from agricultural catchments within Ireland, Sci. Total Environ., 547, 17–29,
<ext-link xlink:href="https://doi.org/10.1016/j.scitotenv.2015.12.120" ext-link-type="DOI">10.1016/j.scitotenv.2015.12.120</ext-link>, 2016.</mixed-citation></ref>
      <?pagebreak page4268?><ref id="bib1.bib17"><label>17</label><?label 16?><mixed-citation>Coopersmith, E. J., Minsker, B., and Montagna, P.: Understanding and forecasting
hypoxia using machine learning algorithms, J. Hydroinformatics, 13, 64,
<ext-link xlink:href="https://doi.org/10.2166/hydro.2010.015" ext-link-type="DOI">10.2166/hydro.2010.015</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib18"><label>18</label><?label 17?><mixed-citation>Coops, N. C., Waring, R. H., Beier, C., Roy-Jauvin, R., and Wang, T.: Modeling the
occurrence of 15 coniferous tree species throughout the Pacific Northwest of North America using a
hybrid approach of a generic process-based growth model and decision tree analysis,
Appl. Veg. Sci., 14, 402–414, <ext-link xlink:href="https://doi.org/10.1111/j.1654-109X.2011.01125.x" ext-link-type="DOI">10.1111/j.1654-109X.2011.01125.x</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bib19"><label>19</label><?label 18?><mixed-citation>Cozzi, S. and Giani, M.: River water and nutrient discharges in the Northern Adriatic
Sea: Current importance and long term changes, Cont. Shelf Res., 31, 1881–1893,
<ext-link xlink:href="https://doi.org/10.1016/j.csr.2011.08.010" ext-link-type="DOI">10.1016/j.csr.2011.08.010</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bib20"><label>20</label><?label 19?><mixed-citation>Crowder, D. W., Demissie, M., and Markus, M.: The accuracy of sediment loads when
log-transformation produces nonlinear sediment load-discharge relationships, J. Hydrol., 336,
250–268, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2006.12.024" ext-link-type="DOI">10.1016/j.jhydrol.2006.12.024</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bib21"><label>21</label><?label 20?><mixed-citation>Cutler, D. R., Edwards, T. C., Beard, K. H., Cutler, A., Hess, K. T., Gibson, J., and
Lawler, J. J.: Random forests for classification in ecology, Ecology, 88, 2783–2792,
<ext-link xlink:href="https://doi.org/10.1890/07-0539.1" ext-link-type="DOI">10.1890/07-0539.1</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bib22"><label>22</label><?label 21?><mixed-citation>Davies-Colley, R. J., Nagels, J. W., Smith, R. A., Young, R. G., and Phillips, C. J.:
Water quality impact of a dairy cow herd crossing a stream, New Zeal. J. Mar. Freshw. Res., 38,
569–576, <ext-link xlink:href="https://doi.org/10.1080/00288330.2004.9517262" ext-link-type="DOI">10.1080/00288330.2004.9517262</ext-link>, 2004.</mixed-citation></ref>
      <ref id="bib1.bib23"><label>23</label><?label 22?><mixed-citation>Denvil-Sommer, A., Gehlen, M., Vrac, M., and Mejia, C.: LSCE-FFNN-v1: a two-step neural
network model for the reconstruction of surface ocean <inline-formula><mml:math id="M158" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula><inline-formula><mml:math id="M159" display="inline"><mml:mrow class="chem"><mml:msub><mml:mi mathvariant="normal">CO</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> over the global ocean,
Geosci. Model Dev., 12, 2091–2105, <ext-link xlink:href="https://doi.org/10.5194/gmd-12-2091-2019" ext-link-type="DOI">10.5194/gmd-12-2091-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib24"><label>24</label><?label 23?><mixed-citation>Domingues, R. B., Anselmo, T. P., Barbosa, A. B., Sommer, U., and Galvão, H. M.:
Nutrient limitation of phytoplankton growth in the freshwater tidal zone of a turbid,
Mediterranean estuary, Estuar. Coast. Shelf Sci., 91, 282–297, <ext-link xlink:href="https://doi.org/10.1016/j.ecss.2010.10.033" ext-link-type="DOI">10.1016/j.ecss.2010.10.033</ext-link>,
2011.</mixed-citation></ref>
      <ref id="bib1.bib25"><label>25</label><?label 24?><mixed-citation>Erdal, H. I. and Karakurt, O.: Advancing monthly streamflow prediction accuracy of CART
models using ensemble learning paradigms, J. Hydrol., 477, 119–128,
<ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2012.11.015" ext-link-type="DOI">10.1016/j.jhydrol.2012.11.015</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bib26"><label>26</label><?label 25?><mixed-citation>Erlandsson, M., Cory, N., Fölster, J., Köhler, S., Laudon, H., Weyhenmeyer,
G. A., and Bishop, K.: Increasing Dissolved Organic Carbon Redefines the Extent of Surface Water
Acidification and Helps Resolve a Classic Controversy, Bioscience, 61, 614–618,
<ext-link xlink:href="https://doi.org/10.1525/bio.2011.61.8.7" ext-link-type="DOI">10.1525/bio.2011.61.8.7</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bib27"><label>27</label><?label 26?><mixed-citation>Filep, T. and Rékási, M.: Factors controlling dissolved organic carbon (DOC),
dissolved organic nitrogen (DON) and DOC/DON ratio in arable soils based on a dataset from
Hungary, Geoderma, 162, 312–318, <ext-link xlink:href="https://doi.org/10.1016/j.geoderma.2011.03.002" ext-link-type="DOI">10.1016/j.geoderma.2011.03.002</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bib28"><label>28</label><?label 27?><mixed-citation>Forio, M. A. E., Landuyt, D., Bennetsen, E., Lock, K., Nguyen, T. H. T., Ambarita,
M. N. D., Musonge, P. L. S., Boets, P., Everaert, G., Dominguez-Granda, L., and Goethals,
P. L. M.: Bayesian belief network models to analyse and predict ecological water quality in
rivers, Ecol. Modell., 312, 222–238, <ext-link xlink:href="https://doi.org/10.1016/j.ecolmodel.2015.05.025" ext-link-type="DOI">10.1016/j.ecolmodel.2015.05.025</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib29"><label>29</label><?label 28?><mixed-citation>Friedman, J.: Greedy Function Approximation: A Gradient Boosting Machine, Ann. Stat.,
29, 1189–1232, <ext-link xlink:href="https://doi.org/10.1214/009053606000000795" ext-link-type="DOI">10.1214/009053606000000795</ext-link>, 2001.</mixed-citation></ref>
      <ref id="bib1.bib30"><label>30</label><?label 29?><mixed-citation>Friedman, J. H.: Stochastic gradient boosting, Comput. Stat. Data Anal., 38, 367–378,
<ext-link xlink:href="https://doi.org/10.1016/S0167-9473(01)00065-2" ext-link-type="DOI">10.1016/S0167-9473(01)00065-2</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bib31"><label>31</label><?label 30?><mixed-citation>Fuka, D., Walter, T., Archibald, J., Tammo, S., and Easton, Z.: EcoHydRology: A Community Modeling Foundation for Eco-Hydrology, R package version 0.4.12.1, available at: <uri>https://cran.r-project.org/web/packages/EcoHydRology</uri> (last access: 9 September 2020), 2018.</mixed-citation></ref>
      <ref id="bib1.bib32"><label>32</label><?label 31?><mixed-citation>Furey, P. R. and Gupta, V. K.: A physically based filter for separating base flow from
streamflow time series, Water Resour. Res., 37, 2709–2722, <ext-link xlink:href="https://doi.org/10.1029/2001WR000243" ext-link-type="DOI">10.1029/2001WR000243</ext-link>, 2001.</mixed-citation></ref>
      <ref id="bib1.bib33"><label>33</label><?label 32?><mixed-citation>Giblin, A. E., Weston, N. B., Banta, G. T., Tucker, J., and Hopkinson, C. S.: The
Effects of Salinity on Nitrogen Losses from an Oligohaline Estuarine Sediment, Estuar. Coast., 33,
1054–1068, <ext-link xlink:href="https://doi.org/10.1007/s12237-010-9280-7" ext-link-type="DOI">10.1007/s12237-010-9280-7</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib34"><label>34</label><?label 33?><mixed-citation>Górniak, A., Zieliński, P., Jekatierynczuk-Rudczyk, E., Grabowska, M., and
Suchowolec, T.: The role of dissolved organic carbon in a shallow lowland reservoir ecosystem – A
long-term study, Acta Hydrochim. Hydrobiol., 30, 179–189, <ext-link xlink:href="https://doi.org/10.1002/aheh.200390001" ext-link-type="DOI">10.1002/aheh.200390001</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bib35"><label>35</label><?label 34?><mixed-citation>Green, C. T., Bekins, B. A., Kalkhoff, S. J., Hirsch, R. M., Liao, L., and Barnes, K. K.:
Decadal surface water quality trends under variable climate, land use, and hydrogeochemical
setting in Iowa, USA, Water Resour. Res., 50, 2425–2443, <ext-link xlink:href="https://doi.org/10.1002/2013WR014829" ext-link-type="DOI">10.1002/2013WR014829</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib36"><label>36</label><?label 35?><mixed-citation>Greening, H., Janicki, A., Sherwood, E. T., Pribble, R., and Johansson, J. O.  R.:
Ecosystem responses to long-term nutrient management in an urban estuary: Tampa Bay, Florida, USA,
Estuar. Coast. Shelf Sci., 151, A1–A16, <ext-link xlink:href="https://doi.org/10.1016/j.ecss.2014.10.003" ext-link-type="DOI">10.1016/j.ecss.2014.10.003</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib37"><label>37</label><?label 36?><mixed-citation>Gunaratne, G. L., Vogwill, R. I. J., and Hipsey, M. R.: Effect of seasonal flushing on
nutrient export characteristics of an urbanizing, remote, ungauged coastal catchment,
Hydrol. Sci. J., 62, 800–817, <ext-link xlink:href="https://doi.org/10.1080/02626667.2016.1264585" ext-link-type="DOI">10.1080/02626667.2016.1264585</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib38"><label>38</label><?label 37?><mixed-citation>Guo, D., Lintern, A., Webb, J. A., Ryu, D., Liu, S., Bende-Michl, U., Leahy, P.,
Wilson, P., and Western, A. W.: Key Factors Affecting Temporal Variability in Stream Water
Quality, Water Resour. Res., 55, 112–129, <ext-link xlink:href="https://doi.org/10.1029/2018WR023370" ext-link-type="DOI">10.1029/2018WR023370</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib39"><label>39</label><?label 38?><mixed-citation>Halliday, S. J., Wade, A. J., Skeffington, R. A., Neal, C., Reynolds, B., Rowland, P.,
Neal, M., and Norris, D.: An analysis of long-term trends, seasonality and short-term dynamics in
water quality data from Plynlimon, Wales, Sci. Total Environ., 434, 186–200,
<ext-link xlink:href="https://doi.org/10.1016/j.scitotenv.2011.10.052" ext-link-type="DOI">10.1016/j.scitotenv.2011.10.052</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib40"><label>40</label><?label 39?><mixed-citation>Heathwaite, A. L.: Multiple stressors on water availability at global to catchment
scales: Understanding human impact on nutrient cycles to protect water quality and water
availability in the long term, Freshw. Biol., 55, 241–257,
<ext-link xlink:href="https://doi.org/10.1111/j.1365-2427.2009.02368.x" ext-link-type="DOI">10.1111/j.1365-2427.2009.02368.x</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib41"><label>41</label><?label 40?><mixed-citation>Herndon, E. M., Dere, A. L., Sullivan, P. L., Norris, D., Reynolds, B., and Brantley,
S. L.: Landscape heterogeneity drives contrasting concentration–discharge relationships in shale
headwater catchments, Hydrol. Earth Syst. Sci., 19, 3333–3347, <ext-link xlink:href="https://doi.org/10.5194/hess-19-3333-2015" ext-link-type="DOI">10.5194/hess-19-3333-2015</ext-link>,
2015.</mixed-citation></ref>
      <ref id="bib1.bib42"><label>42</label><?label 41?><mixed-citation>Hirsch, R. M. and De Cicco, L.: User guide to Exploration and Graphics for RivEr Trends
(EGRET) and dataRetrieval: R packages for hydrologic data, Tech. Methods B, 4, 93,
<ext-link xlink:href="https://doi.org/10.3133/tm4A10" ext-link-type="DOI">10.3133/tm4A10</ext-link>, 2015.</mixed-citation></ref>
      <?pagebreak page4269?><ref id="bib1.bib43"><label>43</label><?label 42?><mixed-citation>Hirsch, R. M., Moyer, D. L., and Archfield, S. A.: Weighted regressions on time,
discharge, and season (WRTDS), with an application to Chesapeake Bay river inputs, J. Am. Water
Resour. Assoc., 46, 857–880, <ext-link xlink:href="https://doi.org/10.1111/j.1752-1688.2010.00482.x" ext-link-type="DOI">10.1111/j.1752-1688.2010.00482.x</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib44"><label>44</label><?label 43?><mixed-citation>Holguin-Gonzalez, J. E., Everaert, G., Boets, P., Galvis, A., and Goethals, P. L. M.:
Development and application of an integrated ecological modelling framework to analyze the impact
of wastewater discharges on the ecological water quality of rivers, Environ. Model. Softw., 48,
27–36, <ext-link xlink:href="https://doi.org/10.1016/j.envsoft.2013.06.004" ext-link-type="DOI">10.1016/j.envsoft.2013.06.004</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bib45"><label>45</label><?label 44?><mixed-citation>Huang, P., Trayler, K., Wang, B., Saeed, A., Oldham, C., Busch, B., and Hipsey, M.: An
integrated modelling system for water quality forecasting in an urban eutrophic estuary: The
Swan-Canning Estuary virtual observatory, J. Mar. Syst., 199, 103218,
<ext-link xlink:href="https://doi.org/10.1016/j.jmarsys.2019.103218" ext-link-type="DOI">10.1016/j.jmarsys.2019.103218</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib46"><label>46</label><?label 45?><mixed-citation>Hunter, J. M., Maier, H. R., Gibbs, M. S., Foale, E. R., Grosvenor, N. A., Harders,
N. P., and Kikuchi-Miller, T. C.: Framework for developing hybrid process-driven, artificial
neural network and regression models for salinity prediction in river systems, Hydrol. Earth
Syst. Sci., 22, 2987–3006, <ext-link xlink:href="https://doi.org/10.5194/hess-22-2987-2018" ext-link-type="DOI">10.5194/hess-22-2987-2018</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib47"><label>47</label><?label 46?><mixed-citation>Ishwaran, H. and Kogalur, U. B.: Consistency of random survival forests,
Stat. Probab. Lett., 80, 1056–1064, <ext-link xlink:href="https://doi.org/10.1016/j.spl.2010.02.020" ext-link-type="DOI">10.1016/j.spl.2010.02.020</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib48"><label>48</label><?label 47?><mixed-citation>Ismail, R. and Mutanga, O.: A comparison of regression tree ensembles: Predicting Sirex
noctilio induced water stress in Pinus patula forests of KwaZulu-Natal, South Africa,
Int. J. Appl. Earth Obs. Geoinf., 12, S45–S51, <ext-link xlink:href="https://doi.org/10.1016/j.jag.2009.09.004" ext-link-type="DOI">10.1016/j.jag.2009.09.004</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib49"><label>49</label><?label 48?><mixed-citation>Jickells, T. D., Andrews, J. E., Parkes, D. J., Suratman, S., Aziz, A. A., and Hee,
Y. Y.: Nutrient transport through estuaries: The importance of the estuarine geography,
Estuar. Coast. Shelf Sci., 150, 215–229, <ext-link xlink:href="https://doi.org/10.1016/j.ecss.2014.03.014" ext-link-type="DOI">10.1016/j.ecss.2014.03.014</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib50"><label>50</label><?label 49?><mixed-citation>Jordan, P. and Cassidy, R.: Technical Note: Assessing a 24/7 solution for monitoring
water quality loads in small river catchments, Hydrol. Earth Syst. Sci., 15, 3093–3100,
<ext-link xlink:href="https://doi.org/10.5194/hess-15-3093-2011" ext-link-type="DOI">10.5194/hess-15-3093-2011</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bib51"><label>51</label><?label 50?><mixed-citation>Kaiser, D., Unger, D., Qiu, G., Zhou, H., and Gan, H.: Natural and human influences on
nutrient transport through a small subtropical Chinese estuary, Sci. Total Environ., 450–451,
92–107, <ext-link xlink:href="https://doi.org/10.1016/j.scitotenv.2013.01.096" ext-link-type="DOI">10.1016/j.scitotenv.2013.01.096</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bib52"><label>52</label><?label 51?><mixed-citation> Kelsey, P., Hall, J., Kitsios, A., Quinton, B., and Shakya, D.: Hydrological and
nutrient modelling of the Swan-Canning coastal catchments, Water Science technical series,
Department of Water, Western Australia, 2010.</mixed-citation></ref>
      <ref id="bib1.bib53"><label>53</label><?label 52?><mixed-citation> Kelsey, P., Hall, J., Kretschmer, P., Quinton, B., and Shakya, D.: Hydrological and
nutrient modelling of the Peel-Harvey catchment, Water Science Technical Series, Department of
Water, Western Australia, 2011.</mixed-citation></ref>
      <ref id="bib1.bib54"><label>54</label><?label 53?><mixed-citation>Lamsal, S., Grunwald, S., Bruland, G. L., Bliss, C. M., and Comerford, N. B.: Regional
hybrid geospatial modeling of soil nitrate-nitrogen in the Santa Fe River Watershed, Geoderma,
135, 233–247, <ext-link xlink:href="https://doi.org/10.1016/j.geoderma.2005.12.009" ext-link-type="DOI">10.1016/j.geoderma.2005.12.009</ext-link>, 2006.</mixed-citation></ref>
      <ref id="bib1.bib55"><label>55</label><?label 54?><mixed-citation>Li, J.: Assessing spatial predictive models in the environmental sciences: Accuracy
measures, data variation and variance explained, Environ. Model. Softw., 80, 1–8,
<ext-link xlink:href="https://doi.org/10.1016/j.envsoft.2016.02.004" ext-link-type="DOI">10.1016/j.envsoft.2016.02.004</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib56"><label>56</label><?label 55?><mixed-citation>Li, M., Xu, K., Watanabe, M., and Chen, Z.: Long-term variations in dissolved silicate,
nitrogen, and phosphorus flux from the Yangtze River into the East China Sea and impacts on
estuarine ecosystem, Estuar. Coast. Shelf Sci., 71, 3–12, <ext-link xlink:href="https://doi.org/10.1016/j.ecss.2006.08.013" ext-link-type="DOI">10.1016/j.ecss.2006.08.013</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bib57"><label>57</label><?label 56?><mixed-citation>Li, M., Lee, Y. J., Testa, J. M., Li, Y., Ni, W., Kemp, W. M., and Di Toro, D. M.: What
drives interannual variability of hypoxia in Chesapeake Bay: Climate forcing versus nutrient
loading?, Geophys. Res. Lett., 43, 2127–2134, <ext-link xlink:href="https://doi.org/10.1002/2015GL067334" ext-link-type="DOI">10.1002/2015GL067334</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib58"><label>58</label><?label 57?><mixed-citation>Li, R., Liu, S., Zhang, G., Ren, J., and Zhang, J.: Biogeochemistry of nutrients in an
estuary affected by human activities: The Wanquan River estuary, eastern Hainan Island, China,
Cont. Shelf Res., 57, 18–31, <ext-link xlink:href="https://doi.org/10.1016/j.csr.2012.02.013" ext-link-type="DOI">10.1016/j.csr.2012.02.013</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bib59"><label>59</label><?label 58?><mixed-citation>Lintern, A., Webb, J. A., Ryu, D., Liu, S., Waters, D., Leahy, P., Bende-Michl, U., and
Western, A. W.: What are the key catchment characteristics affecting spatial differences in
riverine water quality?, Water Resour. Res., 54, 7252–7272, <ext-link xlink:href="https://doi.org/10.1029/2017WR022172" ext-link-type="DOI">10.1029/2017WR022172</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib60"><label>60</label><?label 59?><mixed-citation>Liu, S. M., Li, L. W., Zhang, G. L., Liu, Z., Yu, Z., and Ren, J. L.: Impacts of human
activities on nutrient transports in the Huanghe (Yellow River) estuary, J. Hydrol., 430–431,
103–110, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2012.02.005" ext-link-type="DOI">10.1016/j.jhydrol.2012.02.005</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib61"><label>61</label><?label 60?><mixed-citation>Lloyd, C. E. M., Freer, J. E., Collins, A. L., Johnes, P. J., and Jones, J. I.:
Methods for detecting change in hydrochemical time series in response to targeted pollutant
mitigation in river catchments, J. Hydrol., 514, 297–312, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2014.04.036" ext-link-type="DOI">10.1016/j.jhydrol.2014.04.036</ext-link>,
2014.</mixed-citation></ref>
      <ref id="bib1.bib62"><label>62</label><?label 61?><mixed-citation> Lyne, V. and Hollick, M.: Stochastic time-variable
rainfall-runoff modelling, Institution of Engineers, Canberra, Australia, pp. 89–93, 1979.</mixed-citation></ref>
      <ref id="bib1.bib63"><label>63</label><?label 62?><mixed-citation> Maier, H. R., Kapelan, Z., Kasprzyk, J., Kollat, J., Matott, L. S., Cunha, M. C.,
Dandy, G. C., Gibbs, M. S., Keedwell, E., Marchi, A., Ostfeld, A., Savic, D., Solomatine, D. P.,
Vrugt, J. A., Zecchin, A. C., Minsker, B. S., Barbour, E. J., Kuczera, G., Pasha, F., Castelletti,
A., Giuliani, M., and Reed, P. M.: Evolutionary algorithms and other metaheuristics in water
resources: current status, research challenges and future directions, Environ. Model. Softw., 62,
271–299, 2014.</mixed-citation></ref>
      <ref id="bib1.bib64"><label>64</label><?label 63?><mixed-citation>Makler-Pick, V., Gal, G., Gorfine, M., Hipsey, M. R., and Carmel, Y.: Sensitivity
analysis for complex ecological models – A new approach, Environ. Model. Softw., 26, 124–134,
<ext-link xlink:href="https://doi.org/10.1016/j.envsoft.2010.06.010" ext-link-type="DOI">10.1016/j.envsoft.2010.06.010</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bib65"><label>65</label><?label 64?><mixed-citation>Martínez-Rojas, M., Marín, N., and Vila, M. A.: The role of information
technologies to address data handling in construction project management, J. Comput. Civ. Eng.,
30, 1–10, <ext-link xlink:href="https://doi.org/10.1061/(ASCE)CP.1943-5487.0000538" ext-link-type="DOI">10.1061/(ASCE)CP.1943-5487.0000538</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib66"><label>66</label><?label 65?><mixed-citation>McBratney, A. B., Odeh, I. O. A., Bishop, T. F. A., Dunbar, M. S., and Shatar, T. M.:
An overview of pedometric techniques for use in soil survey, Geoderma, 97, 293–327,
<ext-link xlink:href="https://doi.org/10.1016/S0016-7061(00)00043-4" ext-link-type="DOI">10.1016/S0016-7061(00)00043-4</ext-link>, 2000.</mixed-citation></ref>
      <ref id="bib1.bib67"><label>67</label><?label 66?><mixed-citation>Mellander, P. E., Melland, A. R., Jordan, P., Wall, D. P., Murphy, P. N. C., and
Shortle, G.: Quantifying nutrient transfer pathways in agricultural catchments using high temporal
resolution data, Environ. Sci. Policy, 24, 44–57, <ext-link xlink:href="https://doi.org/10.1016/j.envsci.2012.06.004" ext-link-type="DOI">10.1016/j.envsci.2012.06.004</ext-link>, 2012.</mixed-citation></ref>
      <?pagebreak page4270?><ref id="bib1.bib68"><label>68</label><?label 67?><mixed-citation>Meshgi, A., Schmitter, P., Chui, T. F. M., and Babovic, V.: Development of a modular
streamflow model to quantify runoff contributions from different land uses in tropical urban
environments using Genetic Programming, J. Hydrol., 525, 711–723,
<ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2015.04.032" ext-link-type="DOI">10.1016/j.jhydrol.2015.04.032</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib69"><label>69</label><?label 68?><mixed-citation>Meybeck, M. and Moatar, F.: Daily variability of river concentrations and fluxes:
Indicators based on the segmentation of the rating curve, Hydrol. Process., 26, 1188–1207,
<ext-link xlink:href="https://doi.org/10.1002/hyp.8211" ext-link-type="DOI">10.1002/hyp.8211</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib70"><label>70</label><?label 69?><mixed-citation>Moatar, F., Abbott, B. W., Minaudo, C., Curie, F., and Pinay, G.: Elemental properties,
hydrology, and biology interact to shape concentration-discharge curves for carbon, nutrients,
sediment, and major ions, Water Resour. Res., 53, 1270–1287, <ext-link xlink:href="https://doi.org/10.1002/2016WR019635" ext-link-type="DOI">10.1002/2016WR019635</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib71"><label>71</label><?label 70?><mixed-citation>Nathan, R. J. and McMahon, T. A.: Evaluation of automated techniques for base flow and
recession analyses, Water Resour. Res., 26, 1465–1473, <ext-link xlink:href="https://doi.org/10.1029/WR026i007p01465" ext-link-type="DOI">10.1029/WR026i007p01465</ext-link>,
1990.</mixed-citation></ref>
      <ref id="bib1.bib72"><label>72</label><?label 71?><mixed-citation> Nice, H., Foulsham, G., Bree, M., and Sarah, E.: A baseline study of contaminants in the sediments of the Swan and Canning
estuaries, Water Science technical series report no. 6, Department of Water, Western Australia, 2009.</mixed-citation></ref>
      <ref id="bib1.bib73"><label>73</label><?label 72?><mixed-citation>Noori, N. and Kalin, L.: Coupling SWAT and ANN models for enhanced daily streamflow
prediction, J. Hydrol., 533, 141–151, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2015.11.050" ext-link-type="DOI">10.1016/j.jhydrol.2015.11.050</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib74"><label>74</label><?label 74?><mixed-citation>Petrone, K. C.: Catchment export of carbon, nitrogen, and phosphorus across an
agro-urban land use gradient, Swan-Canning River system, southwestern Australia, J. Geophys. Res.,
115, G01016, <ext-link xlink:href="https://doi.org/10.1029/2009JG001051" ext-link-type="DOI">10.1029/2009JG001051</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib75"><label>75</label><?label 75?><mixed-citation> Petrone, K. C., Richards, J. S., and Grierson, P. F.: Bioavailability and composition
of dissolved organic carbon and nitrogen in a near coastal catchment of south-western Australia,
Biogeochemistry, 92, 27–40, 2009.</mixed-citation></ref>
      <ref id="bib1.bib76"><label>76</label><?label 76?><mixed-citation>Povak, N. A., Hessburg, P. F., McDonnell, T. C., Reynolds, K. M., Sullivan, T. J.,
Salter, R. B., and Cosby, B. J.: Machine learning and linear regression models to predict
catchment-level base cation weathering rates across the southern Appalachian Mountain region, USA,
Water Resour. Res., 50, 2798–2814, <ext-link xlink:href="https://doi.org/10.1002/2013WR014222" ext-link-type="DOI">10.1002/2013WR014222</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib77"><label>77</label><?label 77?><mixed-citation>Puissant, A., Rougier, S., and Stumpf, A.: Object-oriented mapping of urban trees using
Random Forest classifiers, Int. J. Appl. Earth Obs. Geoinf., 26, 235–245,
<ext-link xlink:href="https://doi.org/10.1016/j.jag.2013.07.002" ext-link-type="DOI">10.1016/j.jag.2013.07.002</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib78"><label>78</label><?label 79?><mixed-citation>Ruibal-Conti, A. L., Summers, R., Weaver, D., and Hipsey, M. R.: Hydro-climatological
non-stationarity shifts patterns of nutrient delivery to an estuarine system, Hydrol. Earth
Syst. Sci. Discuss., 10, 11035–11092, <ext-link xlink:href="https://doi.org/10.5194/hessd-10-11035-2013" ext-link-type="DOI">10.5194/hessd-10-11035-2013</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bib79"><label>79</label><?label 80?><mixed-citation> Schnoor, J. L.: 4.1. Water quality and its sustainability introduction, in: Comprehensive Water Quality and Purification, edited by: Ahuja, S., Elsevier, Waltham, pp. 1–40, 2014.</mixed-citation></ref>
      <ref id="bib1.bib80"><label>80</label><?label 81?><mixed-citation>Seitzinger, S. P., Sanders, R. W., and Styles, R.: Bioavailability of DON from natural
and anthropogenic sources to estuarine plankton, Limnol. Oceanogr., 47, 353–366,
<ext-link xlink:href="https://doi.org/10.4319/lo.2002.47.2.0353" ext-link-type="DOI">10.4319/lo.2002.47.2.0353</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bib81"><label>81</label><?label 82?><mixed-citation>Singh, K. P., Gupta, S., and Mohan, D.: Evaluating influences of seasonal variations
and anthropogenic activities on alluvial groundwater hydrochemistry using ensemble learning
approaches, J. Hydrol., 511, 254–266, 2014.
 </mixed-citation></ref><?xmltex \hack{\newpage}?>
      <ref id="bib1.bib82"><label>82</label><?label 83?><mixed-citation>Staehr, P. A., Testa, J., and Carstensen, J.: Decadal Changes in Water Quality and Net
Productivity of a Shallow Danish Estuary Following Significant Nutrient Reductions,
Estuar. Coast., 40, 63–79, <ext-link xlink:href="https://doi.org/10.1007/s12237-016-0117-x" ext-link-type="DOI">10.1007/s12237-016-0117-x</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib83"><label>83</label><?label 84?><mixed-citation>Stallard, R. F. and Murphy, S. F.: A Unified Assessment of Hydrologic and
Biogeochemical Responses in Research Watersheds in Eastern Puerto Rico Using Runoff-Concentration
Relations, Aquat. Geochemistry, 20, 115–139, <ext-link xlink:href="https://doi.org/10.1007/s10498-013-9216-5" ext-link-type="DOI">10.1007/s10498-013-9216-5</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib84"><label>84</label><?label 85?><mixed-citation> Swan River Trust: Swan Canning Water Quality Improvement, 2009.</mixed-citation></ref>
      <ref id="bib1.bib85"><label>85</label><?label 86?><mixed-citation>Szilagyi, J. and Parlange, M. B.: Baseflow separation based on analytical solutions of
the Boussinesq equation, J. Hydrol., 204, 251–260, <ext-link xlink:href="https://doi.org/10.1016/S0022-1694(97)00132-7" ext-link-type="DOI">10.1016/S0022-1694(97)00132-7</ext-link>, 1998.</mixed-citation></ref>
      <ref id="bib1.bib86"><label>86</label><?label 87?><mixed-citation>Tao, Y., Wei, M., Ongley, E., Li, Z., and Jingsheng, C.: Long-term variations and
causal factors in nitrogen and phosphorus transport in the Yellow River, China,
Estuar. Coast. Shelf Sci., 86, 345–351, <ext-link xlink:href="https://doi.org/10.1016/j.ecss.2009.05.014" ext-link-type="DOI">10.1016/j.ecss.2009.05.014</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib87"><label>87</label><?label 88?><mixed-citation>Tesoriero, A. J., Duff, J. H., Wolock, D. M., Spahr, N. E., and Almendinger, J. E.:
Identifying Pathways and Processes Affecting Nitrate and Orthophosphate Inputs to Streams in
Agricultural Watersheds, J. Environ. Qual., 38, 1892, <ext-link xlink:href="https://doi.org/10.2134/jeq2008.0484" ext-link-type="DOI">10.2134/jeq2008.0484</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bib88"><label>88</label><?label 89?><mixed-citation>Testa, J. M., Clark, J. B., Dennison, W. C., Donovan, E. C., Fisher, A. W., Ni, W.,
Parker, M., Scavia, D., Spitzer, S. E., Waldrop, A. M., Vargas, V. M. D., and Ziegler, G.:
Ecological Forecasting and the Science of Hypoxia in Chesapeake Bay, Bioscience, 67, 614–626,
<ext-link xlink:href="https://doi.org/10.1093/biosci/bix048" ext-link-type="DOI">10.1093/biosci/bix048</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib89"><label>89</label><?label 1?><mixed-citation>Wang, B.: benyawang-uwa/daily-nutrient-prediction: first release of daily nutrient prediction model (Version v1.0.0), Zenodo, <ext-link xlink:href="https://doi.org/10.5281/zenodo.3739611" ext-link-type="DOI">10.5281/zenodo.3739611</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bib90"><label>90</label><?label 90?><mixed-citation>Wang, B., Hipsey, M. R., Ahmed, S., and Oldham, C.: The Impact of Landscape
Characteristics on Groundwater Dissolved Organic Nitrogen: Insights From Machine Learning Methods
and Sensitivity Analysis, Water Resour. Res., 54, 4785–4804, <ext-link xlink:href="https://doi.org/10.1029/2017WR021749" ext-link-type="DOI">10.1029/2017WR021749</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib91"><label>91</label><?label 91?><mixed-citation>Yang, P., Yang, Y. H., Zhou, B. B., and Zomaya, A. Y.: A Review of Ensemble Methods in
Bioinformatics, Curr. Bioinf., 5, 296–308, <ext-link xlink:href="https://doi.org/10.2174/157489310794072508" ext-link-type="DOI">10.2174/157489310794072508</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib92"><label>92</label><?label 92?><mixed-citation>Zhang, Q., Harman, C. J., and Ball, W. P.: An improved method for interpretation of
riverine concentration-discharge relationships indicates long-term shifts in reservoir sediment
trapping, Geophys. Res. Lett., 43, 10215–10224, <ext-link xlink:href="https://doi.org/10.1002/2016GL069945" ext-link-type="DOI">10.1002/2016GL069945</ext-link>, 2016a.</mixed-citation></ref>
      <ref id="bib1.bib93"><label>93</label><?label 93?><mixed-citation>Zhang, Q., Ball, W. P., and Moyer, D. L.: Decadal-scale export of nitrogen, phosphorus,
and sediment from the Susquehanna River basin, USA: Analysis and synthesis of temporal and spatial
patterns, Sci. Total Environ., 563–564, 1016–1029, <ext-link xlink:href="https://doi.org/10.1016/j.scitotenv.2016.03.104" ext-link-type="DOI">10.1016/j.scitotenv.2016.03.104</ext-link>, 2016b.</mixed-citation></ref>
      <ref id="bib1.bib94"><label>94</label><?label 94?><mixed-citation>Zhang, Q., Hirsch, R. M., and Ball, W. P.: Long-Term Changes in Sediment and Nutrient
Delivery from Conowingo Dam to Chesapeake Bay: Effects of Reservoir Sedimentation,
Environ. Sci. Technol., 50, 1877–1886, <ext-link xlink:href="https://doi.org/10.1021/acs.est.5b04073" ext-link-type="DOI">10.1021/acs.est.5b04073</ext-link>, 2016c.</mixed-citation></ref>

  </ref-list></back>
    <!--<article-title-html>ML-SWAN-v1: a hybrid machine learning framework for the concentration prediction and discovery of transport pathways of surface water nutrients</article-title-html>
<abstract-html><p>Nutrient data from catchments discharging to receiving waters are monitored for catchment
management. However, nutrient data are often sparse in time and space and have non-linear
responses to environmental factors, making it difficult to systematically analyse long- and
short-term trends and undertake nutrient budgets. To address these challenges, we developed a
hybrid machine learning (ML) framework that first separated baseflow and quickflow from total
flow, generated data for missing nutrient species, and then utilised the pre-generated nutrient
data as additional variables in a final simulation of tributary water quality.  Hybrid random
forest (RF) and gradient boosting machine (GBM) models were employed and their performance
compared with a linear model, a multivariate weighted regression model, and stand-alone RF and GBM
models that did not pre-generate nutrient data. The six models were used to predict six different
nutrients discharged from two study sites in Western Australia: Ellen Brook (small and ephemeral)
and the Murray River (large and perennial). Our results showed that the hybrid RF and GBM models
had significantly higher accuracy and lower prediction uncertainty for almost all nutrient species
across the two sites. The pre-generated nutrient and hydrological data were highlighted as the
most important components of the hybrid model. The model results also indicated different
hydrological transport pathways for total nitrogen (TN) export from two tributary catchments. We demonstrated that
the hybrid model provides a flexible method to combine data of varied resolution and quality and
is accurate for the prediction of responses of surface water nutrient concentrations to hydrologic
variability.</p></abstract-html>
<ref-html id="bib1.bib1"><label>1</label><mixed-citation> Abbott, B. W., Baranov, V., Mendoza-Lera, C., Nikolakopoulou, M., Harjung, A., Kolbe,
T., Balasubramanian, M. N., Vaessen, T. N., Ciocca, F., Campeau, A., Wallin, M. B., Romeijn, P.,
Antonelli, M., Gonçalves, J., Datry, T., Laverman, A. M., de Dreuzy, J. R., Hannah, D. M.,
Krause, S., Oldham, C., and Pinay, G.: Using multi-tracer inference to move beyond
single-catchment ecohydrology, Earth-Science Rev., 160, 19–42,
<a href="https://doi.org/10.1016/j.earscirev.2016.06.014" target="_blank">https://doi.org/10.1016/j.earscirev.2016.06.014</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib2"><label>2</label><mixed-citation> Adams, R., Arafat, Y., Eate, V., Grace, M. R., Saffarpour, S., Weatherley, A. J., and
Western, A. W.: A catchment study of sources and sinks of nutrients and sediments in south-east
Australia, J. Hydrol., 515, 166–179, <a href="https://doi.org/10.1016/j.jhydrol.2014.04.034" target="_blank">https://doi.org/10.1016/j.jhydrol.2014.04.034</a>, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib3"><label>3</label><mixed-citation> Álvarez-Cabria, M., Barquín, J., and Peñas, F. J.: Modelling the spatial
and seasonal variability of water quality for entire river networks: Relationships with natural
and anthropogenic factors, Sci. Total Environ., 545–546, 152–162,
<a href="https://doi.org/10.1016/j.scitotenv.2015.12.109" target="_blank">https://doi.org/10.1016/j.scitotenv.2015.12.109</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib4"><label>4</label><mixed-citation> Barron, O., Donn, M., Furby, S., Chia, J., and Johnstone, C.: Groundwater contribution to
nutrient export from the Ellen Brook catchment, available at:
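<a href="http://www.clw.csiro.au/publications/waterforahealthycountry/2009/wfhc-groundwater-Ellen-Brook-catchment.pdf" target="_blank">http://www.clw.csiro.au/publications/waterforahealthycountry/2009/wfhc-groundwater-Ellen-Brook-catchment.pdf</a> (last access: 9 September 2020), 2009.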
</mixed-citation></ref-html>
<ref-html id="bib1.bib5"><label>5</label><mixed-citation> Belgiu, M. and Drăguţ, L.: Random forest in remote sensing: A review of applications
and future directions, ISPRS J. Photogramm. Remote Sens., 114, 24–31,
<a href="https://doi.org/10.1016/j.isprsjprs.2016.01.011" target="_blank">https://doi.org/10.1016/j.isprsjprs.2016.01.011</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib6"><label>6</label><mixed-citation> Bernal, S., Butturini, A., and Sabater, F.: Seasonal variations of dissolved nitrogen and
DOC&thinsp;:&thinsp;DON ratios in an intermittent Mediterranean stream, Biogeochemistry, 75, 351–372,
<a href="https://doi.org/10.1007/s10533-005-1246-7" target="_blank">https://doi.org/10.1007/s10533-005-1246-7</a>, 2005.
</mixed-citation></ref-html>
<ref-html id="bib1.bib7"><label>7</label><mixed-citation> Bourke, S., Hammond, M., and Clohessy, S.: Perth Shallow Groundwater Systems Investigation:
North Lake, available at:
<a href="https://www.water.wa.gov.au/__data/assets/pdf_file/0016/7432/108960.pdf" target="_blank">https://www.water.wa.gov.au/__data/assets/pdf_file/0016/7432/108960.pdf</a> (last access: 9 September 2020), 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib8"><label>8</label><mixed-citation> Breiman, L.: Random forests, Mach. Learn., 45, 5–32, 2001.
</mixed-citation></ref-html>
<ref-html id="bib1.bib9"><label>9</label><mixed-citation> Breiman, L., Friedman, J., Stone, C. J., and Olshen, R. A.: Classification and regression
trees, CRC Press, Boca Raton, 1984.
</mixed-citation></ref-html>
<ref-html id="bib1.bib10"><label>10</label><mixed-citation>
Brodie, R. and Hostetler, S.: A review of techniques for analysing baseflow from stream hydrographs, in: Proceedings of the NZHS-IAHNZSSS 2005 Conference, Auckland, New Zealand, 2005.
</mixed-citation></ref-html>
<ref-html id="bib1.bib11"><label>11</label><mixed-citation> Burt, T. P. and Pinay, G.: Linking hydrology and biogeochemistry, Prog.  Phys. Geogr.,
3, 297–316, <a href="https://doi.org/10.1067/mva.2002.123763" target="_blank">https://doi.org/10.1067/mva.2002.123763</a>, 2005.
</mixed-citation></ref-html>
<ref-html id="bib1.bib12"><label>12</label><mixed-citation> Chanat, J. G., Rice, K. C., and Hornberger, G. M.: Consistency of patterns in
concentration-discharge plots, Water Resour. Res., 38, 10–22, <a href="https://doi.org/10.1029/2001WR000971" target="_blank">https://doi.org/10.1029/2001WR000971</a>, 2002.
</mixed-citation></ref-html>
<ref-html id="bib1.bib13"><label>13</label><mixed-citation> Chen, Y., Liu, R., Sun, C., Zhang, P., Feng, C., and Shen, Z.: Spatial and temporal
variations in nitrogen and phosphorous nutrients in the Yangtze River Estuary, Mar. Pollut. Bull.,
64, 2083–2089, <a href="https://doi.org/10.1016/j.marpolbul.2012.07.020" target="_blank">https://doi.org/10.1016/j.marpolbul.2012.07.020</a>, 2012.
</mixed-citation></ref-html>
<ref-html id="bib1.bib14"><label>14</label><mixed-citation> Clapcott, J. E., Collier, K. J., Death, R. G., Goodwin, E. O., Harding, J.  S., Kelly,
D., Leathwick, J. R., and Young, R. G.: Quantifying relationships between land-use gradients and
structural and functional indicators of stream ecological integrity, Freshw. Biol., 57, 74–90,
<a href="https://doi.org/10.1111/j.1365-2427.2011.02696.x" target="_blank">https://doi.org/10.1111/j.1365-2427.2011.02696.x</a>, 2012.
</mixed-citation></ref-html>
<ref-html id="bib1.bib15"><label>15</label><mixed-citation> Cohn, T. A., Delong, L. L., Gilroy, E. J., Hirsch, R. M., and Wells, D. K.: Estimating
constituent loads, Water Resour. Res., 25, 937–942, <a href="https://doi.org/10.1029/WR025i005p00937" target="_blank">https://doi.org/10.1029/WR025i005p00937</a>, 1989.
</mixed-citation></ref-html>
<ref-html id="bib1.bib16"><label>16</label><mixed-citation> Conroy, E., Turner, J. N., Rymszewicz, A., O'Sullivan, J. J., Bruen, M., Lawler, D.,
Lally, H., and Kelly-Quinn, M.: The impact of cattle access on ecological water quality in
streams: Examples from agricultural catchments within Ireland, Sci. Total Environ., 547, 17–29,
<a href="https://doi.org/10.1016/j.scitotenv.2015.12.120" target="_blank">https://doi.org/10.1016/j.scitotenv.2015.12.120</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib17"><label>17</label><mixed-citation> Coopersmith, E. J., Minsker, B., and Montagna, P.: Understanding and forecasting
hypoxia using machine learning algorithms, J. Hydroinformatics, 13, 64,
<a href="https://doi.org/10.2166/hydro.2010.015" target="_blank">https://doi.org/10.2166/hydro.2010.015</a>, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib18"><label>18</label><mixed-citation> Coops, N. C., Waring, R. H., Beier, C., Roy-Jauvin, R., and Wang, T.: Modeling the
occurrence of 15 coniferous tree species throughout the Pacific Northwest of North America using a
hybrid approach of a generic process-based growth model and decision tree analysis,
Appl. Veg. Sci., 14, 402–414, <a href="https://doi.org/10.1111/j.1654-109X.2011.01125.x" target="_blank">https://doi.org/10.1111/j.1654-109X.2011.01125.x</a>, 2011.
</mixed-citation></ref-html>
<ref-html id="bib1.bib19"><label>19</label><mixed-citation> Cozzi, S. and Giani, M.: River water and nutrient discharges in the Northern Adriatic
Sea: Current importance and long term changes, Cont. Shelf Res., 31, 1881–1893,
<a href="https://doi.org/10.1016/j.csr.2011.08.010" target="_blank">https://doi.org/10.1016/j.csr.2011.08.010</a>, 2011.
</mixed-citation></ref-html>
<ref-html id="bib1.bib20"><label>20</label><mixed-citation> Crowder, D. W., Demissie, M., and Markus, M.: The accuracy of sediment loads when
log-transformation produces nonlinear sediment load-discharge relationships, J. Hydrol., 336,
250–268, <a href="https://doi.org/10.1016/j.jhydrol.2006.12.024" target="_blank">https://doi.org/10.1016/j.jhydrol.2006.12.024</a>, 2007.
</mixed-citation></ref-html>
<ref-html id="bib1.bib21"><label>21</label><mixed-citation> Cutler, D. R., Edwards, T. C., Beard, K. H., Cutler, A., Hess, K. T., Gibson, J., and
Lawler, J. J.: Random forests for classification in ecology, Ecology, 88, 2783–2792,
<a href="https://doi.org/10.1890/07-0539.1" target="_blank">https://doi.org/10.1890/07-0539.1</a>, 2007.
</mixed-citation></ref-html>
<ref-html id="bib1.bib22"><label>22</label><mixed-citation> Davies-Colley, R. J., Nagels, J. W., Smith, R. A., Young, R. G., and Phillips, C. J.:
Water quality impact of a dairy cow herd crossing a stream, New Zeal. J. Mar. Freshw. Res., 38,
569–576, <a href="https://doi.org/10.1080/00288330.2004.9517262" target="_blank">https://doi.org/10.1080/00288330.2004.9517262</a>, 2004.
</mixed-citation></ref-html>
<ref-html id="bib1.bib23"><label>23</label><mixed-citation> Denvil-Sommer, A., Gehlen, M., Vrac, M., and Mejia, C.: LSCE-FFNN-v1: a two-step neural
network model for the reconstruction of surface ocean <i>p</i>CO<sub>2</sub> over the global ocean,
Geosci. Model Dev., 12, 2091–2105, <a href="https://doi.org/10.5194/gmd-12-2091-2019" target="_blank">https://doi.org/10.5194/gmd-12-2091-2019</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib24"><label>24</label><mixed-citation> Domingues, R. B., Anselmo, T. P., Barbosa, A. B., Sommer, U., and Galvão, H. M.:
Nutrient limitation of phytoplankton growth in the freshwater tidal zone of a turbid,
Mediterranean estuary, Estuar. Coast. Shelf Sci., 91, 282–297, <a href="https://doi.org/10.1016/j.ecss.2010.10.033" target="_blank">https://doi.org/10.1016/j.ecss.2010.10.033</a>,
2011.
</mixed-citation></ref-html>
<ref-html id="bib1.bib25"><label>25</label><mixed-citation> Erdal, H. I. and Karakurt, O.: Advancing monthly streamflow prediction accuracy of CART
models using ensemble learning paradigms, J. Hydrol., 477, 119–128,
<a href="https://doi.org/10.1016/j.jhydrol.2012.11.015" target="_blank">https://doi.org/10.1016/j.jhydrol.2012.11.015</a>, 2013.
</mixed-citation></ref-html>
<ref-html id="bib1.bib26"><label>26</label><mixed-citation> Erlandsson, M., Cory, N., Fölster, J., Köhler, S., Laudon, H., Weyhenmeyer,
G. A., and Bishop, K.: Increasing Dissolved Organic Carbon Redefines the Extent of Surface Water
Acidification and Helps Resolve a Classic Controversy, Bioscience, 61, 614–618,
<a href="https://doi.org/10.1525/bio.2011.61.8.7" target="_blank">https://doi.org/10.1525/bio.2011.61.8.7</a>, 2011.
</mixed-citation></ref-html>
<ref-html id="bib1.bib27"><label>27</label><mixed-citation> Filep, T. and Rékási, M.: Factors controlling dissolved organic carbon (DOC),
dissolved organic nitrogen (DON) and DOC/DON ratio in arable soils based on a dataset from
Hungary, Geoderma, 162, 312–318, <a href="https://doi.org/10.1016/j.geoderma.2011.03.002" target="_blank">https://doi.org/10.1016/j.geoderma.2011.03.002</a>, 2011.
</mixed-citation></ref-html>
<ref-html id="bib1.bib28"><label>28</label><mixed-citation> Forio, M. A. E., Landuyt, D., Bennetsen, E., Lock, K., Nguyen, T. H. T., Ambarita,
M. N. D., Musonge, P. L. S., Boets, P., Everaert, G., Dominguez-Granda, L., and Goethals,
P. L. M.: Bayesian belief network models to analyse and predict ecological water quality in
rivers, Ecol. Modell., 312, 222–238, <a href="https://doi.org/10.1016/j.ecolmodel.2015.05.025" target="_blank">https://doi.org/10.1016/j.ecolmodel.2015.05.025</a>, 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib29"><label>29</label><mixed-citation> Friedman, J.: Greedy Function Approximation: A Gradient Boosting Machine, Ann. Stat.,
29, 1189–1232, <a href="https://doi.org/10.1214/009053606000000795" target="_blank">https://doi.org/10.1214/009053606000000795</a>, 2001.
</mixed-citation></ref-html>
<ref-html id="bib1.bib30"><label>30</label><mixed-citation> Friedman, J. H.: Stochastic gradient boosting, Comput. Stat. Data Anal., 38, 367–378,
<a href="https://doi.org/10.1016/S0167-9473(01)00065-2" target="_blank">https://doi.org/10.1016/S0167-9473(01)00065-2</a>, 2002.
</mixed-citation></ref-html>
<ref-html id="bib1.bib31"><label>31</label><mixed-citation> Fuka, D., Walter, T., Archibald, J., Tammo, S., and Easton, Z.: EcoHydRology: A Community Modeling Foundation for Eco-Hydrology, R package version 0.4.12.1, available at: <a href="https://cran.r-project.org/web/packages/EcoHydRology" target="_blank">https://cran.r-project.org/web/packages/EcoHydRology</a> (last access: 9 September 2020), 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib32"><label>32</label><mixed-citation> Furey, P. R. and Gupta, V. K.: A physically based filter for separating base flow from
streamflow time series, Water Resour. Res., 37, 2709–2722, <a href="https://doi.org/10.1029/2001WR000243" target="_blank">https://doi.org/10.1029/2001WR000243</a>, 2001.
</mixed-citation></ref-html>
<ref-html id="bib1.bib33"><label>33</label><mixed-citation> Giblin, A. E., Weston, N. B., Banta, G. T., Tucker, J., and Hopkinson, C. S.: The
Effects of Salinity on Nitrogen Losses from an Oligohaline Estuarine Sediment, Estuar. Coast., 33,
1054–1068, <a href="https://doi.org/10.1007/s12237-010-9280-7" target="_blank">https://doi.org/10.1007/s12237-010-9280-7</a>, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib34"><label>34</label><mixed-citation> Górniak, A., Zieliński, P., Jekatierynczuk-Rudczyk, E., Grabowska, M., and
Suchowolec, T.: The role of dissolved organic carbon in a shallow lowland reservoir ecosystem – A
long-term study, Acta Hydrochim. Hydrobiol., 30, 179–189, <a href="https://doi.org/10.1002/aheh.200390001" target="_blank">https://doi.org/10.1002/aheh.200390001</a>, 2002.
</mixed-citation></ref-html>
<ref-html id="bib1.bib35"><label>35</label><mixed-citation> Green, C. T., Bekins, B. A., Kalkhoff, S. J., Hirsch, R. M., Liao, L., and Barnes, K. K.:
Decadal surface water quality trends under variable climate, land use, and hydrogeochemical
setting in Iowa, USA, Water Resour. Res., 50, 2425–2443, <a href="https://doi.org/10.1002/2013WR014829" target="_blank">https://doi.org/10.1002/2013WR014829</a>, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib36"><label>36</label><mixed-citation> Greening, H., Janicki, A., Sherwood, E. T., Pribble, R., and Johansson, J. O. R.:
Ecosystem responses to long-term nutrient management in an urban estuary: Tampa Bay, Florida, USA,
Estuar. Coast. Shelf Sci., 151, A1–A16, <a href="https://doi.org/10.1016/j.ecss.2014.10.003" target="_blank">https://doi.org/10.1016/j.ecss.2014.10.003</a>, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib37"><label>37</label><mixed-citation> Gunaratne, G. L., Vogwill, R. I. J., and Hipsey, M. R.: Effect of seasonal flushing on
nutrient export characteristics of an urbanizing, remote, ungauged coastal catchment,
Hydrol. Sci. J., 62, 800–817, <a href="https://doi.org/10.1080/02626667.2016.1264585" target="_blank">https://doi.org/10.1080/02626667.2016.1264585</a>, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib38"><label>38</label><mixed-citation> Guo, D., Lintern, A., Webb, J. A., Ryu, D., Liu, S., Bende-Michl, U., Leahy, P.,
Wilson, P., and Western, A. W.: Key Factors Affecting Temporal Variability in Stream Water
Quality, Water Resour. Res., 55, 112–129, <a href="https://doi.org/10.1029/2018WR023370" target="_blank">https://doi.org/10.1029/2018WR023370</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib39"><label>39</label><mixed-citation> Halliday, S. J., Wade, A. J., Skeffington, R. A., Neal, C., Reynolds, B., Rowland, P.,
Neal, M., and Norris, D.: An analysis of long-term trends, seasonality and short-term dynamics in
water quality data from Plynlimon, Wales, Sci. Total Environ., 434, 186–200,
<a href="https://doi.org/10.1016/j.scitotenv.2011.10.052" target="_blank">https://doi.org/10.1016/j.scitotenv.2011.10.052</a>, 2012.
</mixed-citation></ref-html>
<ref-html id="bib1.bib40"><label>40</label><mixed-citation> Heathwaite, A. L.: Multiple stressors on water availability at global to catchment
scales: Understanding human impact on nutrient cycles to protect water quality and water
availability in the long term, Freshw. Biol., 55, 241–257,
<a href="https://doi.org/10.1111/j.1365-2427.2009.02368.x" target="_blank">https://doi.org/10.1111/j.1365-2427.2009.02368.x</a>, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib41"><label>41</label><mixed-citation> Herndon, E. M., Dere, A. L., Sullivan, P. L., Norris, D., Reynolds, B., and Brantley,
S. L.: Landscape heterogeneity drives contrasting concentration–discharge relationships in shale
headwater catchments, Hydrol. Earth Syst. Sci., 19, 3333–3347, <a href="https://doi.org/10.5194/hess-19-3333-2015" target="_blank">https://doi.org/10.5194/hess-19-3333-2015</a>,
2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib42"><label>42</label><mixed-citation> Hirsch, R. M. and De Cicco, L.: User guide to Exploration and Graphics for RivEr Trends
(EGRET) and dataRetrieval: R packages for hydrologic data, Tech. Methods B, 4, 93,
<a href="https://doi.org/10.3133/tm4A10" target="_blank">https://doi.org/10.3133/tm4A10</a>, 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib43"><label>43</label><mixed-citation> Hirsch, R. M., Moyer, D. L., and Archfield, S. A.: Weighted regressions on time,
discharge, and season (WRTDS), with an application to Chesapeake Bay river inputs, J. Am. Water
Resour. Assoc., 46, 857–880, <a href="https://doi.org/10.1111/j.1752-1688.2010.00482.x" target="_blank">https://doi.org/10.1111/j.1752-1688.2010.00482.x</a>, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib44"><label>44</label><mixed-citation> Holguin-Gonzalez, J. E., Everaert, G., Boets, P., Galvis, A., and Goethals, P. L. M.:
Development and application of an integrated ecological modelling framework to analyze the impact
of wastewater discharges on the ecological water quality of rivers, Environ. Model. Softw., 48,
27–36, <a href="https://doi.org/10.1016/j.envsoft.2013.06.004" target="_blank">https://doi.org/10.1016/j.envsoft.2013.06.004</a>, 2013.
</mixed-citation></ref-html>
<ref-html id="bib1.bib45"><label>45</label><mixed-citation> Huang, P., Trayler, K., Wang, B., Saeed, A., Oldham, C., Busch, B., and Hipsey, M.: An
integrated modelling system for water quality forecasting in an urban eutrophic estuary: The
Swan-Canning Estuary virtual observatory, J. Mar. Syst., 199, 103218,
<a href="https://doi.org/10.1016/j.jmarsys.2019.103218" target="_blank">https://doi.org/10.1016/j.jmarsys.2019.103218</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib46"><label>46</label><mixed-citation> Hunter, J. M., Maier, H. R., Gibbs, M. S., Foale, E. R., Grosvenor, N. A., Harders,
N. P., and Kikuchi-Miller, T. C.: Framework for developing hybrid process-driven, artificial
neural network and regression models for salinity prediction in river systems, Hydrol. Earth
Syst. Sci., 22, 2987–3006, <a href="https://doi.org/10.5194/hess-22-2987-2018" target="_blank">https://doi.org/10.5194/hess-22-2987-2018</a>, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib47"><label>47</label><mixed-citation> Ishwaran, H. and Kogalur, U. B.: Consistency of random survival forests,
Stat. Probab. Lett., 80, 1056–1064, <a href="https://doi.org/10.1016/j.spl.2010.02.020" target="_blank">https://doi.org/10.1016/j.spl.2010.02.020</a>, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib48"><label>48</label><mixed-citation> Ismail, R. and Mutanga, O.: A comparison of regression tree ensembles: Predicting Sirex
noctilio induced water stress in Pinus patula forests of KwaZulu-Natal, South Africa,
Int. J. Appl. Earth Obs. Geoinf., 12, S45–S51, <a href="https://doi.org/10.1016/j.jag.2009.09.004" target="_blank">https://doi.org/10.1016/j.jag.2009.09.004</a>, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib49"><label>49</label><mixed-citation> Jickells, T. D., Andrews, J. E., Parkes, D. J., Suratman, S., Aziz, A. A., and Hee,
Y. Y.: Nutrient transport through estuaries: The importance of the estuarine geography,
Estuar. Coast. Shelf Sci., 150, 215–229, <a href="https://doi.org/10.1016/j.ecss.2014.03.014" target="_blank">https://doi.org/10.1016/j.ecss.2014.03.014</a>, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib50"><label>50</label><mixed-citation> Jordan, P. and Cassidy, R.: Technical Note: Assessing a 24/7 solution for monitoring
water quality loads in small river catchments, Hydrol. Earth Syst. Sci., 15, 3093–3100,
<a href="https://doi.org/10.5194/hess-15-3093-2011" target="_blank">https://doi.org/10.5194/hess-15-3093-2011</a>, 2011.
</mixed-citation></ref-html>
<ref-html id="bib1.bib51"><label>51</label><mixed-citation> Kaiser, D., Unger, D., Qiu, G., Zhou, H., and Gan, H.: Natural and human influences on
nutrient transport through a small subtropical Chinese estuary, Sci. Total Environ., 450–451,
92–107, <a href="https://doi.org/10.1016/j.scitotenv.2013.01.096" target="_blank">https://doi.org/10.1016/j.scitotenv.2013.01.096</a>, 2013.
</mixed-citation></ref-html>
<ref-html id="bib1.bib52"><label>52</label><mixed-citation> Kelsey, P., Hall, J., Kitsios, A., Quinton, B., and Shakya, D.: Hydrological and
nutrient modelling of the Swan-Canning coastal catchments, Water Science technical series,
Department of Water, Western Australia, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib53"><label>53</label><mixed-citation> Kelsey, P., Hall, J., Kretschmer, P., Quinton, B., and Shakya, D.: Hydrological and
nutrient modelling of the Peel-Harvey catchment, Water Science Technical Series, Department of
Water, Western Australia, 2011.
</mixed-citation></ref-html>
<ref-html id="bib1.bib54"><label>54</label><mixed-citation> Lamsal, S., Grunwald, S., Bruland, G. L., Bliss, C. M., and Comerford, N. B.: Regional
hybrid geospatial modeling of soil nitrate-nitrogen in the Santa Fe River Watershed, Geoderma,
135, 233–247, <a href="https://doi.org/10.1016/j.geoderma.2005.12.009" target="_blank">https://doi.org/10.1016/j.geoderma.2005.12.009</a>, 2006.
</mixed-citation></ref-html>
<ref-html id="bib1.bib55"><label>55</label><mixed-citation> Li, J.: Assessing spatial predictive models in the environmental sciences: Accuracy
measures, data variation and variance explained, Environ. Model. Softw., 80, 1–8,
<a href="https://doi.org/10.1016/j.envsoft.2016.02.004" target="_blank">https://doi.org/10.1016/j.envsoft.2016.02.004</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib56"><label>56</label><mixed-citation> Li, M., Xu, K., Watanabe, M., and Chen, Z.: Long-term variations in dissolved silicate,
nitrogen, and phosphorus flux from the Yangtze River into the East China Sea and impacts on
estuarine ecosystem, Estuar. Coast. Shelf Sci., 71, 3–12, <a href="https://doi.org/10.1016/j.ecss.2006.08.013" target="_blank">https://doi.org/10.1016/j.ecss.2006.08.013</a>, 2007.
</mixed-citation></ref-html>
<ref-html id="bib1.bib57"><label>57</label><mixed-citation> Li, M., Lee, Y. J., Testa, J. M., Li, Y., Ni, W., Kemp, W. M., and Di Toro, D. M.: What
drives interannual variability of hypoxia in Chesapeake Bay: Climate forcing versus nutrient
loading?, Geophys. Res. Lett., 43, 2127–2134, <a href="https://doi.org/10.1002/2015GL067334" target="_blank">https://doi.org/10.1002/2015GL067334</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib58"><label>58</label><mixed-citation> Li, R., Liu, S., Zhang, G., Ren, J., and Zhang, J.: Biogeochemistry of nutrients in an
estuary affected by human activities: The Wanquan River estuary, eastern Hainan Island, China,
Cont. Shelf Res., 57, 18–31, <a href="https://doi.org/10.1016/j.csr.2012.02.013" target="_blank">https://doi.org/10.1016/j.csr.2012.02.013</a>, 2013.
</mixed-citation></ref-html>
<ref-html id="bib1.bib59"><label>59</label><mixed-citation> Lintern, A., Webb, J. A., Ryu, D., Liu, S., Waters, D., Leahy, P., Bende-Michl, U., and
Western, A. W.: What are the key catchment characteristics affecting spatial differences in
riverine water quality?, Water Resour. Res., 54, 7252–7272, <a href="https://doi.org/10.1029/2017WR022172" target="_blank">https://doi.org/10.1029/2017WR022172</a>, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib60"><label>60</label><mixed-citation> Liu, S. M., Li, L. W., Zhang, G. L., Liu, Z., Yu, Z., and Ren, J. L.: Impacts of human
activities on nutrient transports in the Huanghe (Yellow River) estuary, J. Hydrol., 430–431,
103–110, <a href="https://doi.org/10.1016/j.jhydrol.2012.02.005" target="_blank">https://doi.org/10.1016/j.jhydrol.2012.02.005</a>, 2012.
</mixed-citation></ref-html>
<ref-html id="bib1.bib61"><label>61</label><mixed-citation> Lloyd, C. E. M., Freer, J. E., Collins, A. L., Johnes, P. J., and Jones, J. I.:
Methods for detecting change in hydrochemical time series in response to targeted pollutant
mitigation in river catchments, J. Hydrol., 514, 297–312, <a href="https://doi.org/10.1016/j.jhydrol.2014.04.036" target="_blank">https://doi.org/10.1016/j.jhydrol.2014.04.036</a>,
2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib62"><label>62</label><mixed-citation> Lyne, V. and Hollick, M.: Stochastic time-variable
rainfall-runoff modelling, Institution of Engineers, Canberra, Australia, pp. 89–93, 1979.
</mixed-citation></ref-html>
<ref-html id="bib1.bib63"><label>63</label><mixed-citation> Maier, H. R., Kapelan, Z., Kasprzyk, J., Kollat, J., Matott, L. S., Cunha, M. C.,
Dandy, G. C., Gibbs, M. S., Keedwell, E., Marchi, A., Ostfeld, A., Savic, D., Solomatine, D. P.,
Vrugt, J. A., Zecchin, A. C., Minsker, B. S., Barbour, E. J., Kuczera, G., Pasha, F., Castelletti,
A., Giuliani, M., and Reed, P. M.: Evolutionary algorithms and other metaheuristics in water
resources: current status, research challenges and future directions, Environ. Model. Softw., 62,
271–299, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib64"><label>64</label><mixed-citation> Makler-Pick, V., Gal, G., Gorfine, M., Hipsey, M. R., and Carmel, Y.: Sensitivity
analysis for complex ecological models – A new approach, Environ. Model. Softw., 26, 124–134,
<a href="https://doi.org/10.1016/j.envsoft.2010.06.010" target="_blank">https://doi.org/10.1016/j.envsoft.2010.06.010</a>, 2011.
</mixed-citation></ref-html>
<ref-html id="bib1.bib65"><label>65</label><mixed-citation> Martínez-Rojas, M., Marín, N., and Vila, M. A.: The role of information
technologies to address data handling in construction project management, J. Comput. Civ. Eng.,
30, 1–10, <a href="https://doi.org/10.1061/(ASCE)CP.1943-5487.0000538" target="_blank">https://doi.org/10.1061/(ASCE)CP.1943-5487.0000538</a>, 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib66"><label>66</label><mixed-citation> McBratney, A. B., Odeh, I. O. A., Bishop, T. F. A., Dunbar, M. S., and Shatar, T. M.:
An overview of pedometric techniques for use in soil survey, Geoderma, 97, 293–327,
<a href="https://doi.org/10.1016/S0016-7061(00)00043-4" target="_blank">https://doi.org/10.1016/S0016-7061(00)00043-4</a>, 2000.
</mixed-citation></ref-html>
<ref-html id="bib1.bib67"><label>67</label><mixed-citation> Mellander, P. E., Melland, A. R., Jordan, P., Wall, D. P., Murphy, P. N. C., and
Shortle, G.: Quantifying nutrient transfer pathways in agricultural catchments using high temporal
resolution data, Environ. Sci. Policy, 24, 44–57, <a href="https://doi.org/10.1016/j.envsci.2012.06.004" target="_blank">https://doi.org/10.1016/j.envsci.2012.06.004</a>, 2012.
</mixed-citation></ref-html>
<ref-html id="bib1.bib68"><label>68</label><mixed-citation> Meshgi, A., Schmitter, P., Chui, T. F. M., and Babovic, V.: Development of a modular
streamflow model to quantify runoff contributions from different land uses in tropical urban
environments using Genetic Programming, J. Hydrol., 525, 711–723,
<a href="https://doi.org/10.1016/j.jhydrol.2015.04.032" target="_blank">https://doi.org/10.1016/j.jhydrol.2015.04.032</a>, 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib69"><label>69</label><mixed-citation> Meybeck, M. and Moatar, F.: Daily variability of river concentrations and fluxes:
Indicators based on the segmentation of the rating curve, Hydrol. Process., 26, 1188–1207,
<a href="https://doi.org/10.1002/hyp.8211" target="_blank">https://doi.org/10.1002/hyp.8211</a>, 2012.
</mixed-citation></ref-html>
<ref-html id="bib1.bib70"><label>70</label><mixed-citation> Moatar, F., Abbott, B. W., Minaudo, C., Curie, F., and Pinay, G.: Elemental properties,
hydrology, and biology interact to shape concentration-discharge curves for carbon, nutrients,
sediment, and major ions, Water Resour. Res., 53, 1270–1287, <a href="https://doi.org/10.1002/2016WR019635" target="_blank">https://doi.org/10.1002/2016WR019635</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib71"><label>71</label><mixed-citation> Nathan, R. J. and McMahon, T. A.: Evaluation of automated techniques for base flow and
recession analyses, Water Resour. Res., 26, 1465–1473, <a href="https://doi.org/10.1029/WR026i007p01465" target="_blank">https://doi.org/10.1029/WR026i007p01465</a>,
1990.
</mixed-citation></ref-html>
<ref-html id="bib1.bib72"><label>72</label><mixed-citation> Nice, H., Foulsham, G., Bree, M., and Sarah, E.: A baseline study of contaminants in the sediments of the Swan and Canning
estuaries, Water Science technical series report no. 6, Department of Water, Western Australia, 2009.
</mixed-citation></ref-html>
<ref-html id="bib1.bib73"><label>73</label><mixed-citation> Noori, N. and Kalin, L.: Coupling SWAT and ANN models for enhanced daily streamflow
prediction, J. Hydrol., 533, 141–151, <a href="https://doi.org/10.1016/j.jhydrol.2015.11.050" target="_blank">https://doi.org/10.1016/j.jhydrol.2015.11.050</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib74"><label>74</label><mixed-citation> Petrone, K. C.: Catchment export of carbon, nitrogen, and phosphorus across an
agro-urban land use gradient, Swan-Canning River system, southwestern Australia, J. Geophys. Res.,
115, G01016, <a href="https://doi.org/10.1029/2009JG001051" target="_blank">https://doi.org/10.1029/2009JG001051</a>, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib75"><label>75</label><mixed-citation> Petrone, K. C., Richards, J. S., and Grierson, P. F.: Bioavailability and composition
of dissolved organic carbon and nitrogen in a near coastal catchment of south-western Australia,
Biogeochemistry, 92, 27–40, 2009.
</mixed-citation></ref-html>
<ref-html id="bib1.bib76"><label>76</label><mixed-citation> Povak, N. A., Hessburg, P. F., McDonnell, T. C., Reynolds, K. M., Sullivan, T. J.,
Salter, R. B., and Cosby, B. J.: Machine learning and linear regression models to predict
catchment-level base cation weathering rates across the southern Appalachian Mountain region, USA,
Water Resour. Res., 50, 2798–2814, <a href="https://doi.org/10.1002/2013WR014222" target="_blank">https://doi.org/10.1002/2013WR014222</a>, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib77"><label>77</label><mixed-citation> Puissant, A., Rougier, S., and Stumpf, A.: Object-oriented mapping of urban trees using
Random Forest classifiers, Int. J. Appl. Earth Obs. Geoinf., 26, 235–245,
<a href="https://doi.org/10.1016/j.jag.2013.07.002" target="_blank">https://doi.org/10.1016/j.jag.2013.07.002</a>, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib78"><label>78</label><mixed-citation> Ruibal-Conti, A. L., Summers, R., Weaver, D., and Hipsey, M. R.: Hydro-climatological
non-stationarity shifts patterns of nutrient delivery to an estuarine system, Hydrol. Earth
Syst. Sci. Discuss., 10, 11035–11092, <a href="https://doi.org/10.5194/hessd-10-11035-2013" target="_blank">https://doi.org/10.5194/hessd-10-11035-2013</a>, 2013.
</mixed-citation></ref-html>
<ref-html id="bib1.bib79"><label>79</label><mixed-citation> Schnoor, J. L.: 4.1. Water quality and its sustainability introduction, in: Comprehensive Water Quality and Purification, edited by: Ahuja, S., Elsevier, Waltham, pp. 1–40, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib80"><label>80</label><mixed-citation> Seitzinger, S. P., Sanders, R. W., and Styles, R.: Bioavailability of DON from natural
and anthropogenic sources to estuarine plankton, Limnol. Oceanogr., 47, 353–366,
<a href="https://doi.org/10.4319/lo.2002.47.2.0353" target="_blank">https://doi.org/10.4319/lo.2002.47.2.0353</a>, 2002.
</mixed-citation></ref-html>
<ref-html id="bib1.bib81"><label>81</label><mixed-citation> Singh, K. P., Gupta, S., and Mohan, D.: Evaluating influences of seasonal variations
and anthropogenic activities on alluvial groundwater hydrochemistry using ensemble learning
approaches, J. Hydrol., 511, 254–266, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib82"><label>82</label><mixed-citation> Staehr, P. A., Testa, J., and Carstensen, J.: Decadal Changes in Water Quality and Net
Productivity of a Shallow Danish Estuary Following Significant Nutrient Reductions,
Estuar. Coast., 40, 63–79, <a href="https://doi.org/10.1007/s12237-016-0117-x" target="_blank">https://doi.org/10.1007/s12237-016-0117-x</a>, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib83"><label>83</label><mixed-citation> Stallard, R. F. and Murphy, S. F.: A Unified Assessment of Hydrologic and
Biogeochemical Responses in Research Watersheds in Eastern Puerto Rico Using Runoff-Concentration
Relations, Aquat. Geochemistry, 20, 115–139, <a href="https://doi.org/10.1007/s10498-013-9216-5" target="_blank">https://doi.org/10.1007/s10498-013-9216-5</a>, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib84"><label>84</label><mixed-citation> Swan River Trust: Swan Canning Water Quality Improvement, 2009.
</mixed-citation></ref-html>
<ref-html id="bib1.bib85"><label>85</label><mixed-citation> Szilagyi, J. and Parlange, M. B.: Baseflow separation based on analytical solutions of
the Boussinesq equation, J. Hydrol., 204, 251–260, <a href="https://doi.org/10.1016/S0022-1694(97)00132-7" target="_blank">https://doi.org/10.1016/S0022-1694(97)00132-7</a>, 1998.
</mixed-citation></ref-html>
<ref-html id="bib1.bib86"><label>86</label><mixed-citation> Tao, Y., Wei, M., Ongley, E., Li, Z., and Jingsheng, C.: Long-term variations and
causal factors in nitrogen and phosphorus transport in the Yellow River, China,
Estuar. Coast. Shelf Sci., 86, 345–351, <a href="https://doi.org/10.1016/j.ecss.2009.05.014" target="_blank">https://doi.org/10.1016/j.ecss.2009.05.014</a>, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib87"><label>87</label><mixed-citation> Tesoriero, A. J., Duff, J. H., Wolock, D. M., Spahr, N. E., and Almendinger, J. E.:
Identifying Pathways and Processes Affecting Nitrate and Orthophosphate Inputs to Streams in
Agricultural Watersheds, J. Environ. Qual., 38, 1892, <a href="https://doi.org/10.2134/jeq2008.0484" target="_blank">https://doi.org/10.2134/jeq2008.0484</a>, 2009.
</mixed-citation></ref-html>
<ref-html id="bib1.bib88"><label>88</label><mixed-citation> Testa, J. M., Clark, J. B., Dennison, W. C., Donovan, E. C., Fisher, A. W., Ni, W.,
Parker, M., Scavia, D., Spitzer, S. E., Waldrop, A. M., Vargas, V. M. D., and Ziegler, G.:
Ecological Forecasting and the Science of Hypoxia in Chesapeake Bay, Bioscience, 67, 614–626,
<a href="https://doi.org/10.1093/biosci/bix048" target="_blank">https://doi.org/10.1093/biosci/bix048</a>, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib89"><label>89</label><mixed-citation>
Wang, B.: benyawang-uwa/daily-nutrient-prediction: first release of daily nutrient prediction model (Version v1.0.0), Zenodo, <a href="https://doi.org/10.5281/zenodo.3739611" target="_blank">https://doi.org/10.5281/zenodo.3739611</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib90"><label>90</label><mixed-citation> Wang, B., Hipsey, M. R., Ahmed, S., and Oldham, C.: The Impact of Landscape
Characteristics on Groundwater Dissolved Organic Nitrogen: Insights From Machine Learning Methods
and Sensitivity Analysis, Water Resour. Res., 54, 4785–4804, <a href="https://doi.org/10.1029/2017WR021749" target="_blank">https://doi.org/10.1029/2017WR021749</a>, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib91"><label>91</label><mixed-citation> Yang, P., Yang, Y. H., Zhou, B. B., and Zomaya, A. Y.: A Review of Ensemble Methods in
Bioinformatics, Curr. Bioinf., 5, 296–308, <a href="https://doi.org/10.2174/157489310794072508" target="_blank">https://doi.org/10.2174/157489310794072508</a>, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib92"><label>92</label><mixed-citation> Zhang, Q., Harman, C. J., and Ball, W. P.: An improved method for interpretation of
riverine concentration-discharge relationships indicates long-term shifts in reservoir sediment
trapping, Geophys. Res. Lett., 43, 10215–10224, <a href="https://doi.org/10.1002/2016GL069945" target="_blank">https://doi.org/10.1002/2016GL069945</a>, 2016a.
</mixed-citation></ref-html>
<ref-html id="bib1.bib93"><label>93</label><mixed-citation> Zhang, Q., Ball, W. P., and Moyer, D. L.: Decadal-scale export of nitrogen, phosphorus,
and sediment from the Susquehanna River basin, USA: Analysis and synthesis of temporal and spatial
patterns, Sci. Total Environ., 563–564, 1016–1029, <a href="https://doi.org/10.1016/j.scitotenv.2016.03.104" target="_blank">https://doi.org/10.1016/j.scitotenv.2016.03.104</a>, 2016b.
</mixed-citation></ref-html>
<ref-html id="bib1.bib94"><label>94</label><mixed-citation> Zhang, Q., Hirsch, R. M., and Ball, W. P.: Long-Term Changes in Sediment and Nutrient
Delivery from Conowingo Dam to Chesapeake Bay: Effects of Reservoir Sedimentation,
Environ. Sci. Technol., 50, 1877–1886, <a href="https://doi.org/10.1021/acs.est.5b04073" target="_blank">https://doi.org/10.1021/acs.est.5b04073</a>, 2016c.
</mixed-citation></ref-html>--></article>
