<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" dtd-version="3.0"><?xmltex \makeatother\@nolinetrue\makeatletter?>
  <front>
    <journal-meta>
<journal-id journal-id-type="publisher">GMD</journal-id>
<journal-title-group>
<journal-title>Geoscientific Model Development</journal-title>
<abbrev-journal-title abbrev-type="publisher">GMD</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">Geosci. Model Dev.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">1991-9603</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>

    <article-meta>
      <article-id pub-id-type="doi">10.5194/gmd-10-3391-2017</article-id><title-group><article-title>A Bayesian framework based on a Gaussian mixture model and radial-basis-function Fisher discriminant analysis (BayGmmKda V1.1) for spatial prediction of floods</article-title>
      </title-group><?xmltex \runningtitle{BayGmmKda V1.1}?><?xmltex \runningauthor{D. Tien Bui and N.-D. Hoang}?>
      <contrib-group>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Tien Bui</surname><given-names>Dieu</given-names></name>
          
        <ext-link>https://orcid.org/0000-0001-5161-6479</ext-link></contrib>
        <contrib contrib-type="author" corresp="yes" rid="aff2">
          <name><surname>Hoang</surname><given-names>Nhat-Duc</given-names></name>
          <email>hoangnhatduc@dtu.edu.vn</email>
        </contrib>
        <aff id="aff1"><label>1</label><institution>Geographic Information System Group, Department of Business
and IT, University College of Southeast Norway (USN),
Gullbringvegen 36, 3800, Bø i Telemark, Norway</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>Faculty of Civil Engineering, Institute of Research and Development,
Duy Tan University, <?xmltex \hack{\break}?>P809 – K7/25 Quang Trung, Danang, Vietnam</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Nhat-Duc Hoang (hoangnhatduc@dtu.edu.vn)</corresp></author-notes><pub-date><day>14</day><month>September</month><year>2017</year></pub-date>
      
      <volume>10</volume>
      <issue>9</issue>
      <fpage>3391</fpage><lpage>3409</lpage>
      <history>
        <date date-type="received"><day>19</day><month>December</month><year>2016</year></date>
           <date date-type="rev-request"><day>17</day><month>January</month><year>2017</year></date>
           <date date-type="rev-recd"><day>17</day><month>July</month><year>2017</year></date>
           <date date-type="accepted"><day>10</day><month>August</month><year>2017</year></date>
      </history>
      <permissions>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/3.0/">https://creativecommons.org/licenses/by/3.0/</ext-link></license-p>
</license>
</permissions><self-uri xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017.html">This article is available from https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017.html</self-uri>
<self-uri xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017.pdf">The full text article is available as a PDF file from https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017.pdf</self-uri>


      <abstract>
    <p>In this study, a probabilistic model, named BayGmmKda, is
proposed for flood susceptibility assessment in a study area in central
Vietnam. The new model is a Bayesian framework constructed from a combination
of a Gaussian mixture model (GMM), radial-basis-function Fisher discriminant
analysis (RBFDA), and a geographic information system (GIS) database. In the
Bayesian framework, GMM is used for modeling the data distribution of
flood-influencing factors in the GIS database, whereas RBFDA is utilized to
construct a latent variable that aims at enhancing the model performance. As
a result, the posterior probabilistic output of the BayGmmKda model is used
as a flood susceptibility index. Experimental results showed that the proposed
hybrid framework is superior to other benchmark models, including the
adaptive neuro-fuzzy inference system and the support vector machine. To
facilitate the model implementation, a software program of BayGmmKda has
been developed in MATLAB. The BayGmmKda program can accurately establish a
flood susceptibility map for the study region. Accordingly, local
authorities can overlay this susceptibility map onto various land-use maps
for the purpose of land-use planning or management.</p>
  </abstract>
    </article-meta>
  </front>
<body>
      

      <?xmltex \hack{\newpage}?>
<sec id="Ch1.S1" sec-type="intro">
  <title>Introduction</title>
      <p>Flooding is one of the most destructive natural hazards, causing heavy loss of
human lives and property over immense spatial extents (Dottori et al., 2016;
Komi et al., 2017). Recent statistics on flood damage for the period of
1995–2015 show that flooding affected 109 million people around the globe per
year (Alfieri et al., 2017) and killed more than 220 000 people (Winsemius
et al., 2015). Although the frequency of flooding has decreased in several regions (i.e.,
in central Asia and America), flood occurrences have increased globally by
42 % (Hirabayashi et al., 2013).</p>
      <p>Notably, Southeast Asia is one of the most heavily flood-damaged regions in
the world due to monsoonal rainfalls and tropical hurricane patterns (Loo et
al., 2015). Located in this region, Vietnam is a storm center on the western
Pacific, and this nation has faced the destructive consequences of flooding in
many of its provinces. In Vietnam, floods are often triggered by tropical
cyclones. More than 71 % of Vietnam's population and 59 % of the
total land area of Vietnam are susceptible to the impacts of these natural
hazards (Tien Bui et al., 2016c). Based on a report by Kreft et
al. (2014), from 1994 to 2013, Vietnam endured an annual economic loss that
is equivalent to USD 2.9 billion.</p>
      <p>Additionally, flood occurrences in Vietnam are expected to rise
rapidly in the near future due to poorly planned
infrastructure development and urbanization near watercourses, as well as
increased deforestation and climate change. Hence, an accurate
model for evaluating flood hazards has become a crucial
need for land-use planning as well as for the establishment of disaster mitigation
strategies. Based on flood prediction models, flood-prone areas can be
identified and mapped (Tien Bui et al., 2016c).</p>
      <p>Needless to say, the identification of susceptible areas can significantly
reduce flood damage to the national economy and human lives by avoiding
infrastructure developments and densely populated settlements in highly flood-susceptible areas (Zhou et al., 2016). This identification also
helps government agencies to issue appropriate flood management policies and to focus
their limited financial resources on constructing large-scale flood defense
infrastructure in areas that have great economic value but are highly
susceptible to flooding (Bubeck et al., 2012; Mason et al., 2010). Therefore, a
tool for spatial flood modeling is highly useful.</p>
      <p>To predict flood occurrence, conventional approaches require time series of
meteorological and streamflow data at gauging stations (Machado et al.,
2015). However, this is difficult for many areas in developing countries
where no gauging stations are available. Therefore, new modeling approaches
should be explored and investigated. Given these motivations, this study
proposes a novel methodology designed for achieving a high prediction
accuracy as well as deriving probabilistic evaluations of flood
susceptibility on a regional scale. Accordingly, spatial prediction of
flooding is carried out based on the statistical assumption that future floods
will occur under the same conditions that triggered them in the past (Tien
Bui et al., 2016b). In this way, the flood prediction problem boils down to
an on–off supervised classification task, where flood inventories are used to
define the class of flood occurrence. Moreover, the class of nonflood
occurrence is derived from areas that have not yet been damaged by flooding.
Consequently, spatial prediction of flooding within the study area is achieved
based on the probability of pixels belonging to the class of flood
occurrences. To yield probabilistic outputs of flood susceptibility, this
study proposes a Bayesian framework established on the basis of an
integration of a Gaussian mixture model (GMM) and radial-basis-function Fisher
discriminant analysis (RBFDA). GMM is employed for density approximation to
calculate the posterior probability of flood (flood susceptibility index); in
addition, RBFDA constructs a latent variable based on the geoenvironmental
conditions to enhance the performance of the Bayesian model.</p>
      <p>In essence, the proposed integrated framework contains two phases of
analysis. RBFDA is first employed for latent variable construction. The
Bayesian approach assisted by GMM is then used to perform probabilistic
pattern recognition. The first level performs pattern discriminant analysis
tasks and the second level carries out the prediction process to derive the
model output of flood evaluation. Based on previous studies which indicate
that hierarchical model structures can produce improved prediction
accuracy, the proposed framework could potentially bring about desirable flood
assessment results. The subsequent parts of this study are organized in the
following order: related works on flood prediction are summarized in Sect. 2. The next section introduces the research method of the
current paper, followed by Sect. 4 which describes the proposed
Bayesian model for flood susceptibility forecasting. Section 5 reports
the model prediction accuracy and comparison. The last section presents
the conclusions of this work.</p>
</sec>
<sec id="Ch1.S2">
  <title>A review of related works on flood susceptibility prediction</title>
      <p>Because of the criticality of flood prediction, this problem has gained
increasing attention from the academic community. Following this trend,
various flood analyzing tools have been developed (Winsemius et al., 2013;
Papaioannou et al., 2015; Gao et al., 2017; Alfieri et al., 2014). Basically,
these tools could be classified into statistical analysis, rainfall–runoff
models, and classification models. Statistical analysis uses long-term
recorded time series data at gauged stations to establish regression models;
accordingly, the constructed regression models are used to transfer flood
information to ungauged basins (Yue et al., 1999; Cunnane, 1988; McCuen,
2016). Thus, these models are capable of providing discharge predictions both
in space and time. However, long-term data are not always available; in many
cases, the records are too short for reliable estimation of extreme
quantiles (Seckin et al., 2013b; Nguyen et al., 2014).</p>
      <p>Rainfall–runoff models, which deal with estimation of runoff from rainfall,
are considered to be the most extensively used approach for flood prediction
and management (Nayak et al., 2013; Ciabatta et al., 2016; Bennett et al.,
2016). Various types of rainfall–runoff models can be found in the literature,
ranging from empirical models to highly sophisticated physical-process models.
Empirical models could be established based on statistical techniques (Brocca
et al., 2011; Neal et al., 2013) or advanced machine learning algorithms
(Lohani et al., 2011); such models can be effectively employed to analyze
rainfall and runoff on the basis of historical time series data. In addition,
physical-process models focus on simulating hydrological processes in a
basin based on a set of mathematical equations governing physical processes
of water flow and surfaces (Aronica et al., 2012; Chiew et al., 1993; Beven
et al., 1984; Birkel et al., 2010; Grimaldi et al., 2013). In general,
rainfall–runoff models require relatively long-term time series data at
gauging stations. However, the density of gauging stations in developing
countries is very low and this fact creates a great obstacle to the
establishment of accurate hydrological models (Fenicia et al., 2008). In
addition, large-scale fieldwork and deployment of measuring equipment are
necessary for collecting data.</p>
      <p>In recent years, a new flood modeling approach called “on–off”
classification of flood occurrence has been successfully proposed for spatial
prediction of flood (or alternatively called a flood susceptibility index; Tien Bui et al., 2016d; Tehrany et al., 2014, 2015b). Accordingly, no time
series data are required for the model calibration, and the establishment of
flood models is based on flood inventories (flood class) and nonflood areas
(nonflood class). The probability of a pixel in the study area
belonging to the flood class is then used as the flood susceptibility index. Moreover,
it is noted that the results of the model depend on the collection of
sufficient training data. Although the flood susceptibility map provides no
temporal prediction or return period of floods, it is capable of
delineating highly susceptible areas. Thus, it is a powerful flood analysis
tool that decision-makers can use in land-use planning and flood
management.</p>
      <p>The literature review shows that data-driven methods integrated with GIS
databases have demonstrated their effectiveness and accuracy in large-scale
flood susceptibility predictions. A fuzzy-logic-based algorithm, established by
Pulvirenti et al. (2011), has been used to develop a map of flooded areas
from synthetic aperture radar imagery; this algorithm is used for the
operational flood management system in Italy. A model based on the frequency
ratio approach and GIS for spatial prediction of flooded regions was first
introduced by Lee et al. (2012); the spatial database was constructed by
field surveys and maps of the topography, geology, land cover, and
infrastructure.</p>
      <p>Prediction models with artificial neural networks (ANNs) have been employed for
flood susceptibility evaluation by various scholars (Kia et al., 2012; Seckin
et al., 2013a; Rezaeianzadeh et al., 2014; Radmehr and Araghinejad, 2014);
previous works have shown that an ANN is a capable nonlinear modeling tool.
Nevertheless, ANN learning is prone to overfitting, and its performance has
been shown to be inferior to that of support vector machines (SVMs; Hoang and Pham,
2016). Kazakis et al. (2015) introduced a multicriteria index to assess
flood hazard areas that relies on GIS and the analytic hierarchy process (AHP);
in this methodology, the relative importance of each flood-influencing
factor for the occurrence and severity of flood was determined via AHP.
More recently, support-vector-machine-based flood susceptibility analysis
approaches have been proposed by Tehrany et al. (2015a, b); the research
finding is that SVM is more accurate than other benchmark models, including
the decision tree classifier and the conventional frequency ratio model.</p>
      <p>Mukerji et al. (2009) constructed flood forecasting models based on an
adaptive neuro-fuzzy inference system (ANFIS) and a genetic-algorithm-optimized
ANFIS; experiments demonstrated that ANFIS attained the most desirable
accuracy. Recently, a metaheuristic-optimized neuro-fuzzy inference system,
named MONF, has been introduced by Tien Bui et al. (2016c); this research
pointed out that MONF is more capable than decision tree, ANN, SVM, and
conventional ANFIS methods.</p>
      <p>As can be seen from the literature review, various data-driven and advanced
soft-computing approaches have been proposed to construct different flood
forecasting models. In most previous studies, the flood prediction was
formulated as a binary pattern recognition problem in which the model output
is either flood or no flood. Probabilistic models have rarely been examined
to cope with the complexity as well as the uncertainty of the problem at
hand. Therefore, our research aims to enrich the body of knowledge by
proposing a novel Bayesian probabilistic model to estimate the flood
vulnerability with the use of a GIS database.</p>
</sec>
<sec id="Ch1.S3">
  <title>Research method</title>
<sec id="Ch1.S3.SS1">
  <title>Flood inventory map and flood-influencing factors of the study
area</title>
<sec id="Ch1.S3.SS1.SSS1">
  <title>The study area</title>
      <p>In this research, Tuong Duong district (central Vietnam) is selected as the
study area (see Fig. 1). This is one of the most heavily flood-affected regions in
the country (Reynaud and Nguyen, 2016). The area of the district is
approximately 2803 km<inline-formula><mml:math id="M1" display="inline"><mml:msup><mml:mi/><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:math></inline-formula>. The district is located between the latitudes of
18<inline-formula><mml:math id="M2" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>58<inline-formula><mml:math id="M3" display="inline"><mml:msup><mml:mi/><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>42<inline-formula><mml:math id="M4" display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> N and 19<inline-formula><mml:math id="M5" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>39<inline-formula><mml:math id="M6" display="inline"><mml:msup><mml:mi/><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>16<inline-formula><mml:math id="M7" display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> N and between the
longitudes of 104<inline-formula><mml:math id="M8" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>15<inline-formula><mml:math id="M9" display="inline"><mml:msup><mml:mi/><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>58<inline-formula><mml:math id="M10" display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> E and 104<inline-formula><mml:math id="M11" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>55<inline-formula><mml:math id="M12" display="inline"><mml:msup><mml:mi/><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>57<inline-formula><mml:math id="M13" display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> E. The
topographical features of the Tuong Duong district are inherently complex, with
mountainous areas, watersheds, and rivers. Severe floods often divide the
district into several isolated areas that are very difficult to
reach for rescue or evacuation purposes.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F1" specific-use="star"><caption><p>Location of the Tuong Duong district (central Vietnam).</p></caption>
            <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f01.png"/>

          </fig>

      <p>The district has two distinct seasons, namely a cold season (from November
to March) and a hot season (from April to October). The yearly rainfall of
the district is within the range of 1679–3259 mm. Rainfall is
concentrated in the rainy period, which contributes roughly
90 % of the total annual rainfall. Due to the district's location as well
as its topographic and climatic features, the study area is highly
susceptible to flood events, which cause heavy human casualties and
economic loss. An examination carried out by Reynaud and Nguyen (2016)
reported that approximately 40 % of families have been affected by floods
and roughly 20 % of families had to be relocated away from the flooded
areas; the average loss from flooding is up to 24 % of the family income
each year.</p>
</sec>
<sec id="Ch1.S3.SS1.SSS2">
  <title>Flood inventory map</title>
      <p>Prediction of flood zones can be based on the assumption that future flood
events are governed by conditions very similar to those of flooded zones in the
past. Therefore, flood inventories and the geoenvironmental conditions
(e.g., topographical and hydrological features) that produced them must be
extensively determined and collected (Tien Bui et al., 2016c; Tehrany et al.,
2015b). The first step of this analysis is to establish a flood inventory map
for the region under investigation. In this study, the flood inventory map
established by Tien Bui et al. (2016c) was used to analyze the relationships
between flood occurrences and influencing factors.<?xmltex \hack{\newpage}?></p>
      <p>The flood inventory map stores documentation of past flood events (see
Fig. 1). It is noted that the floods in this study area are flash
floods. This is the main flood type in this region due to characteristics of
the terrain. The map was constructed by gathering information on the study
area, fieldwork in flooded areas, and analyses of Landsat-8
Operational Land Imager imagery (from 2010 to 2014) with a resolution of 30 m
(retrieved from <uri>http://earthexplorer.usgs.gov</uri>). Furthermore, the
locations of flood events were also verified by fieldwork carried out in 2014
with handheld GPS devices. In summary, the total number of flood locations
recorded during the last 5 years was 76. It is noted that flood
locations were determined by overlaying the flood polygons in the inventory
map and the digital elevation model (DEM). Moreover, only pixels in the map
that are associated with flood points are used to extract the influencing factors
used for flood prediction.</p>
      <p>Although the data for this study were collected only from 2010 to 2014,
recurrent flash floods occurred during tropical typhoons in this period. Thus, it is
reasonable to conclude that all significant flash flood locations in the
study area have been revealed and determined. It should be noted that due to
the statistical assumption used in this study, the inclusion of flood
locations in the distant past (i.e., before the year of 2009) for flood
susceptibility analysis may cause bias. This is because the construction of new
hydropower dams such as Ban Ve (from 2010) and Nam Non (from 2011) and
deforestation or forestation have changed the geoenvironmental conditions in
the study area (Dao, 2017; Manley et al., 2013). In other words, the
geoenvironmental conditions of the distant past are very different from those of
the present time; therefore, flood locations in the distant past should not be
included in the current analysis.</p>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T1" specific-use="star"><caption><p>Flood-influencing factors and their categories.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="3">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="justify" colwidth="256.074803pt"/>
     <oasis:thead>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Factors</oasis:entry>  
         <oasis:entry colname="col2">Coding</oasis:entry>  
         <oasis:entry colname="col3">Description of factor categories</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Slope (<inline-formula><mml:math id="M14" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>)</oasis:entry>  
         <oasis:entry colname="col2">IF<inline-formula><mml:math id="M15" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col3">1 (0–0.5); 2 (0.5–2); 3 (2–5); 4 (5–8); 5 (8–13); 6 (13–20); 7 (20–30); <?xmltex \hack{\hfill\break}?>8 (<inline-formula><mml:math id="M16" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 30)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Elevation (100 m)</oasis:entry>  
         <oasis:entry colname="col2">IF<inline-formula><mml:math id="M17" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col3">1 (<inline-formula><mml:math id="M18" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 1); 2 (1–2); 3 (2–3); 4 (3–4); 5 (4–5); 6 (5–6); 7 (6–7); 8 (7–10); <?xmltex \hack{\hfill\break}?>9 (10–13); 10 (<inline-formula><mml:math id="M19" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 13)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Curvature</oasis:entry>  
         <oasis:entry colname="col2">IF<inline-formula><mml:math id="M20" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">3</mml:mn></mml:msub></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col3">1 (<inline-formula><mml:math id="M21" display="inline"><mml:mrow><mml:mo>&lt;</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula>); 2 (<inline-formula><mml:math id="M22" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula> to <inline-formula><mml:math id="M23" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.05</mml:mn></mml:mrow></mml:math></inline-formula>); 3 (<inline-formula><mml:math id="M24" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.05</mml:mn></mml:mrow></mml:math></inline-formula> to 0.05); 4 (0.05–2); 5 (<inline-formula><mml:math id="M25" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 2)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Topographic wetness index (TWI)</oasis:entry>  
         <oasis:entry colname="col2">IF<inline-formula><mml:math id="M26" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col3">1 (<inline-formula><mml:math id="M27" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 6.5); 2 (6.5–7.5); 3 (7.5–8.5); 4 (8.5–9.5); 5 (9.5–10.5);<?xmltex \hack{\hfill\break}?>6 (10.5–11.5); 7 (11.5–12.5); 8 (<inline-formula><mml:math id="M28" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 12.5)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Stream power index (SPI)</oasis:entry>  
         <oasis:entry colname="col2">IF<inline-formula><mml:math id="M29" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">5</mml:mn></mml:msub></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col3">1 (<inline-formula><mml:math id="M30" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 1); 2 (1–3); 3 (3–5); 4 (5–7); 5 (7–10); 6 (10–15); 7 (15–20); <?xmltex \hack{\hfill\break}?>8 (20–30); 9 (30–50); 10 (<inline-formula><mml:math id="M31" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 50)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Distance to river (m)</oasis:entry>  
         <oasis:entry colname="col2">IF<inline-formula><mml:math id="M32" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">6</mml:mn></mml:msub></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col3">1 (<inline-formula><mml:math id="M33" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 40); 2 (40–80); 3 (80–120); 4 (120–200); 5 (200–400);<?xmltex \hack{\hfill\break}?>6 (400–700); 7 (700–1500); 8 (<inline-formula><mml:math id="M34" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 1500)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Stream density (km km<inline-formula><mml:math id="M35" display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>)</oasis:entry>  
         <oasis:entry colname="col2">IF<inline-formula><mml:math id="M36" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">7</mml:mn></mml:msub></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col3">1 (<inline-formula><mml:math id="M37" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 1); 2 (1–3); 3 (3–5); 4 (5–7); 5 (7–9); 6 (<inline-formula><mml:math id="M38" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 9)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Normalized difference vegetation index (NDVI)</oasis:entry>  
         <oasis:entry colname="col2">IF<inline-formula><mml:math id="M39" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">8</mml:mn></mml:msub></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col3">1 (<inline-formula><mml:math id="M40" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 0.3); 2 (0.3–0.35); 3 (0.35–0.4); 4 (0.4–0.45); 5 (0.45–0.5);<?xmltex \hack{\hfill\break}?>6 (0.5–0.55); 7 (0.55–0.6); 8 (<inline-formula><mml:math id="M41" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 0.6)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Lithology (rock type)</oasis:entry>  
         <oasis:entry colname="col2">IF<inline-formula><mml:math id="M42" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">9</mml:mn></mml:msub></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col3">1 (Q); 2 (Nkb); 3 (Jmh); 4 (T3npb); 5 (T2); 6 (C-bslk); 7 (D-ntdl); <?xmltex \hack{\hfill\break}?>8 (S2-D1hn); 9 (O3-S1sc3); 10 (O3-S1sc2); 11 (O3-S1sc1); 12 (PR2bk)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">Rainfall (1000 mm)</oasis:entry>  
         <oasis:entry colname="col2">IF<inline-formula><mml:math id="M43" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">10</mml:mn></mml:msub></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col3">1 (<inline-formula><mml:math id="M44" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 1.82); 2 (1.82–1.92); 3 (1.92–2.02); 4 (2.02–2.12); 5 (2.12–2.22); 6 (2.22–2.32); 7 (2.32–2.42); 8 (<inline-formula><mml:math id="M45" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 2.42)</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</sec>
<sec id="Ch1.S3.SS1.SSS3">
  <title>Flood-influencing factors</title>
      <p>To construct a flood prediction model, it is crucial to determine the
flood-influencing factors in addition to the flood inventory map (Tehrany et al., 2015a).
The selection of these factors varies with the characteristics of the study
area and the availability of data (Papaioannou et al., 2015). Following the
previous work of Tien Bui et al. (2016c), the physical relationships between
influencing factors and flood processes were analyzed. Accordingly, a total of 10 influencing factors
were selected in this study; they include slope (IF<inline-formula><mml:math id="M46" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:math></inline-formula>), elevation
(IF<inline-formula><mml:math id="M47" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:math></inline-formula>), curvature (IF<inline-formula><mml:math id="M48" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">3</mml:mn></mml:msub></mml:math></inline-formula>), topographic wetness index (TWI; IF<inline-formula><mml:math id="M49" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:math></inline-formula>),
stream power index (SPI; IF<inline-formula><mml:math id="M50" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">5</mml:mn></mml:msub></mml:math></inline-formula>), distance to river (IF<inline-formula><mml:math id="M51" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">6</mml:mn></mml:msub></mml:math></inline-formula>), stream
density (IF<inline-formula><mml:math id="M52" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">7</mml:mn></mml:msub></mml:math></inline-formula>), normalized difference vegetation index (NDVI; IF<inline-formula><mml:math id="M53" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">8</mml:mn></mml:msub></mml:math></inline-formula>),
lithology (IF<inline-formula><mml:math id="M54" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">9</mml:mn></mml:msub></mml:math></inline-formula>), and rainfall (IF<inline-formula><mml:math id="M55" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">10</mml:mn></mml:msub></mml:math></inline-formula>). These factors are used to
analyze the flood vulnerability of the study area, and a GIS database
consisting of the flood inventory map and the chosen factors has been
established. The 10 influencing factors employed in this study are
summarized in Table 1, and their distributions within the study area are illustrated in Fig. 2.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F2" specific-use="star"><caption><p> </p></caption>
            <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f02-part01.png"/>

          </fig>

<?xmltex \hack{\addtocounter{figure}{-1}}?><?xmltex \floatpos{t}?><fig id="Ch1.F3" specific-use="star"><caption><p>Flood-influencing factors: <bold>(a)</bold> slope,
<bold>(b)</bold> elevation, <bold>(c)</bold> curvature, <bold>(d)</bold> topographic
wetness index, <bold>(e)</bold> stream power index, <bold>(f)</bold> distance to
river, <bold>(g)</bold> stream density, <bold>(h)</bold> normalized difference
vegetation index, <bold>(i)</bold> lithology, and <bold>(j)</bold> rainfall.</p></caption>
            <?xmltex \igopts{width=426.791339pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f02-part02.png"/>

          </fig>

</sec>
</sec>
<sec id="Ch1.S3.SS2">
  <title>Bayesian framework for flood classification</title>
      <p>Flood prediction in this study is treated as a pattern classification
problem in which “flood” and “nonflood” are the two class labels of
interest. The posterior probability that a pixel belongs to the flood class,
as derived from the model, is used as its susceptibility index. The
susceptibility indices of all pixels are then used to generate the flood
susceptibility map. To cope with the complexity and uncertainty of the
problem, a Bayesian framework is employed to evaluate the flood
susceptibility of each data sample. Figure 3 illustrates the general concept
of the Bayesian framework used for classification.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F4"><caption><p>General concept of the Bayesian framework for flood classification.</p></caption>
          <?xmltex \igopts{width=236.157874pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f03.pdf"/>

        </fig>

      <p>The Bayesian framework provides a flexible approach to probabilistic modeling
and is well suited to dealing with uncertainty and noisy
data (Theodoridis, 2015; Cheng and Hoang, 2016). Nevertheless, previous
studies have rarely examined the capability of this approach for inferring
flood susceptibility. Pattern classification aims at assigning a
pattern to one of <inline-formula><mml:math id="M56" display="inline"><mml:mrow><mml:mi>M</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula> distinctive class labels <inline-formula><mml:math id="M57" display="inline"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, in which <inline-formula><mml:math id="M58" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> is
either 1 or 2. <inline-formula><mml:math id="M59" display="inline"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M60" display="inline"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula> denote the flood class and the
nonflood class, respectively. To recognize an input pattern based on the
information supplied by its feature vector <inline-formula><mml:math id="M61" display="inline"><mml:mi mathvariant="bold-italic">X</mml:mi></mml:math></inline-formula>, we need to obtain the
posterior probability <inline-formula><mml:math id="M62" display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, which indicates the probability that the
feature vector <inline-formula><mml:math id="M63" display="inline"><mml:mi mathvariant="bold-italic">X</mml:mi></mml:math></inline-formula> belongs to class <inline-formula><mml:math id="M64" display="inline"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. Based on this
information, the pattern is assigned to the class with the highest
posterior probability. The posterior probability <inline-formula><mml:math id="M65" display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is calculated
as follows (Webb and Copsey, 2011):
            <disp-formula id="Ch1.E1" content-type="numbered"><mml:math id="M66" display="block"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>×</mml:mo><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M67" display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes the posterior probability. The term <inline-formula><mml:math id="M68" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>
represents the likelihood, which is also called the class-conditional
probability density function (PDF). <inline-formula><mml:math id="M69" display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes the prior
probability, which implies the probability of the class before any feature is
measured. The denominator <inline-formula><mml:math id="M70" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the evidence factor; this quantity is
merely a normalizing factor that guarantees that the posterior probabilities
sum to 1; it can be calculated as follows:
            <disp-formula id="Ch1.E2" content-type="numbered"><mml:math id="M71" display="block"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:munderover><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>×</mml:mo><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p>Generally, the prior probabilities <inline-formula><mml:math id="M72" display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> can be estimated as
the proportion of training instances in each class. Thus, the main task in establishing
a Bayesian classification model is the calculation of the likelihood <inline-formula><mml:math id="M73" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.
This likelihood expresses the density of input patterns in the learning space
within a certain group of data. In most situations, <inline-formula><mml:math id="M74" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is
unknown and must be estimated from the available data. In this research, the
Gaussian mixture model is utilized for computing the class-conditional
probability density function <inline-formula><mml:math id="M75" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
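      <p>As a minimal illustration of Eqs. (1) and (2), the following Python sketch estimates the priors from class proportions and converts class-conditional densities into posterior probabilities. The labels and density values are hypothetical and are not taken from the study data, and the function names are illustrative rather than part of BayGmmKda.</p>
<preformat><![CDATA[
```python
def class_priors(labels):
    """Estimate P(C_k) as the proportion of training instances in each class."""
    n = len(labels)
    return {c: labels.count(c) / n for c in sorted(set(labels))}


def posteriors(likelihoods, priors):
    """Apply Eq. (1); the evidence p(X) is the sum given in Eq. (2)."""
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}


# Hypothetical training labels: 1 = flood, 0 = nonflood.
labels = [1, 1, 0, 0, 0, 1, 0, 0]
priors = class_priors(labels)

# Hypothetical class-conditional densities p(X | C_k) for a single pixel.
likelihoods = {1: 0.30, 0: 0.05}
post = posteriors(likelihoods, priors)

# The posterior of the flood class serves as the susceptibility index.
susceptibility = post[1]
```
]]></preformat>
      <p>In this sketch the two posteriors sum to 1 by construction, because the evidence is computed from the same likelihoods and priors.</p>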
</sec>
<sec id="Ch1.S3.SS3">
  <title>Gaussian mixture model for density estimation</title>
<sec id="Ch1.S3.SS3.SSS1">
  <title>Gaussian mixture model</title>
      <p>The posterior probability (Eq. 1) for each pixel of
the study area is used as the flood susceptibility index. To obtain the posterior
probability, the class-conditional PDF must be
estimated. This section presents how the PDF is estimated by a Gaussian mixture model (GMM). A GMM is selected in this research because it has been shown to be
an effective parametric method for modeling data distributions, especially
in high-dimensional space (McLachlan and Peel, 2000; Theodoridis and
Koutroumbas, 2009). Previous studies (Paalanen, 2004; Figueiredo and Jain,
2002; Gómez-Losada et al., 2014; Arellano and Dahyot, 2016) point out
that any continuous distribution can be approximated arbitrarily well by a
finite mixture of Gaussian distributions. Owing to their usefulness as a
flexible modeling tool, GMMs have received increasing attention from the
academic community (Zhang et al., 2016; Khanmohammadi and Chou, 2016; Ju and
Liu, 2012).</p>
      <p>In a <inline-formula><mml:math id="M76" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula>-dimensional space the Gaussian PDF is defined mathematically in the
following form:

                  <disp-formula specific-use="align" content-type="numbered"><mml:math id="M77" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E3"><mml:mtd/><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi>N</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mtr><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mo>(</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">π</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mi>d</mml:mi><mml:mo>/</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="normal">Σ</mml:mi><mml:msup><mml:mi mathvariant="normal">|</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mi>exp⁡</mml:mi><mml:mfenced open="{" close="}"><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>-</mml:mo><mml:mi mathvariant="italic">μ</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:mi mathvariant="normal">Σ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>-</mml:mo><mml:mi mathvariant="italic">μ</mml:mi><mml:mo>)</mml:mo></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>

              where <inline-formula><mml:math id="M78" display="inline"><mml:mi mathvariant="italic">μ</mml:mi></mml:math></inline-formula> denotes the mean vector, <inline-formula><mml:math id="M79" display="inline"><mml:mi mathvariant="normal">Σ</mml:mi></mml:math></inline-formula> represents the
covariance matrix, and <inline-formula><mml:math id="M80" display="inline"><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>=</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mi mathvariant="italic">μ</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">Σ</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> denotes the set of
distribution parameters.</p>
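      <p>The Gaussian PDF of Eq. (3) can be evaluated directly. The sketch below is restricted to the case <italic>d</italic> = 2 so that the determinant and the inverse of the covariance matrix can be written out by hand using only the Python standard library; this restriction and the function name are assumptions made for illustration only.</p>
<preformat><![CDATA[
```python
import math


def gaussian_pdf_2d(x, mu, sigma):
    """Evaluate Eq. (3) for d = 2 with a full 2x2 covariance matrix sigma."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance determinant must be positive")
    # Inverse of a 2x2 matrix written out explicitly.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mu[0], x[1] - mu[1])
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu).
    quad = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    # Normalizing constant (2 * pi)^(d / 2) * |Sigma|^(1 / 2) with d = 2.
    norm = (2 * math.pi) ** (2 / 2) * math.sqrt(det)
    return math.exp(-0.5 * quad) / norm


# At the mean, with identity covariance, the density equals 1 / (2 * pi).
peak = gaussian_pdf_2d((0.0, 0.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0)))
```
]]></preformat>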
      <p>A GMM is, in essence, an aggregation of several multivariate normal
distributions; hence, its PDF for each data sample is computed as a weighted
summation of Gaussian distributions (see Fig. 4):
              <disp-formula id="Ch1.E4" content-type="numbered"><mml:math id="M81" display="block"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="normal">Θ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mi>N</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M82" display="inline"><mml:mrow><mml:mi mathvariant="normal">Θ</mml:mi><mml:mo>=</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>. The weights <inline-formula><mml:math id="M83" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> are the
mixing coefficients of the <inline-formula><mml:math id="M84" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> Gaussian components and satisfy <inline-formula><mml:math id="M85" display="inline"><mml:mrow><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula>.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F5"><caption><p>Structure of a Gaussian mixture model.</p></caption>
            <?xmltex \igopts{width=236.157874pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f04.pdf"/>

          </fig>

      <p>Accordingly, the PDF for all data samples can be expressed as follows (Ju and
Liu, 2012):
              <disp-formula id="Ch1.E5" content-type="numbered"><mml:math id="M86" display="block"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="normal">Θ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∏</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="normal">Θ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>L</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="normal">Θ</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p>Identifying a GMM's parameters <inline-formula><mml:math id="M87" display="inline"><mml:mi mathvariant="normal">Θ</mml:mi></mml:math></inline-formula> can be considered an unsupervised
learning task in which a dataset of independently distributed data points
<inline-formula><mml:math id="M88" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>=</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> is assumed to be generated from the underlying distribution described by
the PDF <inline-formula><mml:math id="M89" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="normal">Θ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The goal is to find the value
of <inline-formula><mml:math id="M90" display="inline"><mml:mi mathvariant="normal">Θ</mml:mi></mml:math></inline-formula>, denoted as <inline-formula><mml:math id="M91" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="normal">Θ</mml:mi><mml:mi>e</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, that maximizes the log-likelihood
function:

                  <disp-formula specific-use="align" content-type="numbered"><mml:math id="M92" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E6"><mml:mtd/><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msub><mml:mi mathvariant="normal">Θ</mml:mi><mml:mi>e</mml:mi></mml:msub></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mo>=</mml:mo><mml:mi>arg⁡</mml:mi><mml:munder><mml:mo movablelimits="false">max⁡</mml:mo><mml:mi mathvariant="normal">Θ</mml:mi></mml:munder><mml:mi>log⁡</mml:mi><mml:mfenced close=")" open="("><mml:mi>L</mml:mi><mml:mfenced close=")" open="("><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">Θ</mml:mi></mml:mfenced></mml:mfenced><mml:mo>=</mml:mo><mml:mi>log⁡</mml:mi><mml:mfenced open="(" close=")"><mml:munderover><mml:mo movablelimits="false">∏</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mi>p</mml:mi><mml:mfenced close=")" open="("><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="normal">Θ</mml:mi></mml:mfenced></mml:mfenced></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mtr><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mi>log⁡</mml:mi><mml:mfenced close=")" open="("><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn 
mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mfenced open="(" close=")"><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mfenced></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
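      <p>To make the mixture density of Eq. (4) and the log-likelihood of Eq. (6) concrete, the following Python sketch evaluates both for a univariate mixture. The one-dimensional setting, the parameter values, and the sample data are hypothetical choices made for brevity and do not come from the study.</p>
<preformat><![CDATA[
```python
import math


def normal_pdf(x, mu, var):
    """Univariate Gaussian density (Eq. 3 with d = 1)."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gmm_pdf(x, alphas, mus, variances):
    """Weighted sum of Gaussian components (Eq. 4)."""
    return sum(a * normal_pdf(x, m, v) for a, m, v in zip(alphas, mus, variances))


def log_likelihood(samples, alphas, mus, variances):
    """Sum over samples of the log mixture density (second line of Eq. 6)."""
    return sum(math.log(gmm_pdf(x, alphas, mus, variances)) for x in samples)


# Hypothetical two-component mixture and data.
alphas, mus, variances = [0.4, 0.6], [0.0, 5.0], [1.0, 2.0]
samples = [-0.3, 0.8, 4.1, 5.5, 6.2]

# Parameters that match the data give a higher log-likelihood than
# parameters whose means lie far from every sample.
ll_good = log_likelihood(samples, alphas, mus, variances)
ll_bad = log_likelihood(samples, alphas, [10.0, 20.0], variances)
```
]]></preformat>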
      <p>In practice, instead of maximizing the log-likelihood function directly, an
equivalent objective function <inline-formula><mml:math id="M93" display="inline"><mml:mi>Q</mml:mi></mml:math></inline-formula> is optimized (Ju and Liu, 2012):
              <disp-formula id="Ch1.E7" content-type="numbered"><mml:math id="M94" display="block"><mml:mrow><mml:mtext>Max.</mml:mtext><mml:mi>Q</mml:mi><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mi>log⁡</mml:mi><mml:mo>[</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>]</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M95" display="inline"><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is the a posteriori probability of the <inline-formula><mml:math id="M96" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th class (mixture component), <inline-formula><mml:math id="M97" display="inline"><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:math></inline-formula>, and
<inline-formula><mml:math id="M98" display="inline"><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> satisfies the following conditions:
              <disp-formula id="Ch1.E8" content-type="numbered"><mml:math id="M99" display="block"><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>s</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:msub><mml:mi>p</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mo>;</mml:mo><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mspace width="0.25em" linebreak="nobreak"/><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
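      <p>The weights of Eq. (8) can be computed for a single data sample as sketched below. A univariate mixture with hypothetical parameters is assumed for brevity, and the function names are illustrative only.</p>
<preformat><![CDATA[
```python
import math


def normal_pdf(x, mu, var):
    """Univariate Gaussian density (Eq. 3 with d = 1)."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def responsibilities(x, alphas, mus, variances):
    """Return the weights w_it of all components i for one sample x_t (Eq. 8)."""
    weighted = [a * normal_pdf(x, m, v) for a, m, v in zip(alphas, mus, variances)]
    total = sum(weighted)  # denominator of Eq. (8)
    return [w / total for w in weighted]


# Hypothetical mixture: the sample 4.1 lies much closer to the second component.
w = responsibilities(4.1, [0.4, 0.6], [0.0, 5.0], [1.0, 2.0])
```
]]></preformat>
      <p>Because each weight is divided by the same total, the weights of one sample sum to 1, which is the second condition in Eq. (8).</p>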
      <p>To compute <inline-formula><mml:math id="M100" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="normal">Θ</mml:mi><mml:mi>e</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> in Eq. (6), the expectation maximization (EM)
algorithm is employed. In addition, an unsupervised learning approach
proposed by Figueiredo and Jain (2002) is used for determining <inline-formula><mml:math id="M101" display="inline"><mml:mi mathvariant="normal">Θ</mml:mi></mml:math></inline-formula>.
These two algorithms are briefly reviewed in the next section of the paper.</p>
</sec>
<sec id="Ch1.S3.SS3.SSS2">
  <title>Learning of the finite-mixture model with the expectation
maximization algorithm</title>
      <p>The expectation maximization (EM) method is a statistical approach for fitting a
GMM to historical data; this method converges to a (local) maximum likelihood
estimate of the model parameters (McLachlan and Krishnan, 2008). It can be
summarized as follows (McLachlan and Peel, 2000). Starting from an
initial parameter set <inline-formula><mml:math id="M102" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="normal">Θ</mml:mi><mml:mi>o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, an iteration of the EM algorithm consists of
the E step, in which the current conditional probabilities <inline-formula><mml:math id="M103" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>N</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mi mathvariant="normal">|</mml:mi><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="normal">Σ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> that <inline-formula><mml:math id="M104" display="inline"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>
is generated from the <inline-formula><mml:math id="M105" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th mixture component are calculated, and the
M step, in which the maximum likelihood estimates of <inline-formula><mml:math id="M106" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>
are updated. The EM iteration terminates when the change in the value
of the objective function falls below a threshold value.</p>
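      <p>The iteration described above can be sketched in Python for a univariate mixture. This is an illustrative sketch under stated assumptions and not the BayGmmKda implementation: the data and initial parameters are hypothetical, the variance update is the standard responsibility-weighted form, the small variance floor is an added safeguard, and the change in the log-likelihood is used as the stopping criterion.</p>
<preformat><![CDATA[
```python
import math


def normal_pdf(x, mu, var):
    """Univariate Gaussian density (Eq. 3 with d = 1)."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(samples, alphas, mus, variances, tol=1e-8, max_iter=500):
    """Fit a univariate GMM by EM, starting from the given initial parameters."""
    alphas, mus, variances = list(alphas), list(mus), list(variances)
    n, k = len(samples), len(alphas)
    prev = -math.inf
    loglik = prev
    for _ in range(max_iter):
        # E step: weights w[t][i] of component i for sample t (Eq. 8).
        w, loglik = [], 0.0
        for x in samples:
            joint = [alphas[i] * normal_pdf(x, mus[i], variances[i]) for i in range(k)]
            total = sum(joint)
            loglik += math.log(total)
            w.append([j / total for j in joint])
        # Stop when the log-likelihood changes by less than the threshold.
        if abs(loglik - prev) < tol:
            break
        prev = loglik
        # M step: re-estimate mixing weights, means, and variances.
        for i in range(k):
            mass = sum(w[t][i] for t in range(n))
            alphas[i] = mass / n
            mus[i] = sum(w[t][i] * samples[t] for t in range(n)) / mass
            var = sum(w[t][i] * (samples[t] - mus[i]) ** 2 for t in range(n)) / mass
            variances[i] = max(var, 1e-6)  # floor is an assumption, not from the paper
    return alphas, mus, variances, loglik


# Hypothetical data drawn around two well-separated centers.
samples = [-0.5, 0.1, 0.4, -0.2, 0.3, 4.8, 5.3, 5.1, 4.6, 5.4]
fit_alphas, fit_mus, fit_vars, fit_ll = em_gmm_1d(
    samples, [0.5, 0.5], [1.0, 4.0], [1.0, 1.0]
)
```
]]></preformat>
      <p>Like any EM run, the result of this sketch depends on the initial parameters, so different starting values may lead to a different local maximum.</p>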
      <p>The two steps of the EM procedure are stated as follows: (i) E step:
estimating the expected class memberships <inline-formula><mml:math id="M107" display="inline"><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> of all data samples for each
class based on Eq. (8); and (ii) M step: calculating the
maximum likelihood parameter estimates given the data's class membership distribution using the
following equations:

                  <disp-formula specific-use="align" content-type="numbered"><mml:math id="M108" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E9"><mml:mtd/><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msubsup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi><mml:mtext>new</mml:mtext></mml:msubsup><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>n</mml:mi></mml:mfrac></mml:mstyle><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E10"><mml:mtd/><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msubsup><mml:mi mathvariant="italic">μ</mml:mi><mml:mi>i</mml:mi><mml:mtext>new</mml:mtext></mml:msubsup><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>/</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr 
id="Ch1.E11"><mml:mtd/><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msubsup><mml:mi mathvariant="normal">Σ</mml:mi><mml:mi>i</mml:mi><mml:mtext>new</mml:mtext></mml:msubsup><mml:mo>=</mml:mo><mml:mfenced open="(" close=")"><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mfenced close=")" open="("><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msubsup><mml:mi mathvariant="italic">μ</mml:mi><mml:mi>i</mml:mi><mml:mtext>new</mml:mtext></mml:msubsup></mml:mfenced><mml:msup><mml:mfenced open="(" close=")"><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msubsup><mml:mi mathvariant="italic">μ</mml:mi><mml:mi>i</mml:mi><mml:mtext>new</mml:mtext></mml:msubsup></mml:mfenced><mml:mi>T</mml:mi></mml:msup></mml:mfenced><mml:mo>/</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
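      <p>One EM iteration can be sketched in Python with NumPy as follows. This is a minimal illustration rather than the BayGmmKda implementation: the function names are hypothetical, and the E step assumes that Eq. (8) takes the standard responsibility form, i.e., the weighted component density normalized over all components.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def gaussian_pdf(X, mean, cov):
    """Multivariate normal density N(x | mean, cov) for each row of X."""
    d = X.shape[1]
    diff = X - mean
    solved = np.linalg.solve(cov, diff.T).T            # cov^{-1} (x - mean)
    mahalanobis = np.sum(diff * solved, axis=1)
    log_norm = 0.5 * (d * np.log(2.0 * np.pi) + np.linalg.slogdet(cov)[1])
    return np.exp(-0.5 * mahalanobis - log_norm)


def e_step(X, alphas, means, covs):
    """Responsibilities w[i, t] of component i for sample t (assumed form of Eq. 8)."""
    weighted = np.array([a * gaussian_pdf(X, m, c)
                         for a, m, c in zip(alphas, means, covs)])
    return weighted / weighted.sum(axis=0, keepdims=True)


def m_step(X, w):
    """Closed-form updates of Eqs. (9)-(11); w has shape (components, samples)."""
    n = X.shape[0]
    totals = w.sum(axis=1)                             # sum over t of w_it
    alphas = totals / n                                # Eq. (9)
    means = (w @ X) / totals[:, None]                  # Eq. (10)
    covs = []
    for i in range(w.shape[0]):
        diff = X - means[i]
        covs.append((w[i][:, None] * diff).T @ diff / totals[i])   # Eq. (11)
    return alphas, means, np.array(covs)
```
]]></preformat>
      <p>In this sketch, alternating <monospace>e_step</monospace> and <monospace>m_step</monospace> until the change in the log-likelihood falls below a chosen threshold would reproduce the iteration described above.</p>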
</sec>
<sec id="Ch1.S3.SS3.SSS3">
  <title>Unsupervised learning of finite-mixture model</title>
      <p>The EM algorithm increases the log-likelihood iteratively until convergence
is detected, and this approach can generally derive a good set of estimated
parameters. Nonetheless, EM suffers from slow convergence on some datasets, high sensitivity to initialization conditions, and suboptimal estimated
solutions (Biernacki et al., 2003). Moreover, additional effort is required
to determine an appropriate number of Gaussian distributions within the
mixture.</p>
      <p>As an attempt to alleviate such drawbacks of EM, Figueiredo and Jain (2002)
put forward an unsupervised algorithm for learning a GMM from multivariate
data. The algorithm features the capability of identifying a suitable number
of Gaussian components autonomously, and through experiments the authors show
that the algorithm is not sensitive to initialization. In other words, this
unsupervised approach incorporates the tasks of model estimation and model
selection in a unified algorithm. Generally, this method can start with a
large number of components. The initial component means can be
set to data points in the training set; in the extreme case, the
number of components can equal the number of data points.
The algorithm then gradually reduces the number of mixture components by
discarding normal-distribution components that are irrelevant to the data
modeling process (Paalanen, 2004).</p>
      <p>Furthermore, Figueiredo and Jain (2002) employed the minimum message length
(MML) criterion (Wallace and Dowe, 1999) as an index for model selection; the
application of this criterion for the case of GMM learning leads to the
following objective function (Figueiredo and Jain, 2002):

                  <disp-formula specific-use="align" content-type="numbered"><mml:math id="M109" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E12"><mml:mtd/><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi mathvariant="normal">Ω</mml:mi><mml:mfenced open="(" close=")"><mml:mi mathvariant="normal">Θ</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="bold-italic">X</mml:mi></mml:mfenced><mml:mo>=</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mtr><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi>N</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>:</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&gt;</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:munder><mml:mi>ln⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mfrac><mml:mrow><mml:mi>n</mml:mi><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mn mathvariant="normal">12</mml:mn></mml:mfrac></mml:mfenced><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mtext>nz</mml:mtext></mml:msub></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac><mml:mi>ln⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi>n</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:mfrac></mml:mstyle></mml:mfenced><mml:mo>+</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mtext>nz</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:mo>-</mml:mo><mml:mi>ln⁡</mml:mi><mml:mi>L</mml:mi><mml:mfenced open="(" close=")"><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">Θ</mml:mi></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>

              where <inline-formula><mml:math id="M110" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> denotes the size of the training set, <inline-formula><mml:math id="M111" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> represents the number of
parameters needed to specify a Gaussian distribution, and <inline-formula><mml:math id="M112" display="inline"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mtext>nz</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>
is the number of Gaussian components with nonzero weight
(<inline-formula><mml:math id="M113" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&gt;</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Accordingly, the EM method is then used to minimize
Eq. (12) for a fixed value of <inline-formula><mml:math id="M114" display="inline"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mtext>nz</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>.</p>
      <p>In detail, the EM algorithm is employed to estimate <inline-formula><mml:math id="M115" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> as
follows:
              <disp-formula id="Ch1.E13" content-type="numbered"><mml:math id="M116" display="block"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi><mml:mtext>new</mml:mtext></mml:msubsup><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>max⁡</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mfenced close=")" open="("><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mfenced><mml:mo>-</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mi>N</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:mo mathvariant="italic">}</mml:mo></mml:mrow><mml:mrow><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:mo>max⁡</mml:mo><mml:mfenced close="}" open="{"><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mfenced close=")" open="("><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mfenced><mml:mo>-</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mi>N</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle></mml:mfenced></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p>Accordingly, the parameters <inline-formula><mml:math id="M117" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">μ</mml:mi><mml:mi>i</mml:mi><mml:mtext>new</mml:mtext></mml:msubsup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M118" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="normal">Σ</mml:mi><mml:mi>i</mml:mi><mml:mtext>new</mml:mtext></mml:msubsup></mml:mrow></mml:math></inline-formula> are
updated based on Eqs. (10) and (11), respectively. The algorithm stops when the
relative decrease in the objective function <inline-formula><mml:math id="M119" display="inline"><mml:mrow><mml:mi mathvariant="normal">Ω</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="normal">Θ</mml:mi><mml:mi mathvariant="normal">|</mml:mi><mml:mi mathvariant="bold-italic">X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> becomes
smaller than a preset threshold (e.g., 10<inline-formula><mml:math id="M120" display="inline"><mml:mrow><mml:msup><mml:mi/><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
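      <p>The weight update of Eq. (13) can be illustrated with the short Python sketch below. The function names are hypothetical, and the helper that counts the parameters of one component assumes a full covariance matrix, which is one common reading of N and is not confirmed by the text. A component whose total responsibility does not exceed N/2 receives zero weight and is thereby removed from the mixture.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def gaussian_param_count(d):
    """Free parameters of one d-dimensional Gaussian (assumption: full covariance)."""
    return d + d * (d + 1) // 2


def mml_weight_update(w, n_params):
    """Eq. (13): penalized mixing weights; w has shape (components, samples)."""
    support = np.maximum(0.0, w.sum(axis=1) - n_params / 2.0)
    total = support.sum()
    if total == 0.0:
        # Every component fell below the N/2 penalty; the update is undefined.
        raise ValueError("all components were annihilated")
    return support / total
```
]]></preformat>
      <p>For example, with N = 5 and total responsibilities of 6, 3, and 1 for three components, the penalized supports are 3.5, 0.5, and 0, so the third component is discarded and the remaining weights become 0.875 and 0.125.</p>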

      <?xmltex \floatpos{t}?><fig id="Ch1.F6" specific-use="star"><caption><p>The established GIS database.</p></caption>
            <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f05.pdf"/>

          </fig>

</sec>
</sec>
<sec id="Ch1.S3.SS4">
  <title>Radial-basis-function Fisher discriminant analysis for
generation of latent variables</title>
      <p>In machine learning, the performance of a model may be enhanced if latent
variables are used (Yu, 2011). Therefore, a latent variable approach is
employed in this research. Accordingly, radial-basis-function Fisher
discriminant analysis (RBFDA), proposed by Mika et al. (1999) as an extension of
Fisher discriminant analysis for dealing with data nonlinearity, is used
to generate a latent factor for flood analysis. Thus, RBFDA is utilized to
project the features from the original learning space to a projected space
that expresses a high degree of class separability (Theodoridis and
Koutroumbas, 2009). Using this kernel technique, the data from an input space
<inline-formula><mml:math id="M121" display="inline"><mml:mi mathvariant="bold">I</mml:mi></mml:math></inline-formula> are first mapped into a high-dimensional feature space <inline-formula><mml:math id="M122" display="inline"><mml:mi>F</mml:mi></mml:math></inline-formula>. Hence,
discriminant analysis tasks can be performed nonlinearly in <inline-formula><mml:math id="M123" display="inline"><mml:mi mathvariant="bold">I</mml:mi></mml:math></inline-formula>.</p>
      <p>Herein, <inline-formula><mml:math id="M124" display="inline"><mml:mrow><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>(</mml:mo><mml:mo>.</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is defined as a transformation from an input space <inline-formula><mml:math id="M125" display="inline"><mml:mi mathvariant="bold">I</mml:mi></mml:math></inline-formula>
to a high-dimensional feature space <inline-formula><mml:math id="M126" display="inline"><mml:mi>F</mml:mi></mml:math></inline-formula>; to compute <inline-formula><mml:math id="M127" display="inline"><mml:mi mathvariant="bold-italic">w</mml:mi></mml:math></inline-formula> (the projecting
vector), it is necessary to maximize the Fisher discriminant ratio as
follows:

                <disp-formula specific-use="align" content-type="numbered"><mml:math id="M128" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E14"><mml:mtd/><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi>J</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msubsup><mml:mi>S</mml:mi><mml:mi>B</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">w</mml:mi></mml:mrow><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msubsup><mml:mi>S</mml:mi><mml:mi>W</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">w</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E15"><mml:mtd/><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtext>where</mml:mtext></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msubsup><mml:mi>S</mml:mi><mml:mi>B</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mfenced close=")" open="("><mml:msubsup><mml:mi>m</mml:mi><mml:mn mathvariant="normal">1</mml:mn><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup><mml:mo>-</mml:mo><mml:msubsup><mml:mi>m</mml:mi><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup></mml:mfenced><mml:msup><mml:mfenced open="(" close=")"><mml:msubsup><mml:mi>m</mml:mi><mml:mn mathvariant="normal">1</mml:mn><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup><mml:mo>-</mml:mo><mml:msubsup><mml:mi>m</mml:mi><mml:mn 
mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup></mml:mfenced><mml:mi>T</mml:mi></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E16"><mml:mtd/><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msubsup><mml:mi>S</mml:mi><mml:mi>W</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>C</mml:mi></mml:munderover><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:munderover><mml:mfenced open="(" close=")"><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:msubsup><mml:mi>m</mml:mi><mml:mi>k</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup></mml:mfenced><mml:msup><mml:mfenced open="(" close=")"><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:msubsup><mml:mi>m</mml:mi><mml:mi>k</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup></mml:mfenced><mml:mi>T</mml:mi></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E17"><mml:mtd/><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msubsup><mml:mi>m</mml:mi><mml:mi>k</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn 
mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>N</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:munderover><mml:mi mathvariant="italic">φ</mml:mi><mml:mfenced close=")" open="("><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>k</mml:mi></mml:msubsup></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
      <p>To obtain <inline-formula><mml:math id="M129" display="inline"><mml:mi mathvariant="bold-italic">w</mml:mi></mml:math></inline-formula>, the kernel trick is applied. Thus, one needs only a
formulation of the algorithm that involves the training data solely through dot products <inline-formula><mml:math id="M130" display="inline"><mml:mrow><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo><mml:mo>⋅</mml:mo><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>(</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, together with a kernel function that
evaluates <inline-formula><mml:math id="M131" display="inline"><mml:mrow><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo><mml:mo>⋅</mml:mo><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>(</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> without computing the mapping explicitly. The widely employed radial-basis kernel
function (RBKF) is expressed in the following formula (with <inline-formula><mml:math id="M132" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula>
denoting the kernel function bandwidth):

                <disp-formula id="Ch1.E18" content-type="numbered"><mml:math id="M133" display="block"><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi>K</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>exp⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msup><mml:mfenced open="∥" close="∥"><mml:mi>x</mml:mi><mml:mo>-</mml:mo><mml:mi>y</mml:mi></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:msup><mml:mi mathvariant="italic">σ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
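      <p>As an illustration of Eq. (18), the kernel values for two sets of samples can be computed as in the following Python sketch. The function name and argument layout (one sample per row) are assumptions made for this example only.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """K[i, j] = exp(-||A[i] - B[j]||^2 / (2 sigma^2)), following Eq. (18)."""
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    sq_dists = np.maximum(sq_dists, 0.0)   # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma**2))
```
]]></preformat>
      <p>A small bandwidth makes the kernel value decay quickly with distance, whereas a large bandwidth makes distant samples appear similar.</p>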
      <p>Since any solution <inline-formula><mml:math id="M134" display="inline"><mml:mi mathvariant="bold-italic">w</mml:mi></mml:math></inline-formula> lies in the span of all mapped data samples in
the feature space, the transformation vector <inline-formula><mml:math id="M135" display="inline"><mml:mi mathvariant="bold-italic">w</mml:mi></mml:math></inline-formula> can be expanded as in the following
formula:

                <disp-formula id="Ch1.E19" content-type="numbered"><mml:math id="M136" display="block"><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>

          <?xmltex \hack{\newpage}?></p>
      <p><?xmltex \hack{\noindent}?>From Eqs. (17) and (19), we have the following:

                <disp-formula specific-use="align" content-type="numbered"><mml:math id="M137" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E20"><mml:mtd/><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msup><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msubsup><mml:mi>m</mml:mi><mml:mi>k</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>N</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mi>k</mml:mi><mml:mfenced open="(" close=")"><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>k</mml:mi></mml:msubsup></mml:mfenced><mml:mo>=</mml:mo><mml:msup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msub><mml:mi>M</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mtr><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msub><mml:mi>M</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn 
mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>N</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:munderover><mml:mi>k</mml:mi><mml:mfenced open="(" close=")"><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mi>k</mml:mi></mml:msubsup></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
      <p>Taking into account the definitions of <inline-formula><mml:math id="M138" display="inline"><mml:mrow><mml:mi>J</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M139" display="inline"><mml:mrow><mml:msubsup><mml:mi>S</mml:mi><mml:mi>B</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>, as well as
Eq. (20), we can restate the numerator of Eq. (14) in the following manner:

                <disp-formula specific-use="align" content-type="numbered"><mml:math id="M140" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E21"><mml:mtd/><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msup><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msubsup><mml:mi>S</mml:mi><mml:mi>B</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mi>M</mml:mi><mml:mi mathvariant="italic">α</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mtr><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtext>where</mml:mtext><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mi>M</mml:mi><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>)</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
      <p>Based on Eq. (17), which defines <inline-formula><mml:math id="M141" display="inline"><mml:mrow><mml:msubsup><mml:mi>m</mml:mi><mml:mi>k</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>, the denominator of
Eq. (14) can be expressed in the following way:

                <disp-formula id="Ch1.E22" content-type="numbered"><mml:math id="M142" display="block"><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msup><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msubsup><mml:mi>S</mml:mi><mml:mi>W</mml:mi><mml:mi mathvariant="italic">φ</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mi>N</mml:mi><mml:mi mathvariant="italic">α</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

          where <inline-formula><mml:math id="M143" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:munderover><mml:msub><mml:mi mathvariant="bold">K</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold">I</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:msub><mml:mi>l</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:msubsup><mml:mi mathvariant="bold">K</mml:mi><mml:mi>k</mml:mi><mml:mi>T</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>,
<inline-formula><mml:math id="M144" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">K</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denotes an <inline-formula><mml:math id="M145" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> kernel matrix whose typical element is
<inline-formula><mml:math id="M146" display="inline"><mml:mrow><mml:mi>k</mml:mi><mml:mfenced open="(" close=")"><mml:msub><mml:mi>x</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mi>m</mml:mi><mml:mi>k</mml:mi></mml:msubsup></mml:mfenced></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M147" display="inline"><mml:mi mathvariant="bold">I</mml:mi></mml:math></inline-formula> represents the identity matrix, and <inline-formula><mml:math id="M148" display="inline"><mml:mrow><mml:msub><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:msub><mml:mi>l</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is
a matrix in which all entries are <inline-formula><mml:math id="M149" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:msub><mml:mi>l</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>.</p>
      <p>Considering Eqs. (14), (21), and (22), the solution of RBFDA can
be found by maximizing the following:

                <disp-formula id="Ch1.E23" content-type="numbered"><mml:math id="M150" display="block"><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi>J</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">α</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mfenced open="(" close=")"><mml:msup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mi>M</mml:mi><mml:mi mathvariant="italic">α</mml:mi></mml:mfenced></mml:mrow><mml:mrow><mml:mfenced close=")" open="("><mml:msup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mi>N</mml:mi><mml:mi mathvariant="italic">α</mml:mi></mml:mfenced></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p>The optimization problem with the objective function expressed in Eq. (23)
is solved by identifying the leading eigenvector of <inline-formula><mml:math id="M151" display="inline"><mml:mrow><mml:msup><mml:mi>N</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mi>M</mml:mi></mml:mrow></mml:math></inline-formula>. Based on the
optimization results, an input pattern in <inline-formula><mml:math id="M152" display="inline"><mml:mi mathvariant="bold">I</mml:mi></mml:math></inline-formula> is projected onto a line defined
by the vector <inline-formula><mml:math id="M153" display="inline"><mml:mi mathvariant="bold-italic">w</mml:mi></mml:math></inline-formula> in the following manner:

                <disp-formula id="Ch1.E24" content-type="numbered"><mml:math id="M154" display="block"><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mo>⋅</mml:mo><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mi>k</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
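      <p>The derivation in Eqs. (19) to (24) can be condensed into the following Python sketch for the two-class case. It is an illustration under stated assumptions, not the BayGmmKda code: the function names, the class labels 0 and 1, and the small ridge term added to the matrix N (so that the linear system is solvable) are all choices made for this example. Because M is the outer product of the vector difference of the two class terms with itself, it has rank one, and the leading eigenvector of the product of the inverse of N with M is proportional to the inverse of N applied to that difference; the sketch uses this shortcut instead of a full eigendecomposition.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Radial-basis kernel of Eq. (18) between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def fit_kernel_fda(X, y, sigma, ridge=1e-3):
    """Expansion coefficients alpha of Eq. (19) for two classes labelled 0 and 1."""
    y = np.asarray(y)
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    N = np.zeros((n, n))
    M_vecs = []
    for label in (0, 1):
        K_k = K[:, y == label]                     # kernel block for class k
        l_k = K_k.shape[1]
        M_vecs.append(K_k.mean(axis=1))            # M_k of Eq. (20)
        centering = np.eye(l_k) - np.full((l_k, l_k), 1.0 / l_k)
        N += K_k @ centering @ K_k.T               # within-class term of Eq. (22)
    N += ridge * np.eye(n)                         # assumption: keeps N invertible
    # Rank-one M: the maximizer of Eq. (23) is proportional to N^{-1} (M_1 - M_2).
    return np.linalg.solve(N, M_vecs[0] - M_vecs[1])


def project(X_train, alpha, X_new, sigma):
    """Eq. (24): projection of new samples onto the discriminant direction."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha
```
]]></preformat>
      <p>The value returned by <monospace>project</monospace> is a single number per sample, which corresponds to the latent factor described in this section; its sign convention depends on which class is listed first.</p>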

      <?xmltex \floatpos{t}?><fig id="Ch1.F7" specific-use="star"><caption><p>The proposed BayGmmKda.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f06.pdf"/>

        </fig>

</sec>
</sec>
<sec id="Ch1.S4">
  <title>The proposed Bayesian framework for flood susceptibility prediction</title>
<sec id="Ch1.S4.SS1">
  <title>The established GIS database</title>
      <p>To formulate a flood assessment model, the first stage is to construct a GIS
database (see Fig. 5) in which locations of past flood events, maps of
topographic features, Landsat-8 imagery, maps of geological features, and
precipitation records are acquired and integrated. In this
study, data acquisition, processing, and integration were performed with the
ArcGIS (version 10.2) and IDRISI Selva (version 17.01) software packages.</p>
      <p>Furthermore, a C<inline-formula><mml:math id="M155" display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mo>+</mml:mo></mml:mrow></mml:math></inline-formula> application has been developed by the authors to
transform the flood susceptibility indices into a GIS format for ArcGIS
implementation. Accordingly, the compiled outcomes are employed to form a
database that includes the aforementioned flood-influencing features with
two class outputs: flood and nonflood. As mentioned earlier, a
total of 76 flood locations have been recorded. To balance the dataset and
reliably construct the flood prediction model, 76 nonflood locations
were randomly sampled and included in the analysis. Hence, the
database consists of 152 data samples in total.</p>
</sec>
<sec id="Ch1.S4.SS2">
  <title>The proposed model structure</title>
      <p>This section presents the proposed model for flood susceptibility
assessment, which incorporates RBFDA, the Bayesian classification framework,
and GMM. The overall flowchart of the proposed Bayesian
framework based on GMM and RBFDA for flood susceptibility prediction, named
BayGmmKda, is shown in Fig. 6.</p>
      <p>Firstly, the whole dataset of 152 data samples was separated into
two sets: a training set (90 % or 137 samples), used to build the model,
and a testing set (10 % or 15 samples), used to test it.
The input variables of the dataset were normalized
using minimum–maximum normalization; the purpose of this normalization was to
prevent variables with large magnitudes from dominating those with small magnitudes.</p>
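      <p>As an illustration of this step, the following Python sketch applies minimum–maximum normalization column by column and draws one random 90/10 split. The function names and the fixed random seed are assumptions introduced for the example; they are not part of the published MATLAB program.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random

def min_max_normalize(rows):
    """Rescale every column of a list of rows to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        # A constant column carries no information and is mapped to 0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def split_indices(n_samples, train_fraction, rng):
    """Return (train_indices, test_indices) for one random split."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    n_train = round(n_samples * train_fraction)
    return idx[:n_train], idx[n_train:]

# 152 samples split 90/10, as in the text; the seed is arbitrary.
train_idx, test_idx = split_indices(152, 0.9, random.Random(0))
```
]]></preformat>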
      <p>Secondly, a latent input factor was generated using the RBFDA (explained in
Sect. 3.4) and added to the training dataset, with the aim of enhancing the
classification performance. Subsequently, feature evaluation was
performed to quantify the relevance of each input factor to the
flood inventories in the training set. Any irrelevant factor should be
eliminated from the modeling process to reduce noise and enhance the model
performance (Tien Bui et al., 2016a, 2017). For this purpose, the Mutual
Information Criterion (Kwak and Choi, 2002; Hoang et
al., 2016), a widely employed technique for feature selection in machine
learning, was selected to express the pertinence of each influencing factor
to flooding. The larger the mutual information, the
stronger the relevance of the influencing factor to flooding.</p>
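      <p>A minimal sketch of a mutual information estimate between one influencing factor and the binary flood label is given below. It assumes that the continuous factor is first discretized into equal-width bins; the binning scheme and the function names are illustrative assumptions and may differ from the estimator used in the cited works.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter

def equal_width_bins(values, n_bins):
    """Discretize continuous values into n_bins equal-width bins (an assumption)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant factor
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(feature_bins, labels):
    """I(X;Y) in bits, estimated from empirical frequencies."""
    n = len(labels)
    joint = Counter(zip(feature_bins, labels))
    px = Counter(feature_bins)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi

# Hypothetical factor values and flood labels, for illustration only.
factor = [0.0, 0.1, 0.9, 1.0]
flood = [0, 0, 1, 1]
score = mutual_information(equal_width_bins(factor, 2), flood)
```
]]></preformat>
      <p>A factor that is statistically independent of the label yields a value of zero, whereas a factor that determines the label completely yields the entropy of the label.</p>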
      <p>In the next step, the BayGmmKda model was trained and established using the
training set. The purpose of the training process was to find the best
parameters for the mixture component (<inline-formula><mml:math id="M156" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>) used in GMM and the kernel
function bandwidth (<inline-formula><mml:math id="M157" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula>) used in RBFDA of the BayGmmKda model. To
determine the best <inline-formula><mml:math id="M158" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>, the EM algorithm that employs Akaike information
criterion (AIC; Akaike, 1974) was used. Thus, the value of <inline-formula><mml:math id="M159" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> was varied
from 1 to 20, and then AIC was estimated and used to select the model that
exhibits the best fit to the data at hand. It is noted that a model with a
smaller number of mixture components (<inline-formula><mml:math id="M160" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>) has a lower degree of complexity
(Olivier et al., 1999). In addition, the unsupervised GMM learning
(Figueiredo and Jain, 2002) is also used for autonomously determining the
best <inline-formula><mml:math id="M161" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>. Accordingly, the model starts with a maximum component number (<inline-formula><mml:math id="M162" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>)
of 20; the algorithm carries out the model selection process by removing
irrelevant mixture components if applicable. To determine the best <inline-formula><mml:math id="M163" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula>,
a grid search was performed, and the value of <inline-formula><mml:math id="M164" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula>
corresponding to the highest classification accuracy rate was selected.</p>
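      <p>The AIC-based choice of the number of mixture components can be sketched as follows. The sketch assumes full covariance matrices when counting free parameters, and it takes the maximized log-likelihood of each candidate model as given; the log-likelihood values below are hypothetical and are not results of this study.</p>
<preformat preformat-type="code"><![CDATA[
```python
def gmm_free_parameters(k, d):
    """Free parameters of a k-component GMM in d dimensions,
    assuming full covariance matrices (an assumption of this sketch)."""
    weights = k - 1
    means = k * d
    covariances = k * d * (d + 1) // 2
    return weights + means + covariances

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2p - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood

def select_k(log_likelihoods, d):
    """Pick the k with the lowest AIC; log_likelihoods maps k to ln L."""
    return min(
        log_likelihoods,
        key=lambda k: aic(log_likelihoods[k], gmm_free_parameters(k, d)),
    )

# Hypothetical maximized log-likelihoods from EM fits with k = 1, 2, 3.
best_k = select_k({1: -100.0, 2: -90.0, 3: -89.0}, d=2)
```
]]></preformat>
      <p>In this toy case the small gain in likelihood from the third component does not offset its additional parameters, which is the trade-off that the AIC is meant to capture.</p>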
      <p>Using the best <inline-formula><mml:math id="M165" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M166" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula> from the previous step, the final BayGmmKda
model was constructed and the Bayesian classification framework was
derived. The Bayesian framework was then used to estimate the posterior
probability (flood susceptibility index) for all the pixels in the study
area. The flood susceptibility index was then converted to a raster format
for use in ArcGIS.</p>
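      <p>The posterior probability can be illustrated with a one-dimensional simplification of the Bayesian framework, in which each class-conditional density is a Gaussian mixture. The actual model operates on multivariate patterns; the one-dimensional densities, the equal class priors, and the mixture parameters below are assumptions made only to keep the sketch short.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, components):
    """components: list of (weight, mean, std) of a 1-D Gaussian mixture."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)

def flood_posterior(x, flood_gmm, nonflood_gmm, prior_flood=0.5):
    """Bayes' rule: P(flood | x) from class-conditional mixture densities."""
    numerator = mixture_pdf(x, flood_gmm) * prior_flood
    evidence = numerator + mixture_pdf(x, nonflood_gmm) * (1.0 - prior_flood)
    return numerator / evidence

# Hypothetical single-component mixtures, for illustration only.
flood_gmm = [(1.0, 1.0, 1.0)]
nonflood_gmm = [(1.0, -1.0, 1.0)]
index = flood_posterior(1.0, flood_gmm, nonflood_gmm)
```
]]></preformat>
      <p>Evaluating such a posterior for every pixel gives a value between 0 and 1 that can be interpreted as a flood susceptibility index.</p>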
</sec>
<sec id="Ch1.S4.SS3">
  <title>The developed MATLAB interface of BayGmmKda</title>
      <p>The GMM with the EM training algorithm is implemented with the
MATLAB statistical toolbox (MathWorks, 2012a); meanwhile, the unsupervised
learning algorithm of BayGmmKda uses the program code provided by
Mário A. T. Figueiredo (<uri>http://www.lx.it.pt/~mtf/</uri>, last access:
1 April 2016). The RBFDA algorithm and the unified BayGmmKda model have been
coded in MATLAB by the authors. In addition, the authors have developed a
MATLAB program with a graphical user interface (GUI; see Fig. 7) for running
the BayGmmKda model. The
GUI aims to provide a user-friendly system for performing flood
susceptibility predictions.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F8" specific-use="star"><caption><p>Main menu of BayGmmKda.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f07.png"/>

        </fig>

      <p>As shown in Fig. 7, the program consists of three modules: data processing
and visualization, model training, and model prediction. The first module
provides basic functions for data inspection and visualization, including
data normalization, data viewing, and preliminary feature selection with
mutual information. In the second module, the user provides the model
parameters, including the kernel function parameter and the GMM training
method. In the third module, the trained model is used to carry out prediction
tasks and the model prediction performance is reported.</p>
</sec>
</sec>
<sec id="Ch1.S5">
  <title>Experimental results</title>
<sec id="Ch1.S5.SS1">
  <title>Feature selection and training of the BayGmmKda model</title>
      <p>The outcome of the preliminary examination of the pertinence of flood-influencing factors is reported in Fig. 8a. As mentioned earlier, the
relevance of each influencing factor is quantified by the mutual information
criterion. Based on the outcome, IF<inline-formula><mml:math id="M167" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">5</mml:mn></mml:msub></mml:math></inline-formula> (SPI) features the highest mutual
dependence, followed by IF<inline-formula><mml:math id="M168" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">7</mml:mn></mml:msub></mml:math></inline-formula> (stream density) and IF<inline-formula><mml:math id="M169" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">8</mml:mn></mml:msub></mml:math></inline-formula> (NDVI).
The influencing factors IF<inline-formula><mml:math id="M170" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:math></inline-formula> (TWI) and IF<inline-formula><mml:math id="M171" display="inline"><mml:msub><mml:mi/><mml:mn mathvariant="normal">10</mml:mn></mml:msub></mml:math></inline-formula> (rainfall) exhibit
comparatively low values of mutual information. Because none of the mutual
information values is zero, all influencing factors are deemed relevant
and are retained for the subsequent processes of model training and
prediction.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F9" specific-use="star"><caption><p><bold>(a)</bold> Mutual information of flood-influencing factors; <bold>(b)</bold> RBFDA-based
latent factor derived in this study.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f08.png"/>

        </fig>

      <p>The training phase of BayGmmKda is executed in two
consecutive steps: training RBFDA and training GMM. RBFDA analyzes the data
in the training set to establish a latent factor, which is a one-dimensional
representation of the original input pattern. Figure 8b shows the resulting
latent factor constructed by RBFDA. In the next step of the training phase,
GMM is fitted to the labeled training patterns, each of which consists of
the 10 original input factors and the RBFDA-based latent
factor.</p>
      <p>The classification accuracy rate (CAR) is the percentage of
correctly classified instances. In addition, a more detailed analysis of the
model capability can be obtained by calculating the true positive rate (TPR),
false positive rate (FPR), false negative rate (FNR), and true negative rate
(TNR). These four rates are widely used to express the predictive
capability of a prediction model (Hoang and Tien-Bui, 2016).

                <disp-formula specific-use="align" content-type="numbered"><mml:math id="M172" display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mtext>TPR</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mtext>TP</mml:mtext><mml:mrow><mml:mtext>TP</mml:mtext><mml:mo>+</mml:mo><mml:mtext>FN</mml:mtext></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mlabeledtr id="Ch1.E25"><mml:mtd/><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mtext>FPR</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mtext>FP</mml:mtext><mml:mrow><mml:mtext>FP</mml:mtext><mml:mo>+</mml:mo><mml:mtext>TN</mml:mtext></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mtr><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtext>FNR</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mtext>FN</mml:mtext><mml:mrow><mml:mtext>TP</mml:mtext><mml:mo>+</mml:mo><mml:mtext>FN</mml:mtext></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mstyle class="stylechange" displaystyle="true"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mtext>TNR</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mtext>TN</mml:mtext><mml:mrow><mml:mtext>TN</mml:mtext><mml:mo>+</mml:mo><mml:mtext>FP</mml:mtext></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>

            where TP, TN, FP, and FN denote the numbers of true positives, true
negatives, false positives, and false negatives, respectively.</p>
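      <p>The four rates of Eq. (25) and the CAR follow directly from the confusion counts, as the following Python sketch shows. The label coding (1 for flood, 0 for nonflood) and the example predictions are assumptions; the sketch also assumes that both classes are present in the testing set.</p>
<preformat preformat-type="code"><![CDATA[
```python
def confusion_rates(y_true, y_pred):
    """CAR (in %) and the four rates of Eq. (25); labels: 1 = flood, 0 = nonflood."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "CAR": 100.0 * (tp + tn) / len(pairs),
        "TPR": tp / (tp + fn),
        "FPR": fp / (fp + tn),
        "FNR": fn / (tp + fn),
        "TNR": tn / (tn + fp),
    }

# Hypothetical labels and predictions for eight test samples.
rates = confusion_rates([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```
]]></preformat>
      <p>By construction, TPR and FNR sum to 1 for any single test set, as do FPR and TNR.</p>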
      <p>In addition to the four rates, the receiver operating characteristic (ROC)
curve (van Erkel and Pattynama, 1998) is used to summarize the global
performance of the model. The ROC curve shows the trade-off
between TPR and FPR as the threshold for accepting
the positive class (flood) varies. In addition, the area under the ROC
curve (AUC) is employed to quantify the global performance. In general, a
better model is characterized by a larger value of the AUC.</p>
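      <p>A minimal sketch of how an ROC curve and its AUC can be obtained from posterior probabilities is given below. The threshold is lowered step by step over the observed scores, and the area is computed with the trapezoidal rule; the scores and labels are hypothetical, and both classes are assumed to be present.</p>
<preformat preformat-type="code"><![CDATA[
```python
def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by lowering the acceptance threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Hypothetical labels (1 = flood) and posterior probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(labels, scores))
```
]]></preformat>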
      <p>As mentioned above, the dataset is randomly separated into a training set
and a testing set, which contain 90 and 10 % of the data samples,
respectively. The training set is employed to train the model; meanwhile, the
testing set is used to validate the capability of the trained model.
Since a single selection of data for the training set and the testing set may not
truly demonstrate the model's predictive capability, this study carries out a
repeated random subsampling procedure consisting of 30 experimental runs.
In each experimental run, 10 % of the dataset is drawn
at random from the database to constitute the testing set; the rest of
the database is included in the training set.<?xmltex \hack{\newpage}?></p>
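      <p>The subsampling procedure can be sketched as a simple loop. The callback evaluate is a hypothetical placeholder that stands for training a model on the training indices and returning one performance value (for example the CAR) on the testing indices; the seed is arbitrary.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random
import statistics

def repeated_subsampling(n_samples, evaluate, n_runs=30, test_fraction=0.10, seed=0):
    """Run n_runs random train/test splits and summarize the scores.

    evaluate(train_idx, test_idx) is a hypothetical callback that trains
    a model and returns a single performance value for the run.
    """
    rng = random.Random(seed)
    n_test = round(n_samples * test_fraction)
    scores = []
    for _ in range(n_runs):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        scores.append(evaluate(train_idx, test_idx))
    # Mean and sample standard deviation over the runs.
    return statistics.mean(scores), statistics.stdev(scores)
```
]]></preformat>
      <p>Reporting both the mean and the standard deviation over the runs corresponds to the layout of Tables 2 and 3.</p>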
      <p>The testing performance of the proposed Bayesian framework for flood
susceptibility is reported in Table 2 and Fig. 9; the latter provides the
average ROC curves of the proposed model framework, obtained from the random
subsampling process, with two methods of GMM training. Herein, the two
Bayesian models that employ the EM algorithm and the unsupervised learning (UL)
algorithm for training GMM are denoted as BayGmmKda-EM and BayGmmKda-UL,
respectively. It can be seen that the BayGmmKda-UL model demonstrates
better average predictive performance (CAR <inline-formula><mml:math id="M173" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 89.58 %, AUC <inline-formula><mml:math id="M174" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.94,
TPR <inline-formula><mml:math id="M175" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.96, TNR <inline-formula><mml:math id="M176" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.91) than the BayGmmKda-EM model
(CAR <inline-formula><mml:math id="M177" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 86.67 %, AUC <inline-formula><mml:math id="M178" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.93, TPR <inline-formula><mml:math id="M179" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.95, TNR <inline-formula><mml:math id="M180" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.85).
Although the two models
are comparable in TPR, the BayGmmKda-UL model is deemed more
accurate than the BayGmmKda-EM model in predicting samples of
the nonflood class.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F10" specific-use="star"><caption><p>ROC plots of the proposed BayGmmKda.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f09.png"/>

        </fig>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T2"><caption><p>Prediction results of BayGmmKda.</p></caption><oasis:table frame="topbot"><?xmltex \begin{scaleboxenv}{.85}[.85]?><oasis:tgroup cols="7">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:colspec colnum="7" colname="col7" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Dataset</oasis:entry>  
         <oasis:entry colname="col2">CAR (%)</oasis:entry>  
         <oasis:entry colname="col3">AUC</oasis:entry>  
         <oasis:entry colname="col4">TPR</oasis:entry>  
         <oasis:entry colname="col5">FPR</oasis:entry>  
         <oasis:entry colname="col6">FNR</oasis:entry>  
         <oasis:entry colname="col7">TNR</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">  
         <oasis:entry namest="col1" nameend="col7">Average </oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">BayGmmKda-EM</oasis:entry>  
         <oasis:entry colname="col2">86.67</oasis:entry>  
         <oasis:entry colname="col3">0.93</oasis:entry>  
         <oasis:entry colname="col4">0.95</oasis:entry>  
         <oasis:entry colname="col5">0.12</oasis:entry>  
         <oasis:entry colname="col6">0.15</oasis:entry>  
         <oasis:entry colname="col7">0.85</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">BayGmmKda-UL</oasis:entry>  
         <oasis:entry colname="col2">89.58</oasis:entry>  
         <oasis:entry colname="col3">0.94</oasis:entry>  
         <oasis:entry colname="col4">0.96</oasis:entry>  
         <oasis:entry colname="col5">0.12</oasis:entry>  
         <oasis:entry colname="col6">0.09</oasis:entry>  
         <oasis:entry colname="col7">0.91</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry namest="col1" nameend="col7">Standard deviation </oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">BayGmmKda-EM</oasis:entry>  
         <oasis:entry colname="col2">6.51</oasis:entry>  
         <oasis:entry colname="col3">0.07</oasis:entry>  
         <oasis:entry colname="col4">0.05</oasis:entry>  
         <oasis:entry colname="col5">0.10</oasis:entry>  
         <oasis:entry colname="col6">0.12</oasis:entry>  
         <oasis:entry colname="col7">0.12</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">BayGmmKda-UL</oasis:entry>  
         <oasis:entry colname="col2">7.22</oasis:entry>  
         <oasis:entry colname="col3">0.05</oasis:entry>  
         <oasis:entry colname="col4">0.04</oasis:entry>  
         <oasis:entry colname="col5">0.11</oasis:entry>  
         <oasis:entry colname="col6">0.10</oasis:entry>  
         <oasis:entry colname="col7">0.10</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup><?xmltex \end{scaleboxenv}?></oasis:table></table-wrap>

</sec>
<sec id="Ch1.S5.SS2">
  <title>Model comparison</title>
      <p>Because this is the first time the BayGmmKda model has been proposed for flood
susceptibility assessment, its validity should be examined. Hence,
three benchmarks were used for comparison: the support vector machine (SVM), the adaptive neuro-fuzzy inference system (ANFIS), and the GMM-based
Bayesian classifier. SVM and ANFIS were selected
because they have recently been verified to be effective tools for
predicting flood susceptibility (Tien Bui et al., 2016c; Tehrany et al.,
2015b). The GMM-based Bayesian classifier (BayGmm) is the
Bayesian framework for classification which employs GMM for density
estimation; however, BayGmm is not integrated with the RBFDA algorithm.
BayGmm is included in the performance comparison to confirm the advantage
of the newly constructed BayGmmKda and to verify the usefulness of RBFDA in
enhancing the discriminative capability of the hybrid framework.</p>
      <p>To construct the SVM model, two hyperparameters, the regularization
constant (<inline-formula><mml:math id="M181" display="inline"><mml:mi>C</mml:mi></mml:math></inline-formula>) and the parameter of the radial-basis kernel function
(<inline-formula><mml:math id="M182" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula>), need to be specified. Herein, a grid search process, which is
identical to the one used to identify the kernel function bandwidth used in
RBFDA, is employed to tune these hyperparameters of the SVM model. The
SVM method is implemented with a MATLAB package (MathWorks, 2012b).
Meanwhile, the ANFIS model is trained with the metaheuristic approach
described in the previous work of Tien Bui et al. (2016c).</p>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T3"><caption><p>Performance comparison of the BayGmmKda model with the three
benchmarks, the SVM model, the ANFIS model, and the BayGmm model.</p></caption><oasis:table frame="topbot"><?xmltex \begin{scaleboxenv}{.95}[.95]?><oasis:tgroup cols="7">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:colspec colnum="7" colname="col7" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">Models</oasis:entry>  
         <oasis:entry colname="col2">CAR (%)</oasis:entry>  
         <oasis:entry colname="col3">AUC</oasis:entry>  
         <oasis:entry colname="col4">TPR</oasis:entry>  
         <oasis:entry colname="col5">FPR</oasis:entry>  
         <oasis:entry colname="col6">FNR</oasis:entry>  
         <oasis:entry colname="col7">TNR</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">  
         <oasis:entry namest="col1" nameend="col7">Average </oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">BayGmmKda</oasis:entry>  
         <oasis:entry colname="col2">89.58</oasis:entry>  
         <oasis:entry colname="col3">0.94</oasis:entry>  
         <oasis:entry colname="col4">0.96</oasis:entry>  
         <oasis:entry colname="col5">0.12</oasis:entry>  
         <oasis:entry colname="col6">0.09</oasis:entry>  
         <oasis:entry colname="col7">0.91</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">ANFIS</oasis:entry>  
         <oasis:entry colname="col2">85.63</oasis:entry>  
         <oasis:entry colname="col3">0.83</oasis:entry>  
         <oasis:entry colname="col4">0.84</oasis:entry>  
         <oasis:entry colname="col5">0.13</oasis:entry>  
         <oasis:entry colname="col6">0.16</oasis:entry>  
         <oasis:entry colname="col7">0.87</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">BayGmm</oasis:entry>  
         <oasis:entry colname="col2">85.02</oasis:entry>  
         <oasis:entry colname="col3">0.92</oasis:entry>  
         <oasis:entry colname="col4">0.82</oasis:entry>  
         <oasis:entry colname="col5">0.13</oasis:entry>  
         <oasis:entry colname="col6">0.17</oasis:entry>  
         <oasis:entry colname="col7">0.88</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">SVM</oasis:entry>  
         <oasis:entry colname="col2">83.75</oasis:entry>  
         <oasis:entry colname="col3">0.82</oasis:entry>  
         <oasis:entry colname="col4">0.78</oasis:entry>  
         <oasis:entry colname="col5">0.10</oasis:entry>  
         <oasis:entry colname="col6">0.22</oasis:entry>  
         <oasis:entry colname="col7">0.90</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry namest="col1" nameend="col7">Standard deviation </oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">BayGmmKda</oasis:entry>  
         <oasis:entry colname="col2">7.22</oasis:entry>  
         <oasis:entry colname="col3">0.05</oasis:entry>  
         <oasis:entry colname="col4">0.04</oasis:entry>  
         <oasis:entry colname="col5">0.11</oasis:entry>  
         <oasis:entry colname="col6">0.10</oasis:entry>  
         <oasis:entry colname="col7">0.10</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">ANFIS</oasis:entry>  
         <oasis:entry colname="col2">6.17</oasis:entry>  
         <oasis:entry colname="col3">0.05</oasis:entry>  
         <oasis:entry colname="col4">0.14</oasis:entry>  
         <oasis:entry colname="col5">0.10</oasis:entry>  
         <oasis:entry colname="col6">0.14</oasis:entry>  
         <oasis:entry colname="col7">0.10</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">BayGmm</oasis:entry>  
         <oasis:entry colname="col2">7.24</oasis:entry>  
         <oasis:entry colname="col3">0.08</oasis:entry>  
         <oasis:entry colname="col4">0.11</oasis:entry>  
         <oasis:entry colname="col5">0.10</oasis:entry>  
         <oasis:entry colname="col6">0.11</oasis:entry>  
         <oasis:entry colname="col7">0.10</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">SVM</oasis:entry>  
         <oasis:entry colname="col2">10.33</oasis:entry>  
         <oasis:entry colname="col3">0.06</oasis:entry>  
         <oasis:entry colname="col4">0.16</oasis:entry>  
         <oasis:entry colname="col5">0.11</oasis:entry>  
         <oasis:entry colname="col6">0.16</oasis:entry>  
         <oasis:entry colname="col7">0.11</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup><?xmltex \end{scaleboxenv}?></oasis:table></table-wrap>

      <p>Random subsampling with 30 runs is employed for all models
in this experiment. The comparison between the proposed BayGmmKda
model and the three benchmark models is shown in Table 3. The results show that
the proposed model yields the best results (CAR <inline-formula><mml:math id="M183" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 89.58 % and
AUC <inline-formula><mml:math id="M184" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.94). Ranked by CAR, it is followed by the ANFIS model (CAR <inline-formula><mml:math id="M185" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 85.63 %,
AUC <inline-formula><mml:math id="M186" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.83), the BayGmm model (CAR of 85.02 %, AUC <inline-formula><mml:math id="M187" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.92), and the SVM
model (CAR of 83.75 %, AUC <inline-formula><mml:math id="M188" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.82).</p>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T4"><caption><p>Model comparison based on the Wilcoxon signed-rank test.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="5">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="center"/>
     <oasis:colspec colnum="3" colname="col3" align="center"/>
     <oasis:colspec colnum="4" colname="col4" align="center"/>
     <oasis:colspec colnum="5" colname="col5" align="center"/>
     <oasis:thead>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1"/>  
         <oasis:entry colname="col2">BayGmmKda</oasis:entry>  
         <oasis:entry colname="col3">ANFIS</oasis:entry>  
         <oasis:entry colname="col4">BayGmm</oasis:entry>  
         <oasis:entry colname="col5">SVM</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>  
         <oasis:entry colname="col1">BayGmmKda</oasis:entry>  
         <oasis:entry colname="col2"/>  
         <oasis:entry colname="col3"><inline-formula><mml:math id="M189" display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mo>+</mml:mo></mml:mrow></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4"><inline-formula><mml:math id="M190" display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mo>+</mml:mo></mml:mrow></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col5"><inline-formula><mml:math id="M191" display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mo>+</mml:mo></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">ANFIS</oasis:entry>  
         <oasis:entry colname="col2">- -</oasis:entry>  
         <oasis:entry colname="col3"/>  
         <oasis:entry colname="col4"><inline-formula><mml:math id="M192" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col5"><inline-formula><mml:math id="M193" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">BayGmm</oasis:entry>  
         <oasis:entry colname="col2">- -</oasis:entry>  
         <oasis:entry colname="col3">-</oasis:entry>  
         <oasis:entry colname="col4"/>  
         <oasis:entry colname="col5"><inline-formula><mml:math id="M194" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">SVM</oasis:entry>  
         <oasis:entry colname="col2">- -</oasis:entry>  
         <oasis:entry colname="col3">-</oasis:entry>  
         <oasis:entry colname="col4">-</oasis:entry>  
         <oasis:entry colname="col5"/>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <?xmltex \floatpos{t}?><fig id="Ch1.F11" specific-use="star"><caption><p>The flood susceptibility map using the proposed BayGmmKda model for
the study area.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://gmd.copernicus.org/articles/10/3391/2017/gmd-10-3391-2017-f10.png"/>

        </fig>

      <p>To test whether the performance of the proposed BayGmmKda model is significantly
higher than that of the three benchmark models, the Wilcoxon signed-rank test
is employed. The Wilcoxon signed-rank test is widely used to evaluate whether
classification outcomes of prediction models are significantly different
(Tien Bui et al., 2016e). Using this test, <inline-formula><mml:math id="M195" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula> values are computed from the
experimental results of the four models and compared with a significance
threshold of 0.05. The result of the Wilcoxon signed-rank test is shown in Table 4.
The signs “<inline-formula><mml:math id="M196" display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mo>+</mml:mo></mml:mrow></mml:math></inline-formula>”, “<inline-formula><mml:math id="M197" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula>”, “- -”, and “-”
represent a significant win, a win, a significant loss, and a loss,
respectively. The result confirms that the proposed BayGmmKda model achieves
significant wins over the other models.<?xmltex \hack{\newpage}?></p>
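      <p>For illustration, a simplified two-sided Wilcoxon signed-rank test for paired per-run scores is sketched below. It uses the normal approximation without tie or continuity corrections, discards zero differences, and assumes that at least one nonzero difference exists; these simplifications are assumptions of the sketch, and the per-run accuracies are hypothetical values, not the results of this study.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

def wilcoxon_signed_rank(a, b):
    """Return (W+, two-sided p value) for paired samples a and b.

    Normal approximation; zero differences are dropped and tied
    absolute differences receive average ranks (a simplification).
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / std
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return w_plus, p_value

# Hypothetical per-run accuracies (%) of two models over eight runs.
model_a = [90, 88, 92, 85, 91, 87, 93, 89]
model_b = [85, 84, 86, 83, 84, 79, 84, 79]
w, p = wilcoxon_signed_rank(model_a, model_b)
significant_win = p < 0.05 and w > len(model_a) * (len(model_a) + 1) / 4.0
```
]]></preformat>
      <p>Under this reading, a significant win corresponds to a p value below 0.05 together with a positive-rank sum above its expected value; whether Table 4 was derived in exactly this way is not stated in the text.</p>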
</sec>
<sec id="Ch1.S5.SS3">
  <title>Construction of the flood susceptibility map</title>
      <p>Experimental outcomes have indicated that the BayGmmKda model is the best for
this study area, and therefore the model was used to compute the posterior
probability for all the pixels of the study area. The posterior probability
values, which were used as flood susceptibility indices, were then
converted to raster format and opened in the ArcGIS 10.4 software package.
Using these indices, the flood susceptibility map (see Fig. 10) was derived
and visualized by means of five classes: very high (10 %), high
(10 %), moderate (10 %), low (20 %), and very low (50 %). The
threshold values for separating these classes were determined by overlaying
the historical flood locations on the flood susceptibility index map (Tien
Bui et al., 2016c); a graphical curve (see Fig. 10) was then
constructed, from which the threshold values were derived.</p>
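      <p>The five-class partition can be illustrated with a minimal sketch. It assumes that the classes are assigned purely by rank so that they cover 10, 10, 10, 20, and 50 % of the pixels. In the study the thresholds were derived from a graphical curve in ArcGIS, so this is only an approximation of the idea. The function name and the susceptibility indices are hypothetical.</p>
      <preformat>
```python
# Area share of each class in percent, from the most to the least susceptible.
CLASSES = [("very high", 10), ("high", 10), ("moderate", 10),
           ("low", 20), ("very low", 50)]


def classify_by_area(indices):
    """Assign each susceptibility index to a class by its rank.

    Pixels are sorted from the highest to the lowest index and the sorted
    list is cut at the cumulative area shares given in CLASSES.
    """
    n = len(indices)
    order = sorted(range(n), key=lambda i: indices[i], reverse=True)
    labels = [None] * n
    start = 0
    cumulative = 0
    for name, share in CLASSES:
        cumulative += share
        end = round(cumulative * n / 100)
        for position in range(start, end):
            labels[order[position]] = name
        start = end
    return labels


# Hypothetical posterior probabilities for 20 pixels (illustration only).
pixel_indices = [i / 20 for i in range(20)]
pixel_classes = classify_by_area(pixel_indices)
print(pixel_classes.count("very high"), pixel_classes.count("very low"))
```
      </preformat>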
      <p>Interpretation of the map shows that 10 % of the Tuong Duong district was
classified into the very high class, and this class covers 73.68 % of the
total historical flood locations. The high and moderate classes each cover
10 % of the region but account for only
15.79 and 7.9 % of the total historical flood locations,
respectively, whereas the low class covers 20 % of the district but
contains only 2.63 % of the total historical flood locations.
Notably, the very low class, which covers 50 % of the district,
contains no flood locations. These results indicate that the proposed BayGmmKda
model has successfully delineated flood-prone areas. In other
words, the interpretation results confirm the reliability of the
Bayesian framework proposed in this work.</p>
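      <p>The percentages above can be cross-checked with a short calculation. The sketch divides the share of historical flood locations in each class by the share of the district area in that class. This density ratio is an illustrative measure introduced here and is not reported in the study. A ratio above 1 means that a class holds more flood locations than its area share alone would suggest.</p>
      <preformat>
```python
# Percent of the district area and of the flood locations in each class,
# as stated in the text.
AREA_SHARE = {"very high": 10.0, "high": 10.0, "moderate": 10.0,
              "low": 20.0, "very low": 50.0}
FLOOD_SHARE = {"very high": 73.68, "high": 15.79, "moderate": 7.9,
               "low": 2.63, "very low": 0.0}


def density_ratios(area, flood):
    """Ratio of flood-location share to area share for every class."""
    return {name: flood[name] / area[name] for name in area}


ratios = density_ratios(AREA_SHARE, FLOOD_SHARE)
total_flood = sum(FLOOD_SHARE.values())
top_three = sum(FLOOD_SHARE[c] for c in ("very high", "high", "moderate"))

for name, ratio in ratios.items():
    print(f"{name:10s} ratio = {ratio:.3f}")
print(f"flood shares sum to {total_flood:.2f} %")
print(f"top three classes hold {top_three:.2f} % of flood locations")
```
      </preformat>
      <p>The flood shares sum to 100 %, and the three most susceptible classes, which together cover 30 % of the district, contain 97.37 % of the historical flood locations.</p>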
</sec>
</sec>
<sec id="Ch1.S6" sec-type="conclusions">
  <title>Conclusion</title>
      <p>This research has developed a new tool, named BayGmmKda, for flood
susceptibility evaluation, with a case study in a high-frequency flood area
in central Vietnam. The newly constructed model is a Bayesian framework that
combines GMM and RBFDA for spatial prediction of flooding. A GIS database has
been established to train and test the BayGmmKda method. The training phase
of BayGmmKda consists of two steps: (i) discriminant analysis with RBFDA, in
which a latent factor is generated, and (ii) density estimation using GMM.
After the training phase, the Bayesian framework is employed to compute the
posterior probability, which is then used as the flood
susceptibility index. Furthermore, a MATLAB program with a GUI has been
developed to ease the implementation of the BayGmmKda model in flood
vulnerability assessment.</p>
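      <p>The final step, in which the posterior probability is obtained from the class-conditional densities, can be illustrated with a minimal sketch. It assumes one-dimensional Gaussian mixtures and equal prior probabilities, whereas the model in this study works on the conditioning factors together with the RBFDA latent factor. All function names and mixture parameters are hypothetical and are not taken from the MATLAB program in the Supplement.</p>
      <preformat>
```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a univariate normal distribution."""
    scale = math.sqrt(2.0 * math.pi * variance)
    return math.exp(-0.5 * (x - mean) ** 2 / variance) / scale


def gmm_pdf(x, components):
    """Mixture density; components is a list of (weight, mean, variance)."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)


def posterior_flood(x, gmm_flood, gmm_nonflood, prior_flood=0.5):
    """Bayes rule for the flood class with GMM class-conditional densities."""
    numerator = gmm_pdf(x, gmm_flood) * prior_flood
    evidence = numerator + gmm_pdf(x, gmm_nonflood) * (1.0 - prior_flood)
    return numerator / evidence


# Hypothetical mixtures for the two classes (illustration only).
gmm_flood = [(0.6, 1.0, 0.25), (0.4, 2.0, 0.5)]
gmm_nonflood = [(1.0, -1.0, 0.5)]

for value in (-1.0, 0.0, 1.5):
    print(value, round(posterior_flood(value, gmm_flood, gmm_nonflood), 4))
```
      </preformat>
      <p>Because the output is a probability between 0 and 1, it can serve directly as a susceptibility index without further scaling.</p>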
      <p>In this study, the GMM training is performed with two
methods: the EM algorithm and the unsupervised learning approach.
A repeated subsampling process with 30 experimental runs is
carried out to evaluate the model prediction outcome. The subsampling process,
verified by a statistical test, confirms that the GMM trained by the
unsupervised learning approach attains better prediction accuracy
than the GMM trained by the EM algorithm. Therefore, this method of GMM learning is
strongly recommended for other studies in the same field.</p>
      <p>Furthermore, the experiments demonstrate that the latent factor created by
RBFDA helps to boost the classification accuracy of the
BayGmmKda model. This improvement in the accuracy of BayGmmKda stems from
its integrated learning structure. As described earlier, the classification
task is performed by a hybridization of discriminant analysis and a Bayesian
framework. The Bayesian model carries out the classification task by
considering the patterns in the original dataset and an additional
factor produced by the discriminant analysis. As a result, the performance
of the BayGmmKda model is better than that of the three
benchmarks (SVM, ANFIS, and BayGmm).</p>
      <p>The main limitation of this work is that BayGmmKda is a data-driven
tool; therefore, fieldwork and GIS-based geoenvironmental data are
necessary for the model construction phase. This data collection and
analysis can be time-consuming. In addition, the grid search procedure
used for hyper-parameter setting in the BayGmmKda model has a high
computational cost, especially for large-scale datasets. Furthermore, the
outcome of this grid search procedure may not be optimal; therefore, more
advanced model selection approaches, e.g., metaheuristic optimization
algorithms, could be utilized to further improve the model accuracy.</p>
      <p>Despite such limitations, the proposed BayGmmKda model, characterized by its high
predictive accuracy and its capability of delivering probabilistic outputs,
is a promising alternative for flood susceptibility prediction. Future
extensions of this research may include applying the model to flood
prediction in other study areas, investigating other flood-influencing
factors (e.g., streamflow and antecedent soil moisture, which may be relevant
for flood analysis), and improving the current model with other novel soft
computing methods, e.g., feature selection, pattern classification, and
dimension reduction, to alleviate the aforementioned drawbacks as well as to
enhance the model performance.</p>
</sec>

      
      </body>
    <back><notes notes-type="codedataavailability">

      <p>The MATLAB code of the BayGmmKda model is given in the
Supplement.</p>

      <p>The dataset used in this research is given in the Supplement.</p>
  </notes><app-group>
        <supplementary-material position="anchor"><p><bold>The Supplement related to this article is available online at <inline-supplementary-material xlink:href="https://doi.org/10.5194/gmd-10-3391-2017-supplement" xlink:title="">https://doi.org/10.5194/gmd-10-3391-2017-supplement</inline-supplementary-material>.</bold></p></supplementary-material>
        </app-group><notes notes-type="competinginterests">

      <p>The authors declare that they have no conflict of
interest.</p>
  </notes><ack><title>Acknowledgements</title><p>This research was partially supported by the Department of Business and IT,
School of Business, University College of Southeast Norway. Data for this
research are from project no. B2014-02-21 and were provided by
Quoc-Phi Nguyen (Hanoi University of Mining and Geology,
Vietnam).<?xmltex \hack{\newline}?><?xmltex \hack{\newline}?> Edited by: Jeffrey Neal<?xmltex \hack{\newline}?>
Reviewed by: two anonymous referees</p></ack><ref-list>
    <title>References</title>

      <ref id="bib1.bib1"><label>1</label><mixed-citation>Akaike, H.: A new look at the statistical model identification, IEEE T.
Automat. Contr., 19, 716–723, <ext-link xlink:href="https://doi.org/10.1109/TAC.1974.1100705" ext-link-type="DOI">10.1109/TAC.1974.1100705</ext-link>, 1974.</mixed-citation></ref>
      <ref id="bib1.bib2"><label>2</label><mixed-citation>
Alfieri, L., Salamon, P., Bianchi, A., Neal, J., Bates, P., and Feyen, L.:
Advances in pan-European flood hazard mapping, Hydrol. Process., 28,
4067–4077, <ext-link xlink:href="https://doi.org/10.1002/hyp.9947" ext-link-type="DOI">10.1002/hyp.9947</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib3"><label>3</label><mixed-citation>
Alfieri, L., Bisselink, B., Dottori, F., Naumann, G., Roo, A., Salamon, P.,
Wyser, K., and Feyen, L.: Global projections of river flood risk in a warmer
world, Earth's Future, 5, 171–182, 2017.</mixed-citation></ref>
      <ref id="bib1.bib4"><label>4</label><mixed-citation>Arellano, C. and Dahyot, R.: Robust ellipse detection with Gaussian mixture
models, Pattern Recognit., 58, 12–26, <ext-link xlink:href="https://doi.org/10.1016/j.patcog.2016.01.017" ext-link-type="DOI">10.1016/j.patcog.2016.01.017</ext-link>,
2016.</mixed-citation></ref>
      <ref id="bib1.bib5"><label>5</label><mixed-citation>Aronica, G. T., Franza, F., Bates, P. D., and Neal, J. C.: Probabilistic
evaluation of flood hazard in urban areas using Monte Carlo simulation,
Hydrol. Process., 26, 3962–3972, <ext-link xlink:href="https://doi.org/10.1002/hyp.8370" ext-link-type="DOI">10.1002/hyp.8370</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib6"><label>6</label><mixed-citation>
Bennett, J. C., Robertson, D. E., Ward, P. G., Hapuarachchi, H. P., and Wang,
Q.: Calibrating hourly rainfall-runoff models with daily forcings for
streamflow forecasting applications in meso-scale catchments, Environ. Model.
Softw., 76, 20–36, 2016.</mixed-citation></ref>
      <ref id="bib1.bib7"><label>7</label><mixed-citation>Beven, K. J., Kirkby, M. J., Schofield, N., and Tagg, A. F.: Testing a
physically-based flood forecasting model (TOPMODEL) for three U.K.
catchments, J. Hydrol., 69, 119–143, <ext-link xlink:href="https://doi.org/10.1016/0022-1694(84)90159-8" ext-link-type="DOI">10.1016/0022-1694(84)90159-8</ext-link>, 1984.</mixed-citation></ref>
      <ref id="bib1.bib8"><label>8</label><mixed-citation>Biernacki, C., Celeux, G., and Govaert, G.: Choosing starting values for the
EM algorithm for getting the highest likelihood in multivariate Gaussian
mixture models, Comput. Stat. Data An., 41, 561–575,
<ext-link xlink:href="https://doi.org/10.1016/S0167-9473(02)00163-9" ext-link-type="DOI">10.1016/S0167-9473(02)00163-9</ext-link>, 2003.</mixed-citation></ref>
      <ref id="bib1.bib9"><label>9</label><mixed-citation>Birkel, C., Tetzlaff, D., Dunn, S. M., and Soulsby, C.: Towards a simple
dynamic process conceptualization in rainfall–runoff models using
multi-criteria calibration and tracers in temperate, upland catchments,
Hydrol. Process., 24, 260–275, <ext-link xlink:href="https://doi.org/10.1002/hyp.7478" ext-link-type="DOI">10.1002/hyp.7478</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib10"><label>10</label><mixed-citation>
Brocca, L., Melone, F., and Moramarco, T.: Distributed rainfall-runoff
modelling for flood frequency estimation and flood forecasting, Hydrol.
Process., 25, 2801–2813, 2011.</mixed-citation></ref>
      <ref id="bib1.bib11"><label>11</label><mixed-citation>Bubeck, P., Botzen, W., and Aerts, J.: A review of risk perceptions and other
factors that influence flood mitigation behavior, Risk. Anal., 32,
1481–1495, <ext-link xlink:href="https://doi.org/10.1111/j.1539-6924.2011.01783.x" ext-link-type="DOI">10.1111/j.1539-6924.2011.01783.x</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib12"><label>12</label><mixed-citation>
Ciabatta, L., Brocca, L., Massari, C., Moramarco, T., Gabellani, S., Puca,
S., and Wagner, W.: Rainfall-runoff modelling by using SM2RAIN-derived and
state-of-the-art satellite rainfall products over Italy, Int. J. Appl. Earth
Obs., 48, 163–173, 2016.</mixed-citation></ref>
      <ref id="bib1.bib13"><label>13</label><mixed-citation>Cheng, M.-Y. and Hoang, N.-D.: Slope Collapse Prediction Using Bayesian
Framework with K-Nearest Neighbor Density Estimation: Case Study in Taiwan,
J. Comput. Civ. Eng., 30, 04014116, <ext-link xlink:href="https://doi.org/10.1061/(ASCE)CP.1943-5487.0000456" ext-link-type="DOI">10.1061/(ASCE)CP.1943-5487.0000456</ext-link>,
2016.</mixed-citation></ref>
      <ref id="bib1.bib14"><label>14</label><mixed-citation>Chiew, F. H. S., Stewardson, M. J., and McMahon, T. A.: Comparison of six
rainfall-runoff modelling approaches, J. Hydrol., 147, 1–36,
<ext-link xlink:href="https://doi.org/10.1016/0022-1694(93)90073-I" ext-link-type="DOI">10.1016/0022-1694(93)90073-I</ext-link>, 1993.</mixed-citation></ref>
      <ref id="bib1.bib15"><label>15</label><mixed-citation>
Cunnane, C.: Methods and merits of regional flood frequency analysis, J.
Hydrol., 100, 269–290, 1988.</mixed-citation></ref>
      <ref id="bib1.bib16"><label>16</label><mixed-citation>Dao, N.: Reflecting on the role of academics–activists in shifting
hydropower narratives in Vietnam, Crit. Asian Stud., 49, 444–447,
<ext-link xlink:href="https://doi.org/10.1080/14672715.2017.1339450" ext-link-type="DOI">10.1080/14672715.2017.1339450</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib17"><label>17</label><mixed-citation>Dottori, F., Salamon, P., Bianchi, A., Alfieri, L., Hirpa, F. A., and Feyen,
L.: Development and evaluation of a framework for global flood hazard
mapping, Adv. Water Resour., 94, 87–102,
<ext-link xlink:href="https://doi.org/10.1016/j.advwatres.2016.05.002" ext-link-type="DOI">10.1016/j.advwatres.2016.05.002</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib18"><label>18</label><mixed-citation>Fenicia, F., Savenije, H. H. G., Matgen, P., and Pfister, L.: Understanding
catchment behavior through stepwise model concept improvement, Water Resour.
Res., 44, W01402, <ext-link xlink:href="https://doi.org/10.1029/2006WR005563" ext-link-type="DOI">10.1029/2006WR005563</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bib19"><label>19</label><mixed-citation>Figueiredo, M. A. T. and Jain, A. K.: Unsupervised learning of finite mixture
models, IEEE T. Pattern Anal., 24, 381–396, <ext-link xlink:href="https://doi.org/10.1109/34.990138" ext-link-type="DOI">10.1109/34.990138</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bib20"><label>20</label><mixed-citation>Gao, Z., Long, D., Tang, G., Zeng, C., Huang, J., and Hong, Y.: Assessing the
potential of satellite-based precipitation estimates for flood frequency
analysis in ungauged or poorly gauged tributaries of China's Yangtze River
basin, J. Hydrol., 550, 478–496, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2017.05.025" ext-link-type="DOI">10.1016/j.jhydrol.2017.05.025</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib21"><label>21</label><mixed-citation>Gómez-Losada, Á., Lozano-García, A., Pino-Mejías, R., and
Contreras-González, J.: Finite mixture models to characterize and refine
air quality monitoring networks, Sci. Total Environ., 485–486, 292–299,
<ext-link xlink:href="https://doi.org/10.1016/j.scitotenv.2014.03.091" ext-link-type="DOI">10.1016/j.scitotenv.2014.03.091</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib22"><label>22</label><mixed-citation>Grimaldi, S., Petroselli, A., Arcangeletti, E., and Nardi, F.: Flood mapping
in ungauged basins using fully continuous hydrologic–hydraulic modeling, J.
Hydrol., 487, 39–47, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2013.02.023" ext-link-type="DOI">10.1016/j.jhydrol.2013.02.023</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bib23"><label>23</label><mixed-citation>
Hirabayashi, Y., Mahendran, R., Koirala, S., Konoshima, L., Yamazaki, D.,
Watanabe, S., Kim, H., and Kanae, S.: Global flood risk under climate change,
Nat. Clim. Change, 3, 816–821, 2013.</mixed-citation></ref>
      <ref id="bib1.bib24"><label>24</label><mixed-citation>Hoang, N.-D. and Pham, A.-D.: Hybrid artificial intelligence approach based
on metaheuristic and machine learning for slope stability assessment: A
multinational data analysis, Expert. Syst. Appl., 46, 60–68,
<ext-link xlink:href="https://doi.org/10.1016/j.eswa.2015.10.020" ext-link-type="DOI">10.1016/j.eswa.2015.10.020</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib25"><label>25</label><mixed-citation>Hoang, N.-D. and Tien-Bui, D.: A Novel Relevance Vector Machine Classifier
with Cuckoo Search Optimization for Spatial Prediction of Landslides, J.
Comput. Civ. Eng., 30, 04016001, <ext-link xlink:href="https://doi.org/10.1061/(ASCE)CP.1943-5487.0000557" ext-link-type="DOI">10.1061/(ASCE)CP.1943-5487.0000557</ext-link>,
2016.</mixed-citation></ref>
      <ref id="bib1.bib26"><label>26</label><mixed-citation>Hoang, N.-D., Tien Bui, D., and Liao, K.-W.: Groutability estimation of
grouting processes with cement grouts using Differential Flower Pollination
Optimized Support Vector Machine, Appl. Soft Comput., 45, 173–186,
<ext-link xlink:href="https://doi.org/10.1016/j.asoc.2016.04.031" ext-link-type="DOI">10.1016/j.asoc.2016.04.031</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib27"><label>27</label><mixed-citation>Ju, Z. and Liu, H.: Fuzzy Gaussian Mixture Models, Pattern Recognit., 45,
1146–1158, <ext-link xlink:href="https://doi.org/10.1016/j.patcog.2011.08.028" ext-link-type="DOI">10.1016/j.patcog.2011.08.028</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib28"><label>28</label><mixed-citation>Kazakis, N., Kougias, I., and Patsialis, T.: Assessment of flood hazard areas
at a regional scale using an index-based approach and Analytical Hierarchy
Process: Application in Rhodope–Evros region, Greece, Sci. Total Environ.,
538, 555–563, <ext-link xlink:href="https://doi.org/10.1016/j.scitotenv.2015.08.055" ext-link-type="DOI">10.1016/j.scitotenv.2015.08.055</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib29"><label>29</label><mixed-citation>Khanmohammadi, S. and Chou, C.-A.: A Gaussian mixture model based
discretization algorithm for associative classification of medical data,
Expert Syst. Appl., 58, 119–129, <ext-link xlink:href="https://doi.org/10.1016/j.eswa.2016.03.046" ext-link-type="DOI">10.1016/j.eswa.2016.03.046</ext-link>,
2016.</mixed-citation></ref>
      <ref id="bib1.bib30"><label>30</label><mixed-citation>Kia, M. B., Pirasteh, S., Pradhan, B., Mahmud, A. R., Sulaiman, W. N. A., and
Moradi, A.: An artificial neural network model for flood simulation using
GIS: Johor River Basin, Malaysia, Environ. Earth Sci., 67, 251–264,
<ext-link xlink:href="https://doi.org/10.1007/s12665-011-1504-z" ext-link-type="DOI">10.1007/s12665-011-1504-z</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib31"><label>31</label><mixed-citation>Komi, K., Neal, J., Trigg, M. A., and Diekkrüger, B.: Modelling of flood
hazard extent in data sparse areas: a case study of the Oti River basin, West
Africa, J. Hydrol. Reg. Stud., 10, 122–132, <ext-link xlink:href="https://doi.org/10.1016/j.ejrh.2017.03.001" ext-link-type="DOI">10.1016/j.ejrh.2017.03.001</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib32"><label>32</label><mixed-citation>
Kreft, S., Eckstein, D., Junghans, L., Kerestan, C., and Hagen, U.: Global
climate risk index 2015: Who suffers most from extreme weather events, Report
from Germanwatch, 1–31, 2014.</mixed-citation></ref>
      <ref id="bib1.bib33"><label>33</label><mixed-citation>Kwak, N. and Choi, C.-H.: Input feature selection by mutual information based
on Parzen window, IEEE T. Pattern Anal., 24, 1667–1671,
<ext-link xlink:href="https://doi.org/10.1109/TPAMI.2002.1114861" ext-link-type="DOI">10.1109/TPAMI.2002.1114861</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bib34"><label>34</label><mixed-citation>
Lee, M. J., Kang, J. E., and Jeon, S.: Application of frequency ratio model
and validation for predictive flooded area susceptibility mapping using GIS,
Int. Geosci. Remote Se., 895–898, 2012.</mixed-citation></ref>
      <ref id="bib1.bib35"><label>35</label><mixed-citation>
Lohani, A. K., Goel, N., and Bhatia, K.: Comparative study of neural network,
fuzzy logic and linear transfer function techniques in daily rainfall-runoff
modelling under different input domains, Hydrol. Process., 25, 175–193,
2011.</mixed-citation></ref>
      <ref id="bib1.bib36"><label>36</label><mixed-citation>Loo, Y. Y., Billa, L., and Singh, A.: Effect of climate change on seasonal
monsoon in Asia and its impact on the variability of monsoon rainfall in
Southeast Asia, Geosci. Front., 6, 817–823, <ext-link xlink:href="https://doi.org/10.1016/j.gsf.2014.02.009" ext-link-type="DOI">10.1016/j.gsf.2014.02.009</ext-link>,
2015.</mixed-citation></ref>
      <ref id="bib1.bib37"><label>37</label><mixed-citation>Machado, M. J., Botero, B. A., López, J., Francés, F.,
Díez-Herrero, A., and Benito, G.: Flood frequency analysis of historical
flood data under stationary and non-stationary modelling, Hydrol. Earth Syst.
Sci., 19, 2561–2576, <ext-link xlink:href="https://doi.org/10.5194/hess-19-2561-2015" ext-link-type="DOI">10.5194/hess-19-2561-2015</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib38"><label>38</label><mixed-citation>
Manley, P. N., Mortenson, L., Halperin, J. J., and Quyen, N. H.: Options for
monitoring forest degradation in Northern Viet Nam: An assessment in systems
design and capacity building needs in Con Cuong District, Nghe An Province,
USAID Asia Final Report, 2013.</mixed-citation></ref>
      <ref id="bib1.bib39"><label>39</label><mixed-citation>Mason, D. C., Speck, R., Devereux, B., Schumann, G. J. P., Neal, J. C., and
Bates, P. D.: Flood Detection in Urban Areas Using TerraSAR-X, IEEE T.
Geosci. Remote, 48, 882–894, <ext-link xlink:href="https://doi.org/10.1109/TGRS.2009.2029236" ext-link-type="DOI">10.1109/TGRS.2009.2029236</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bib40"><label>40</label><mixed-citation>
MathWorks: Statistics Toolbox, The MathWorks, Inc., 2012a.</mixed-citation></ref>
      <ref id="bib1.bib41"><label>41</label><mixed-citation>
MathWorks: Bioinformatics Toolbox, The MathWorks, Inc., 2012b.</mixed-citation></ref>
      <ref id="bib1.bib42"><label>42</label><mixed-citation>
McCuen, R. H.: Modeling hydrologic change: statistical methods, CRC press,
448 pp., 2016.</mixed-citation></ref>
      <ref id="bib1.bib43"><label>43</label><mixed-citation>
McLachlan, G. and Krishnan, T.: The EM Algorithm and Extensions, 2nd Edition,
Wiley Series in Probability and Statistics, John Wiley &amp; Sons, Hoboken,
New Jersey, USA, 2008.</mixed-citation></ref>
      <ref id="bib1.bib44"><label>44</label><mixed-citation>
McLachlan, G. and Peel, D.: Finite Mixture Models, 1st Edn.,
Wiley-Interscience, USA, 2000.</mixed-citation></ref>
      <ref id="bib1.bib45"><label>45</label><mixed-citation>Mika, S., Rätsch, G., Weston, J., Schölkopf, B., and Müller, K.:
Fisher discriminant analysis with kernels, Proceedings of the 1999 IEEE
Neural Networks for Signal Processing, Madison, WI, 23–25 August 1999,
41–48, <ext-link xlink:href="https://doi.org/10.1109/NNSP.1999.788121" ext-link-type="DOI">10.1109/NNSP.1999.788121</ext-link>, 1999.</mixed-citation></ref>
      <ref id="bib1.bib46"><label>46</label><mixed-citation>Mukerji, A., Chatterjee, C., and Raghuwanshi, N. S.: Flood Forecasting Using
ANN, Neuro-Fuzzy, and Neuro-GA Models, J. Hydrol. Eng., 14, 647–652,
<ext-link xlink:href="https://doi.org/10.1061/(ASCE)HE.1943-5584.0000040" ext-link-type="DOI">10.1061/(ASCE)HE.1943-5584.0000040</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bib47"><label>47</label><mixed-citation>Nayak, P. C., Venkatesh, B., Krishna, B., and Jain, S. K.: Rainfall-runoff
modeling using conceptual, data driven, and wavelet based computing approach,
J. Hydrol., 493, 57–67, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2013.04.016" ext-link-type="DOI">10.1016/j.jhydrol.2013.04.016</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bib48"><label>48</label><mixed-citation>Neal, J., Keef, C., Bates, P., Beven, K., and Leedal, D.: Probabilistic flood
risk mapping including spatial dependence, Hydrol. Process., 27, 1349–1363,
<ext-link xlink:href="https://doi.org/10.1002/hyp.9572" ext-link-type="DOI">10.1002/hyp.9572</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bib49"><label>49</label><mixed-citation>
Nguyen, C. C., Gaume, E., and Payrastre, O.: Regional flood frequency
analyses involving extraordinary flood events at ungauged sites: further
developments and validations, J. Hydrol., 508, 385–396, 2014.</mixed-citation></ref>
      <ref id="bib1.bib50"><label>50</label><mixed-citation>
Olivier, C., Jouzel, F., and Matouat, A. E.: Choice of the Number of
Component Clusters in Mixture Models by Information Criteria, Proceedings of
the Vision Interface'99, Trois-Rivieres, Quebec, Canada, 18–21 May 1999,
74–81, 1999.</mixed-citation></ref>
      <ref id="bib1.bib51"><label>51</label><mixed-citation>
Paalanen, P.: Bayesian classification using Gaussian mixture model and EM
estimation: implementations and comparisons, Technical Report, Department of
Information Technology, Lappeenranta University of Technology, 2004.</mixed-citation></ref>
      <ref id="bib1.bib52"><label>52</label><mixed-citation>
Papaioannou, G., Vasiliades, L., and Loukas, A.: Multi-criteria analysis
framework for potential flood prone areas mapping, Water Resour. Manage., 29,
399–418, 2015.</mixed-citation></ref>
      <ref id="bib1.bib53"><label>53</label><mixed-citation>Pulvirenti, L., Pierdicca, N., Chini, M., and Guerriero, L.: An algorithm for
operational flood mapping from Synthetic Aperture Radar (SAR) data using
fuzzy logic, Nat. Hazards Earth Syst. Sci., 11, 529–540,
<ext-link xlink:href="https://doi.org/10.5194/nhess-11-529-2011" ext-link-type="DOI">10.5194/nhess-11-529-2011</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bib54"><label>54</label><mixed-citation>Radmehr, A. and Araghinejad, S.: Developing Strategies for Urban Flood
Management of Tehran City Using SMCDM and ANN, J. Comput. Civ. Eng., 28,
05014006, <ext-link xlink:href="https://doi.org/10.1061/(ASCE)CP.1943-5487.0000360" ext-link-type="DOI">10.1061/(ASCE)CP.1943-5487.0000360</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib55"><label>55</label><mixed-citation>
Reynaud, A. and Nguyen, M.-H.: Valuing Flood Risk Reductions, Environ. Model.
Assess., 21, 603–617, <ext-link xlink:href="https://doi.org/10.1007/s10666-016-9500-z" ext-link-type="DOI">10.1007/s10666-016-9500-z</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib56"><label>56</label><mixed-citation>Rezaeianzadeh, M., Tabari, H., Arabi Yazdi, A., Isik, S., and Kalin, L.:
Flood flow forecasting using ANN, ANFIS and regression models, Neural Comput.
Appl., 25, 25–37, <ext-link xlink:href="https://doi.org/10.1007/s00521-013-1443-6" ext-link-type="DOI">10.1007/s00521-013-1443-6</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib57"><label>57</label><mixed-citation>Seckin, N., Cobaner, M., Yurtal, R., and Haktanir, T.: Comparison of
Artificial Neural Network Methods with L-moments for Estimating Flood Flow at
Ungauged Sites: the Case of East Mediterranean River Basin, Turkey, Water
Resour. Manage., 27, 2103–2124, <ext-link xlink:href="https://doi.org/10.1007/s11269-013-0278-3" ext-link-type="DOI">10.1007/s11269-013-0278-3</ext-link>, 2013a.</mixed-citation></ref>
      <ref id="bib1.bib58"><label>58</label><mixed-citation>
Seckin, N., Cobaner, M., Yurtal, R., and Haktanir, T.: Comparison of
artificial neural network methods with L-moments for estimating flood flow at
ungauged sites: the case of East Mediterranean River Basin, Turkey, Water
Resour. Manage., 27, 2103–2124, 2013b.</mixed-citation></ref>
      <ref id="bib1.bib59"><label>59</label><mixed-citation>
Tehrany, M. S., Pradhan, B., and Jebur, M. N.: Flood susceptibility mapping
using a novel ensemble weights-of-evidence and support vector machine models
in GIS, J. Hydrol., 512, 332–343, 2014.</mixed-citation></ref>
      <ref id="bib1.bib60"><label>60</label><mixed-citation>Tehrany, M. S., Pradhan, B., and Jebur, M. N.: Flood susceptibility analysis
and its verification using a novel ensemble support vector machine and
frequency ratio method, Stoch. Env. Res. Risk A, 29, 1149–1165,
<ext-link xlink:href="https://doi.org/10.1007/s00477-015-1021-9" ext-link-type="DOI">10.1007/s00477-015-1021-9</ext-link>, 2015a.</mixed-citation></ref>
      <ref id="bib1.bib61"><label>61</label><mixed-citation>Tehrany, M. S., Pradhan, B., Mansor, S., and Ahmad, N.: Flood susceptibility
assessment using GIS-based support vector machine model with different kernel
types, CATENA, 125, 91–101, <ext-link xlink:href="https://doi.org/10.1016/j.catena.2014.10.017" ext-link-type="DOI">10.1016/j.catena.2014.10.017</ext-link>, 2015b.</mixed-citation></ref>
      <ref id="bib1.bib62"><label>62</label><mixed-citation>
Theodoridis, S.: Machine Learning: A Bayesian and Optimization Perspective,
Academic Press, Elsevier, USA, 2015.</mixed-citation></ref>
      <ref id="bib1.bib63"><label>63</label><mixed-citation>
Theodoridis, S. and Koutroumbas, K.: Pattern Recognition, Academic Press,
Elsevier Inc., USA, 2009.</mixed-citation></ref>
      <ref id="bib1.bib64"><label>64</label><mixed-citation>
Tien Bui, D., Le, K.-T., Nguyen, V., Le, H., and Revhaug, I.: Tropical Forest
Fire Susceptibility Mapping at the Cat Ba National Park Area, Hai Phong City,
Vietnam, Using GIS-Based Kernel Logistic Regression, Remote Sens., 8, 347,
2016a.</mixed-citation></ref>
      <ref id="bib1.bib65"><label>65</label><mixed-citation>Tien Bui, D., Nguyen, Q. P., Hoang, N.-D., and Klempe, H.: A novel fuzzy
K-nearest neighbor inference model with differential evolution for spatial
prediction of rainfall-induced shallow landslides in a tropical hilly area
using GIS, Landslides, 14, 1–17, <ext-link xlink:href="https://doi.org/10.1007/s10346-016-0708-4" ext-link-type="DOI">10.1007/s10346-016-0708-4</ext-link>, 2016b.</mixed-citation></ref>
      <ref id="bib1.bib66"><label>66</label><mixed-citation>Tien Bui, D., Pradhan, B., Nampak, H., Bui, Q.-T., Tran, Q.-A., and Nguyen,
Q.-P.: Hybrid artificial intelligence approach based on neural fuzzy
inference model and metaheuristic optimization for flood susceptibility
modeling in a high-frequency tropical cyclone area using GIS, J. Hydrol.,
540, 317–330, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2016.06.027" ext-link-type="DOI">10.1016/j.jhydrol.2016.06.027</ext-link>, 2016c.</mixed-citation></ref>
      <ref id="bib1.bib67"><label>67</label><mixed-citation>Tien Bui, D., Pradhan, B., Nampak, H., Quang Bui, T., Tran, Q.-A., and
Nguyen, Q. P.: Hybrid Artificial Intelligence Approach Based on Neural Fuzzy
Inference Model and Metaheuristic Optimization for Flood Susceptibility
Modelling in A High-Frequency Tropical Cyclone Area using GIS, J. Hydrol.,
540, 317–330, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2016.06.027" ext-link-type="DOI">10.1016/j.jhydrol.2016.06.027</ext-link>, 2016d.</mixed-citation></ref>
      <ref id="bib1.bib68"><label>68</label><mixed-citation>Tien Bui, D., Tuan, T. A., Klempe, H., Pradhan, B., and Revhaug, I.: Spatial
prediction models for shallow landslide hazards: a comparative assessment of
the efficacy of support vector machines, artificial neural networks, kernel
logistic regression, and logistic model tree, Landslides, 13, 361–378,
<ext-link xlink:href="https://doi.org/10.1007/s10346-015-0557-6" ext-link-type="DOI">10.1007/s10346-015-0557-6</ext-link>, 2016e.</mixed-citation></ref>
      <ref id="bib1.bib69"><label>69</label><mixed-citation>Tien Bui, D., Bui, Q.-T., Nguyen, Q.-P., Pradhan, B., Nampak, H., and Trinh,
P. T.: A hybrid artificial intelligence approach using GIS-based neural-fuzzy
inference system and particle swarm optimization for forest fire
susceptibility modeling at a tropical area, Agr. Forest Meteorol., 233,
32–44, <ext-link xlink:href="https://doi.org/10.1016/j.agrformet.2016.11.002" ext-link-type="DOI">10.1016/j.agrformet.2016.11.002</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib70"><label>70</label><mixed-citation>van Erkel, A. R. and Pattynama, P. M. T.: Receiver operating characteristic
(ROC) analysis: Basic principles and applications in radiology, Eur. J.
Radiol., 27, 88–94, <ext-link xlink:href="https://doi.org/10.1016/S0720-048X(97)00157-5" ext-link-type="DOI">10.1016/S0720-048X(97)00157-5</ext-link>, 1998.</mixed-citation></ref>
      <ref id="bib1.bib71"><label>71</label><mixed-citation>Wallace, C. S. and Dowe, D. L.: Minimum Message Length and Kolmogorov
Complexity, Comput. J., 42, 270–283, <ext-link xlink:href="https://doi.org/10.1093/comjnl/42.4.270" ext-link-type="DOI">10.1093/comjnl/42.4.270</ext-link>, 1999.</mixed-citation></ref>
      <ref id="bib1.bib72"><label>72</label><mixed-citation>
Webb, A. R. and Copsey, K. D.: Statistical Pattern Recognition, John Wiley
&amp; Sons, UK, 2011.</mixed-citation></ref>
      <ref id="bib1.bib73"><label>73</label><mixed-citation>Winsemius, H. C., Van Beek, L. P. H., Jongman, B., Ward, P. J., and Bouwman,
A.: A framework for global river flood risk assessments, Hydrol. Earth Syst.
Sci., 17, 1871–1892, <ext-link xlink:href="https://doi.org/10.5194/hess-17-1871-2013" ext-link-type="DOI">10.5194/hess-17-1871-2013</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bib74"><label>74</label><mixed-citation>
Winsemius, H. C., Aerts, J. C., van Beek, L. P., Bierkens, M. F., Bouwman,
A., Jongman, B., Kwadijk, J. C., Ligtvoet, W., Lucas, P. L., and van Vuuren,
D. P.: Global drivers of future river flood risk, Nat. Clim. Change, 6,
381–385, 2015.</mixed-citation></ref>
      <ref id="bib1.bib75"><label>75</label><mixed-citation>Yu, J.: Localized Fisher discriminant analysis based complex chemical process
monitoring, AIChE J., 57, 1817–1828, 2011.</mixed-citation></ref><?xmltex \hack{\newpage}?>
      <ref id="bib1.bib76"><label>76</label><mixed-citation>
Yue, S., Ouarda, T., Bobée, B., Legendre, P., and Bruneau, P.: The Gumbel
mixed model for flood frequency analysis, J. Hydrol., 226, 88–100, 1999.</mixed-citation></ref>
      <ref id="bib1.bib77"><label>77</label><mixed-citation>Zhang, G., Mahfouf, M., Abdulkareem, M., Gaffour, S.-A., Yang, Y.-Y.,
Obajemu, O., Yates, J., Soberanis, S. A., and Pinna, C.: Hybrid-modelling of
compact tension energy in high strength pipeline steel using a Gaussian
Mixture Model based error compensation, Appl. Soft Comput., 48, 1–12,
<ext-link xlink:href="https://doi.org/10.1016/j.asoc.2016.06.007" ext-link-type="DOI">10.1016/j.asoc.2016.06.007</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bib78"><label>78</label><mixed-citation>Zhou, Z., Liu, S., Zhong, G., and Cai, Y.: Flood Disaster and Flood Control
Measurements in Shanghai, Nat. Hazards Rev., 18, B5016001,
<ext-link xlink:href="https://doi.org/10.1061/(ASCE)NH.1527-6996.0000213" ext-link-type="DOI">10.1061/(ASCE)NH.1527-6996.0000213</ext-link>, 2016.</mixed-citation></ref>

  </ref-list><app-group content-type="float"><app><title/>

    </app></app-group></back>
</article>
