Nonparametric-based estimation method for river cross-sections

Abstract. Aerial surveying with unmanned aerial vehicles (UAVs) has been widely employed in river management and flood monitoring. One of the major processes in UAV aerial surveying for river applications is to demarcate the cross-section of a river. From the photo images of aerial surveying, a point cloud dataset can be extracted with the structure from motion (SfM) technique. To accurately demarcate the cross-section from the cloud points, an appropriate delineation technique is required to reproduce the characteristics of natural and manmade channels, including abrupt changes, bumps, and lined shapes, even though the basic shape of both natural and manmade channels is trapezoidal. Therefore, a nonparametric estimation technique, called the K-nearest neighbor local linear regression (KLR) model, was tested in the current study to demarcate the cross-section of a river with a point cloud dataset from aerial surveying. The proposed technique was tested with a simulated dataset based on trapezoidal channels and compared with the traditional polynomial regression model and another nonparametric technique, locally weighted scatterplot smoothing (LOWESS). Furthermore, the KLR model was applied to a real case study in the Migok-cheon stream, South Korea. The results indicate that the proposed KLR model can be a suitable alternative for demarcating the cross-section of a river with point cloud data from UAV aerial surveying because it reproduces the critical characteristics of natural and manmade channels, including abrupt changes and small bumps, as well as the overall trapezoidal shape.



... dense point cloud can be determined from the SfM. These point clouds are converted from an arbitrary coordinate system to a geographical coordinate system with camera position and focal length information or by associating reference points on the ground, called ground control points (GCPs), with known coordinates. A point cloud is a set of 3-dimensional points located in space. The 3D locations of a point cloud can be determined from a sensor by emitting pulses and calculating the locations from the position of the sensor and the pulse direction. Here, the sensor refers to a photogrammetry camera in the current study.

In UAV aerial surveying applications for river management and flood analysis, the demarcation of the cross-section of a river is the critical process. Accurate demarcation of the cross-section is mostly required to calculate the peak discharge and flow amount. However, the dense cloud point dataset obtained from UAV aerial surveying and the SfM technique mostly contains errors and does not provide direct cross-sectional information. An appropriate technique to demarcate the cross-section from the point cloud dataset therefore needs to be developed.

The demarcation of the cross-section in a river has mostly been made with a digital elevation model (DEM) in the literature (Gichamo et al., 2012; Petikas et al., 2020a, b; Pilotti, 2016 ...

Therefore, the current study proposes a demarcation technique for river cross-sections from the point clouds of UAV aerial surveying. A cross-section of a natural river contains abrupt changes and small bumps as well as smooth variations, even though it normally has a trapezoidal shape. The demarcation technique must reproduce these characteristics of natural rivers as well as the abrupt changes in a manmade channel. The proposed demarcation model was tested to determine whether it reproduces the characteristics of natural rivers and manmade channels.

With the point cloud data obtained from UAV aerial surveying and postprocessing, the river cross-section must be demarcated. Polynomial regression can be simply applied to the point data. However, the fixed functional form of a polynomial of a given order is too rigid for the highly varied shape of a cross-section. Therefore, a nonparametric regression approach is adopted in the current study, specifically K-nearest neighbor local regression. A detailed description of polynomial regression and the nonparametric regression model is given below.

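A polynomial regression model can be used when the relationship between a predictor ($x$) and a response variable ($y$) is nonlinear or curvilinear. The $k$th-order polynomial regression can be expressed as

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k + \varepsilon$$

where $\beta_0, \ldots, \beta_k$ are the regression coefficients and $\varepsilon$ is considered to be random noise with zero mean.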
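As a minimal sketch of the polynomial model above, the coefficients can be estimated by ordinary least squares. The helper names `fit_polynomial` and `predict_polynomial` and the sample data are hypothetical and serve only as an illustration; NumPy is assumed to be available.

```python
import numpy as np

def fit_polynomial(x, y, order):
    """Least-squares fit of y = b0 + b1*x + ... + bk*x**k; returns b0..bk."""
    # np.polyfit returns the highest power first, so reverse to get b0..bk.
    return np.polyfit(x, y, order)[::-1]

def predict_polynomial(coeffs, x):
    """Evaluate the fitted polynomial at x (coeffs ordered b0..bk)."""
    return np.polynomial.polynomial.polyval(x, coeffs)

# Hypothetical noise-free data generated from y = 1 + 2x - 0.5x^2.
x = np.linspace(0.0, 4.0, 21)
y = 1.0 + 2.0 * x - 0.5 * x**2
coeffs = fit_polynomial(x, y, order=2)
```

Because a single set of coefficients describes the whole cross-section, every observation influences the fit at every location, which is the rigidity that motivates the local approach.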

KNN-based Local Linear Regression (KLR)
Assume that the current condition of the predictor $x_t$ is given together with the observed data pairs $(x_i, y_i)$, for $i = 1, \ldots, n$, where $n$ is the number of data points (i.e., the selected cloud points). The number of neighbors ($k$) is also assumed to be known. The predictand $Y_t$ is estimated according to the following ... where $a$ is a multiplier and is a positive integer (i.e., 1, 2, 3, 4).

As noted, only a subset of the observations is employed rather than the whole observation dataset, unlike other regressions. For the point cloud dataset from UAV photography, the proposed approach is highly advantageous since enough neighboring data points are available and the fit at a target point should not be affected by points that are far away from it. This advantage is further discussed in the results section.
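The idea of fitting only the nearest observations can be sketched as follows. This is a generic illustration of k-nearest neighbor local linear regression with one predictor, not necessarily the exact estimator of the study: it assumes an unweighted least-squares line through the k nearest points and takes k as a user-supplied value. The function name `klr_predict` and the sample data are hypothetical.

```python
def klr_predict(x_obs, y_obs, x_target, k):
    """Estimate y at x_target from a line fitted to the k nearest points."""
    if not 2 <= k <= len(x_obs):
        raise ValueError("k must be between 2 and the number of observations")
    # 1. Rank observations by distance to the target and keep the k nearest.
    order = sorted(range(len(x_obs)), key=lambda i: abs(x_obs[i] - x_target))
    xs = [x_obs[i] for i in order[:k]]
    ys = [y_obs[i] for i in order[:k]]
    # 2. Ordinary least-squares line through the selected neighbors
    #    (assumption: equal weights for all neighbors).
    x_mean = sum(xs) / k
    y_mean = sum(ys) / k
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0.0:
        # All neighbors share one x value: fall back to their mean.
        return y_mean
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # 3. Evaluate the local line at the target location.
    return y_mean + slope * (x_target - x_mean)

# Hypothetical V-shaped profile y = |x - 5| sampled every 0.5 units.
x_obs = [0.5 * i for i in range(21)]
y_obs = [abs(x - 5.0) for x in x_obs]
estimate = klr_predict(x_obs, y_obs, x_target=2.0, k=5)
```

In this example the five neighbors of the target all lie on the left branch of the V, so the corner at x = 5 has no influence on the estimate.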

The performance of the KLR model in fitting the point cloud data for river cross-sections is tested with the simulated point cloud data.

A river cross-section is generally trapezoidal because of its maximum discharge and ease of construction (Chow, 1959). Therefore, a trapezoidal channel was assumed with a 4 m top at both sides and a 6 m base width as well as a 1:1 side slope with a 6 m height, as shown by the thick solid blue line of Figure 1. The channel points were assumed to be measured at 0.1 m intervals, for a total of 161 points. It is assumed that these points act as the cloud points that UAV cameras might capture in aerial surveying. The assumed cloud point dataset was generated based on the assumed 161 points (see the thick solid blue line in Figure 2), as follows:

$$Z = Y + \varepsilon$$

where $Y$ is the assumed points, and $\varepsilon \sim N(0, \sigma^2)$, i.e., normally distributed error. Note that the generated data ($Z$) are presented with red circles in Figure 1.

In the current study, $\sigma^2 = 0.2$ was used, which gave variability similar to that of the observed data after several values were tested. The magnitude of this error variance ($\sigma^2$) represents the differences in the photo locations for the same cloud point of the real ground location (i.e., $Y$ in this case). High variance indicates that the extracted point clouds include high errors, and vice versa.

In Figure 1 and Figure 2, the simulated data are presented with red circles. The number of simulated data points was chosen to be 2 times and 10 times the assumed 161 points of the assumed measured trapezoid line (i.e., 322 and 1610 points), as shown in Figure 1 and Figure 2, respectively. Note that the recommended overlap is 70-80% frontal and 60% side in general cases. With this overlap, each cross-section point might be captured approximately 10 times. Therefore, the number of simulated data points is set to 10 times the number of trapezoidal channel points (a total of 1610), as shown in Figure 2. Additionally, there are some portions in which this overlap might not be achieved. A point must be captured at least 2 times to become part of a point cloud, so 2 times the channel points were also tested, as shown in Figure 1.
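The simulation setup can be sketched as follows, under stated assumptions. The noise is assumed to be added to the elevation only, each channel point is assumed to be replicated a fixed number of times, and the function names, the breakpoint layout, and the random seed are hypothetical. The geometry arguments are read from the text above, and the resulting number of channel points follows from them; it is not meant to reproduce the 161 points of the study.

```python
import random

def trapezoid_profile(top=4.0, base=6.0, height=6.0, side_slope=1.0, step=0.1):
    """Return (xs, ys) sampled every `step` along a symmetric trapezoid."""
    run = height * side_slope  # horizontal length of each bank
    # Breakpoints: left top, left bank, base, right bank, right top.
    bx = [0.0, top, top + run, top + run + base,
          top + 2 * run + base, 2 * top + 2 * run + base]
    by = [height, height, 0.0, 0.0, height, height]
    n = int(round(bx[-1] / step)) + 1
    xs, ys = [], []
    for i in range(n):
        x = min(i * step, bx[-1])
        # Linear interpolation on the first segment that contains x.
        for j in range(len(bx) - 1):
            if x <= bx[j + 1]:
                t = (x - bx[j]) / (bx[j + 1] - bx[j])
                xs.append(x)
                ys.append(by[j] + t * (by[j + 1] - by[j]))
                break
    return xs, ys

def simulate_cloud(xs, ys, copies, variance=0.2, seed=0):
    """Z = Y + e with e ~ N(0, variance), repeated `copies` times per point."""
    rng = random.Random(seed)
    sigma = variance ** 0.5  # random.gauss expects a standard deviation
    return [(x, y + rng.gauss(0.0, sigma))
            for x, y in zip(xs, ys) for _ in range(copies)]

xs, ys = trapezoid_profile()
sparse_cloud = simulate_cloud(xs, ys, copies=2)   # minimal overlap case
dense_cloud = simulate_cloud(xs, ys, copies=10)   # typical overlap case
```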

In Figure 1 ... can be tested. However, the simulation study with the trapezoid channel that is similar to a real river cross-section shows that the presented KLR nonparametric model is suitable for demarcating the cross-section of a river. The major reason for the good performance is that the KLR model employs only the k-nearest neighbor observations. This approach might not be beneficial when an overall trend is needed and not enough observations are available. However, the point cloud data taken from UAV aerial surveying often provide a large enough number of points in the dataset. Furthermore, the shape of the cross-sections in a natural river is irregular, and abrupt changes can be easily observed. This feature can be captured only through fitting nearby observations. Therefore, the KLR model might be a suitable alternative for demarcating the cross-section of a river with the cloud point dataset.

Study Area and Data Acquisition
The study area is located in the Migok-cheon stream flowing through Hapcheon-gun, South Korea, as shown in Figure 7. The Migok-cheon stream has a length of 8.8 km and a watershed area of 13.9 km². The slope of the stream is approximately 1/50~1/400, and the study area has a slope of 1/350. This stream joins the Hwanggang River at its downstream end, and the Hwanggang River flows into the Nakdong River directly afterward; the Nakdong River is one of the four largest rivers in South Korea. Therefore, the Migok-cheon stream is highly affected by the water levels of the Hwanggang River and Nakdong River.

The point cloud data from UAV photography are presented with Transverse Mercator (TM) projection for x, y, and z. The TM projection is a conformal projection presented by Lambert in 1772. To demarcate a cross-section of a river, the point cloud data must be projected to a new coordinate system.

As an example in Figure 8 ...

The two tested sites in the Migok-cheon stream are presented in Figure 9. The overall produced point cloud dataset for the UAV surveying area is presented in the left panel of Figure 9, and the picture of the left panel consists of only the collected points. Site-1 is located in the middle of the study area, while Site-2 is in the upper part of the area. Since the area around Site-1 is located in the middle of the UAV surveying coverage, several images overlap and capture the same points.

Therefore, the number of points for demarcating a cross-section of the river might be sufficient to capture the detailed characteristics of the cross-section (see the top-right panel of Figure 9). In contrast, Site-2 is located at the upper part of the coverage area, and overlapping images might be limited, which indicates that the number of points available to capture a target cross-section is also limited. Furthermore, a part of the cross-sectional area can be missing due to technical and environmental limitations such as waterbodies and insufficient overlapping images. For example, there are some areas in which no cloud point data exist, as on the right side of Site-1. This site was intentionally selected to verify the model performance in such a case.
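The change from TM coordinates to a cross-section coordinate system, mentioned at the start of this section, can be illustrated with a short sketch. It is not the procedure of the study: it assumes that the cross-section is defined by two hypothetical end points in TM coordinates and that only points within a hypothetical buffer (`half_width`) on either side of that line are kept. Each retained point is mapped to its distance along the line (the station) and its elevation.

```python
import math

def project_to_section(points, start, end, half_width):
    """Map (x, y, z) points to sorted (station, z) pairs along start -> end."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("start and end must differ")
    ux, uy = dx / length, dy / length  # unit vector along the section line
    profile = []
    for x, y, z in points:
        rx, ry = x - start[0], y - start[1]
        station = rx * ux + ry * uy   # distance along the section line
        offset = -rx * uy + ry * ux   # signed distance from the line
        # Keep only points between the end points and inside the buffer.
        if 0.0 <= station <= length and abs(offset) <= half_width:
            profile.append((station, z))
    profile.sort()
    return profile

# Hypothetical points: two lie close to the line, one lies far from it.
cloud = [(1.0, 1.0, 10.0), (2.0, 2.1, 11.0), (5.0, 0.0, 12.0)]
profile = project_to_section(cloud, start=(0.0, 0.0), end=(4.0, 4.0),
                             half_width=0.5)
```

The resulting (station, elevation) pairs are the two-dimensional data to which a cross-section model can then be fitted.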

Demarcation of the selected cross-sections
The demarcated cross-sections for the selected sites (i.e., Site-1 and Site-2) are presented in Figure 10 and Figure 11, respectively. In Figure 10 ... The cross-section of Site-2 is presented in Figure 11 ...

The estimated area and perimeter, each as a function of the height, are presented in Figure 12 and Figure 13 for Site-1 and Site-2, respectively. The area of Site-1 increases exponentially with the height, while the perimeter increases at different rates over different ranges of height, as shown in Figure 12. As seen in Figure 10, the width of the cross-section increases as the height increases, and this feature produces the exponential increase in the area. At heights between 17.0 m and 18.5 m, the perimeter increases rapidly as the height increases, as shown in the lower panel of Figure 12, since the shape of the cross-section is rather flat in this range of heights, as shown in Figure 10. Elsewhere, the perimeter increases linearly except for a steeper increase between 21 m and 21.5 m, where the cross-section becomes wider. Note that the area and perimeter outside the banks are excluded, i.e., between 0 and approximately 10 m and over 37 m in the x-coordinate of Figure 11.

Features of the area and perimeter similar to those of Site-1 can be observed at Site-2, as shown in Figure 13. This exponential and S-shaped increase in the area and perimeter is a typical characteristic of the cross-sections of natural rivers and trapezoidal channels. The results for the area and perimeter show that the proposed KLR method can be a reasonable alternative for demarcating the cross-section of a river obtained from a point cloud dataset.
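Area and perimeter curves of this kind can be derived from a fitted cross-section as in the following sketch. It assumes that the fitted profile is given as ordered (x, z) points restricted to the reach between the banks and that the perimeter is the wetted length of the bed below a given water level. The function name and the test profile are hypothetical.

```python
import math

def area_and_perimeter(xs, zs, level):
    """Flow area and wetted perimeter below `level` for a profile (xs, zs)."""
    area = 0.0
    perimeter = 0.0
    for i in range(len(xs) - 1):
        x0, z0, x1, z1 = xs[i], zs[i], xs[i + 1], zs[i + 1]
        d0, d1 = level - z0, level - z1  # water depth at both segment ends
        if d0 <= 0.0 and d1 <= 0.0:
            continue  # segment is entirely dry
        if d0 < 0.0 or d1 < 0.0:
            # Segment crosses the water surface: clip it at the waterline.
            t = d0 / (d0 - d1)
            xc = x0 + t * (x1 - x0)
            if d0 < 0.0:
                x0, z0, d0 = xc, level, 0.0
            else:
                x1, z1, d1 = xc, level, 0.0
        area += 0.5 * (d0 + d1) * (x1 - x0)         # trapezoidal rule
        perimeter += math.hypot(x1 - x0, z1 - z0)   # wetted bed length
    return area, perimeter

# Hypothetical trapezoid: 6 m base, 1:1 banks, water level 3 m above the bed.
xs = [0.0, 6.0, 12.0, 18.0]
zs = [6.0, 0.0, 0.0, 6.0]
area, perimeter = area_and_perimeter(xs, zs, level=3.0)
```

For this trapezoid the area equals (6 + 12) / 2 × 3 = 27 m² and the perimeter equals 6 + 2 × 3√2 m, which provides a simple check of the computation.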

The current study applies a nonparametric fitting method, the KLR, to the point cloud data from UAV aerial surveying to demarcate the cross-section of a river. Unlike data in general fitting problems, the cross-section of a natural river generally contains sudden variations, angled shapes, and even bumps as well as linear parts, even though the overall shape of a river cross-section is trapezoidal. To accommodate all of those features of the natural cross-section, a highly flexible fitting model is required. Furthermore, the number of observed data points in the point cloud dataset is large enough. Therefore, the KLR model was chosen to fit the point cloud data for the cross-section. The results indicate that the tested KLR model can reproduce the critical characteristics of the cross-section of natural rivers with the point cloud data from UAV aerial surveying.

The major limitation of the point cloud data employed in the current study is that RGB photographs were employed and the vegetation inside the river could generate an obscure cross-...

Figure 9. Two tested sites in the Migok-cheon stream. Note that the right panels magnify the tested sites by showing the point clouds of the observed data taken from the UAV photographs. The aerial images were taken by the authors.