Preprints
https://doi.org/10.5194/gmd-2022-206
Submitted as: model evaluation paper | 21 Nov 2022
Status: a revised version of this preprint is currently under review for the journal GMD.

Modeling river water temperature with limiting forcing data: air2stream v1.0.0, machine learning and multiple regression

Manuel C. Almeida and Pedro S. Coelho

Abstract. The prediction of river water temperature (WT) is of key importance in the field of environmental science. Water temperature datasets for low-order rivers are often in short supply, leaving lake/reservoir water quality modelers with the challenge of extracting as much information as possible from existing datasets, usually without the use of physically based models because of the significant amount of data they require (e.g., river morphology, degree of shading, wind velocity). In this study, five models are used to predict the water temperature of 83 rivers (with 98 % missing data): three machine-learning (ML) algorithms (Random Forest, Artificial Neural Network and Support Vector Regression), the hybrid Air2stream model with all available parameterizations, and a Multiple Regression. The machine learning hyperparameters were optimized with a Tree-structured Parzen Estimators algorithm, and the results of each model are presented as an ensemble of 12 individual optimized model runs. The meteorological datasets were obtained from the fifth-generation atmospheric reanalysis, ERA5. In general terms, the results of the study demonstrate the vital importance of hyperparameter optimization and suggest that, from a practical modeling perspective, when the number of predictor variables and observed river WT values is limited, it is relevant to apply all of the models considered in this study (annual mean of the model ensemble: root mean square error (RMSE) 2.75 ± 1.00 °C; Nash-Sutcliffe efficiency (NSE) 0.56 ± 0.48). The model that performed best was Random Forest (annual mean: RMSE 3.18 ± 1.06 °C; NSE 0.52 ± 0.23). The results also revealed a logarithmic correlation between the RMSE of the predicted river WT and the watershed time of concentration. The RMSE increases by an average of 0.1 °C with a one-hour increase in the watershed time of concentration (watershed area: μ = 106 km²; σ = 153 km²).
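The two skill scores quoted in the abstract, RMSE and NSE, follow standard definitions. The sketch below is a minimal illustration of how they are typically computed for paired observed and predicted water temperatures. It is not the authors' code: the function names and the sample values are hypothetical, and it assumes that missing observations have already been dropped.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (here °C)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better
    than always predicting the mean of the observations."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - residual / variance


# Hypothetical water temperatures in °C, for illustration only.
observed_wt = [10.0, 12.0, 14.0]
predicted_wt = [11.0, 12.0, 13.0]

print(f"RMSE: {rmse(observed_wt, predicted_wt):.3f} °C")
print(f"NSE:  {nse(observed_wt, predicted_wt):.2f}")
```

RMSE is expressed in °C and penalizes large errors, whereas NSE is dimensionless and normalizes the error by the variance of the observations, which is why the two scores are usually reported together.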

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on gmd-2022-206', Anonymous Referee #1, 29 Dec 2022
    • AC1: 'Reply on RC1', Manuel Almeida, 18 Jan 2023
  • RC2: 'Comment on gmd-2022-206', Anonymous Referee #2, 13 Jan 2023
    • AC2: 'Reply on RC2', Manuel Almeida, 18 Jan 2023

Viewed

Total article views: 347 (including HTML, PDF, and XML)
  • HTML: 267
  • PDF: 63
  • XML: 17
  • Total: 347
  • BibTeX: 5
  • EndNote: 4
Views and downloads, including cumulative totals, calculated since 21 Nov 2022.

Viewed (geographical distribution)

Total article views: 340 (including HTML, PDF, and XML) Thereof 340 with geography defined and 0 with unknown origin.
Latest update: 25 Mar 2023
Short summary
Water temperature (WT) datasets for low-order rivers are commonly scarce. In this study, five different models are used to predict the WT of 83 rivers. Generally, the results show that hyperparameter optimization of the models is essential and that, to minimize the prediction error, it is relevant to apply all the models considered in this study. The results also show a logarithmic correlation between the error of the predicted river WT and the watershed time of concentration.