Articles | Volume 15, issue 7
https://doi.org/10.5194/gmd-15-3021-2022
© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.
AI4Water v1.0: an open-source python package for modeling hydrological time series using data-driven methods
Ather Abbas
Urban and Environmental Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
Laurie Boithias
Géosciences Environnement Toulouse, Université de Toulouse, CNRS, IRD, UPS, 31400 Toulouse, France
Yakov Pachepsky
Environmental Microbial and Food Safety Laboratory, USDA-ARS, Beltsville, MD, USA
Kyunghyun Kim
Watershed and Total Load Management Research Division, National Institute of Environmental Research, Hwangyeong-ro 42, Seogu, Incheon 22689, Republic of Korea
Climate Research Department, APEC Climate Center, Busan, Republic of Korea
Kyung Hwa Cho
CORRESPONDING AUTHOR
Urban and Environmental Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
Related authors
Ather Abbas, Yuan Yang, Ming Pan, Yves Tramblay, Chaopeng Shen, Haoyu Ji, Solomon H. Gebrechorkos, Florian Pappenberger, Jong Cheol Pyo, Dapeng Feng, George Huffman, Phu Nguyen, Christian Massari, Luca Brocca, Tan Jackson, and Hylke E. Beck
EGUsphere, https://doi.org/10.5194/egusphere-2024-4194, 2025
Short summary
Our study evaluated 23 precipitation datasets using a hydrological model at global scale to assess their suitability and accuracy. We found that MSWEP V2.8 excels due to its ability to integrate data from multiple sources, while others, such as IMERG and JRA-3Q, demonstrated strong regional performances. This research assists in selecting the appropriate dataset for applications in water resource management, hazard assessment, agriculture, and environmental monitoring.
Daeha Kim, Minha Choi, and Jong Ahn Chun
Hydrol. Earth Syst. Sci., 26, 5955–5969, https://doi.org/10.5194/hess-26-5955-2022, 2022
Short summary
We proposed a practical method that predicts the evaporation rates on land surfaces (ET) where only atmospheric data are available. Using a traditional equation that describes partitioning of precipitation into ET and streamflow, we could approximately identify the key parameter of the predicting formulation based on land–atmosphere interactions. The simple method conditioned by local climates outperformed sophisticated models in reproducing water-balance estimates across Australia.
Laurie Boithias, Olivier Ribolzi, Emma Rochelle-Newall, Chanthanousone Thammahacksa, Paty Nakhle, Bounsamay Soulileuth, Anne Pando-Bahuon, Keooudone Latsachack, Norbert Silvera, Phabvilay Sounyafong, Khampaseuth Xayyathip, Rosalie Zimmermann, Sayaphet Rattanavong, Priscia Oliva, Thomas Pommier, Olivier Evrard, Sylvain Huon, Jean Causse, Thierry Henry-des-Tureaux, Oloth Sengtaheuanghoung, Nivong Sipaseuth, and Alain Pierret
Earth Syst. Sci. Data, 14, 2883–2894, https://doi.org/10.5194/essd-14-2883-2022, 2022
Short summary
Fecal pathogens in surface waters may threaten human health, especially in developing countries. The Escherichia coli (E. coli) database is organized in three datasets and includes 1602 records from 31 sampling stations located within the Mekong River basin in Lao PDR. Data have been used to identify the drivers of E. coli dissemination across tropical catchments, including during floods. Data may be further used to interpret new variables or to map the health risk posed by fecal pathogens.
Ather Abbas, Sangsoo Baek, Norbert Silvera, Bounsamay Soulileuth, Yakov Pachepsky, Olivier Ribolzi, Laurie Boithias, and Kyung Hwa Cho
Hydrol. Earth Syst. Sci., 25, 6185–6202, https://doi.org/10.5194/hess-25-6185-2021, 2021
Short summary
Correct estimation of fecal indicator bacteria in surface waters is critical for public health. Process-driven models and recently data-driven models have been applied for water quality modeling; however, a systematic comparison for simulation of E. coli is missing in the literature. We compared performance of process-driven (HSPF) and data-driven (LSTM) models for E. coli simulation. We show that LSTM can be an alternative to process-driven models for estimation of E. coli in surface waters.
Daeha Kim, Minha Choi, and Jong Ahn Chun
Hydrol. Earth Syst. Sci. Discuss., https://doi.org/10.5194/hess-2021-126, 2021
Revised manuscript not accepted
Short summary
This work evaluates a convenient operational method to simulate evaporation over dry land surfaces across Australia. Although the chosen method, which is based on the responsive behavior of atmospheric water demand, outperformed commonly used sophisticated models in predicting evaporation in the United States and China, it performed poorly in some wet river basins of Australia. Its performance was nevertheless good under (semi-)arid climates.
Short summary
Artificial intelligence has shown promising results in a wide variety of fields, including hydrological modeling. However, developing and testing hydrological models with artificial intelligence techniques requires expertise from diverse fields. In this study, we developed an open-source framework based on the Python programming language to simplify the development of hydrological models of time series data using machine learning.