Submitted as: model description paper
02 Aug 2022
Status: this preprint is currently under review for the journal GMD.

Prediction of algal blooms via data-driven machine learning models: An evaluation using data from a well monitored mesotrophic lake

Shuqi Lin1, Don Pierson1, and Jorrit Mesman1,2
  • 1Erken Laboratory and Limnology Department, Uppsala University, Uppsala, Sweden
  • 2Département F.-A. Forel des sciences de l’environnement et de l’eau, Université de Genève, Genève, Switzerland

Abstract. With the growing volume of lake monitoring data, data-driven machine learning (ML) models may be able to capture complex algal bloom dynamics that process-based (PB) models cannot fully describe. We applied two ML models, a Gradient Boost Regressor (GBR) and a Long Short-Term Memory (LSTM) network, to predict algal blooms and seasonal changes in algal chlorophyll concentrations (Chl) in a mesotrophic lake. Three predictive workflows were tested. The first was based solely on available measurements. The second applied a two-step approach, first estimating lake nutrients that have limited observations and then predicting Chl from observed and pre-generated environmental factors. The third, a hybrid workflow, added hydrodynamic data derived from a PB model as training features in the two-step ML approach. The ML models outperformed a PB model in predicting nutrients and Chl, and the hybrid model further improved the prediction of the timing and magnitude of algal blooms. A data sparsity test based on shuffling the order of training and testing years showed that the accuracy of the ML models decreased with increasing sample interval and that model performance varied with the combination of training and testing years.
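As an illustration only, the two-step idea described in the abstract can be sketched with scikit-learn's GradientBoostingRegressor. This is not the authors' code: the data are synthetic, and the driver variables, the 7-day nutrient sampling interval, the 400/200-day chronological split, and all variable names are assumptions made so that the script runs on its own.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data (invented for this sketch, not taken from the paper).
n_days = 600
drivers = rng.normal(size=(n_days, 3))  # hypothetical daily drivers
nutrient = 0.8 * drivers[:, 0] - 0.5 * drivers[:, 2] + rng.normal(scale=0.1, size=n_days)
chl = 1.5 * nutrient + 0.3 * drivers[:, 1] + rng.normal(scale=0.1, size=n_days)

# Assumption: nutrients are only observed every 7th day.
sampled = np.arange(0, n_days, 7)

# Chronological split: first 400 days for training, the rest for testing.
train = np.arange(n_days) < 400
test = ~train

# Step 1: fit a nutrient model on the sparse training-period samples,
# then use it to generate a daily nutrient series.
sampled_train = sampled[sampled < 400]
nutrient_model = GradientBoostingRegressor(random_state=0)
nutrient_model.fit(drivers[sampled_train], nutrient[sampled_train])
nutrient_daily = nutrient_model.predict(drivers)

# Step 2: predict Chl from the observed drivers plus the pre-generated nutrients.
features = np.column_stack([drivers, nutrient_daily])
chl_model = GradientBoostingRegressor(random_state=0)
chl_model.fit(features[train], chl[train])
chl_pred = chl_model.predict(features[test])

rmse = float(np.sqrt(np.mean((chl_pred - chl[test]) ** 2)))
print(f"test RMSE: {rmse:.3f}")
```

Under the same assumptions, changing the step in `np.arange(0, n_days, 7)` gives a rough feel for how a longer sampling interval affects the error, loosely in the spirit of the data sparsity test mentioned in the abstract.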

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • CEC1: 'Comment on gmd-2022-174', Juan Antonio Añel, 24 Aug 2022
    • AC1: 'Reply on CEC1', Shuqi Lin, 11 Oct 2022
  • RC1: 'Comment on gmd-2022-174', Anonymous Referee #1, 27 Aug 2022
    • AC2: 'Reply on RC1', Shuqi Lin, 11 Oct 2022
  • RC2: 'Comment on gmd-2022-174', Anonymous Referee #2, 06 Sep 2022
    • AC3: 'Reply on RC2', Shuqi Lin, 11 Oct 2022

Latest update: 29 Nov 2022
Short summary
The risks posed by the proliferation of algal blooms motivate the improvement of bloom forecasting tools, but algal blooms are subject to complex controls and are difficult to predict. Given the rapid growth of monitoring data and advances in computation, machine learning offers an alternative prediction methodology. This study tested various machine learning workflows in a dimictic mesotrophic lake and produced promising predictions of the seasonal variations and the timing of algal blooms.