Received: 18 Dec 2020 – Accepted for review: 22 Jan 2021 – Discussion started: 27 Jan 2021
Abstract. High-fidelity and large-scale hydrological models are increasingly used to investigate the impacts of human activities and climate change on water availability and quality. However, the detailed representations of real-world systems and processes contained in these models inevitably lead to high execution times, ranging from minutes to days. Applications that involve large numbers of iterative model simulations therefore become computationally prohibitive or even infeasible. In this study, we propose a generic two-layer model parallelization scheme that reduces the run time of computationally expensive model applications by combining spatial decomposition of the model with the graph-parallel Pregel algorithm. Taking the Soil and Water Assessment Tool (SWAT) as an example, we implemented a generic tool named GP-SWAT, which enables model-level and subbasin-level parallelization on a Spark computer cluster. We then evaluated GP-SWAT in two sets of experiments to demonstrate its potential to accelerate single and iterative model simulations and to run in different environments. In each test set, GP-SWAT was applied for the parallel simulation of eight synthetic hydrological models with different input/output (I/O) burdens and river network characteristics. The experimental results indicate that GP-SWAT can effectively address the high computational demands of the SWAT model. In addition, as a scalable and flexible tool, it can be run in diverse environments, from a commodity computer running the Microsoft Windows operating system to a Spark cluster consisting of a large number of computational nodes. Moreover, it is possible to apply this generic scheme to other subbasin-based hydrological models, or even to acyclic models in other domains, to alleviate I/O demands and optimize computational performance.
GP-SWAT is a two-layer model parallelization tool for the SWAT model based on the graph-parallel Pregel algorithm. It can be used to parallelize both individual and iterative model simulations, which gives it a wide range of possible applications and considerable flexibility in maximizing performance. As a flexible and scalable tool, it can run in diverse environments, ranging from a commodity computer with a Microsoft Windows, Mac, or Linux OS to a Spark cluster consisting of a large number of nodes.
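The subbasin-level layer of such a scheme can be illustrated with a minimal Python sketch. It is not the GP-SWAT implementation and uses neither Spark nor SWAT: a thread pool stands in for the cluster, and `simulate_subbasin`, `run_supersteps`, and the five-subbasin network are hypothetical names and data chosen for illustration. The sketch assumes an acyclic river network in which each subbasin drains to at most one downstream subbasin. In each Pregel-style superstep, every subbasin whose upstream inflows have all arrived is simulated in parallel, and its outflow is then sent as a message to its downstream neighbour.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_subbasin(local_runoff, upstream_inflow):
    # Hypothetical stand-in for one subbasin simulation: the outflow is
    # simply the local runoff plus everything received from upstream.
    return local_runoff + upstream_inflow


def run_supersteps(downstream, local_runoff, max_workers=4):
    """Simulate an acyclic river network level by level.

    downstream maps each subbasin id to its downstream subbasin id
    (None at the outlet). Returns the outflow of every subbasin and the
    list of subbasins that were active in each superstep.
    """
    # Number of upstream messages each subbasin still waits for.
    pending = {s: 0 for s in downstream}
    for s, d in downstream.items():
        if d is not None:
            pending[d] += 1

    inbox = {s: 0.0 for s in downstream}  # accumulated upstream inflow
    outflow = {}
    supersteps = []

    # Headwater subbasins have no upstream neighbours and start first.
    active = sorted(s for s, n in pending.items() if n == 0)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while active:
            supersteps.append(active)
            # Compute phase: all active subbasins run concurrently.
            results = list(pool.map(
                lambda s: simulate_subbasin(local_runoff[s], inbox[s]),
                active))
            # Message phase: pass each outflow to the downstream subbasin.
            next_active = []
            for s, q in zip(active, results):
                outflow[s] = q
                d = downstream[s]
                if d is not None:
                    inbox[d] += q
                    pending[d] -= 1
                    if pending[d] == 0:
                        next_active.append(d)
            active = sorted(next_active)
    return outflow, supersteps


if __name__ == "__main__":
    # Hypothetical network: 1 and 2 drain to 3; 3 and 4 drain to outlet 5.
    network = {1: 3, 2: 3, 3: 5, 4: 5, 5: None}
    runoff = {1: 1.0, 2: 2.0, 3: 0.5, 4: 1.5, 5: 1.0}
    flows, steps = run_supersteps(network, runoff)
    print(steps)
    print(flows)
```

In this sketch the number of supersteps equals the longest path from a headwater to the outlet, so the achievable speed-up depends on the shape of the river network. The model-level layer would sit on top of this, running many such simulations (for example, different parameter sets) concurrently.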