GLOBGM v1.0: a parallel implementation of a 30 arcsec PCR-GLOBWB-MODFLOW global-scale groundwater model
Abstract. We discuss the performance aspects of parallelizing our global-scale groundwater model at 30ʺ resolution (30 arcseconds; ~1 km at the equator) on large distributed-memory parallel clusters. This model, here referred to as the GLOBGM, is the successor of our two-layer 5ʹ (5 arcminutes; ~10 km at the equator) PCR-GLOBWB 2 groundwater model based on MODFLOW. The current version of the GLOBGM (v1.0) used in this study also has two model layers, is uncalibrated, and uses available 30ʺ PCR-GLOBWB data. Increasing the model resolution from 5ʹ to 30ʺ poses challenges for runtime, memory usage, and data storage that go beyond the capabilities of a single computer. We show that our parallelization tackles these problems with relatively modest parallel hardware requirements, suiting average users or modelers who do not have exclusive access to hundreds or thousands of nodes within a supercomputer.
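To make the scale of this refinement explicit (a worked step that follows directly from the two stated resolutions, not an additional result), going from 5ʹ to 30ʺ divides each horizontal cell dimension by a factor of 10 and thus multiplies the number of cells per model layer by roughly 100:

```latex
% Cell-count increase implied by refining the grid from 5' to 30''
\[
  \left(\frac{5'}{30''}\right)^{2}
  = \left(\frac{300''}{30''}\right)^{2}
  = 10^{2}
  = 100
\]
```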
For our simulation, we use unstructured grids and a prototype version of MODFLOW 6 that we have parallelized using the Message Passing Interface (MPI). We construct an unstructured grid with a total of 278 million active cells, excluding all redundant sea and land cells while satisfying all necessary boundary conditions, and distribute independent (sub)grids over three continental-scale models (Afro-Eurasia, 168 M; Americas, 77 M; Australia, 16 M) and one remainder model for the smaller islands (17 M). Each of the four groundwater models is partitioned into multiple non-overlapping submodels that are tightly coupled within the MODFLOW linear solver; each submodel is uniquely assigned to one processor core, and the associated submodel data are written in parallel during pre-processing using data tiles. To balance the parallel workload in advance, we apply the widely used METIS graph partitioner in two ways: straightforwardly to all model grid cells, and in an area-based manner to HydroBASINS catchments that are assigned to submodels, anticipating a future coupling with surface water. We consider an experiment simulating 1958–2015 with daily time steps and monthly input, including a 20-year spin-up, on the Dutch national supercomputer Snellius. Given that the serial simulation would require ~4.5 months of runtime, we set a hypothetical target of at most 16 hours of simulation runtime. We show that 12 nodes (32 cores per node; 384 cores in total) are sufficient to achieve this target, resulting in a speed-up of 138 for the largest Afro-Eurasia model using 7 nodes (224 cores) in parallel.
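As an illustration of the first, cell-based partitioning strategy, the sketch below builds the lateral-connection graph over the active cells of a structured mask and lets METIS split it into equally loaded parts while minimizing the edge cut (i.e., the coupling between submodels). This is a minimal sketch, not the GLOBGM pre-processing code: it assumes the third-party pymetis bindings, a single model layer, and a hypothetical boolean `active` mask as input.

```python
import numpy as np
import pymetis  # assumed third-party METIS bindings (pip install pymetis)

def partition_cells(active: np.ndarray, nparts: int) -> np.ndarray:
    """Assign each active cell of a 2-D mask to one of nparts submodels."""
    nrow, ncol = active.shape
    # Map each active cell to a consecutive graph-node id; -1 marks inactive.
    node_id = -np.ones((nrow, ncol), dtype=np.int64)
    node_id[active] = np.arange(active.sum())
    # Build the adjacency list over 4-neighbor lateral connections,
    # skipping inactive (sea or redundant land) cells entirely.
    adjacency = [[] for _ in range(int(active.sum()))]
    for i in range(nrow):
        for j in range(ncol):
            if not active[i, j]:
                continue
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < nrow and 0 <= nj < ncol and active[ni, nj]:
                    adjacency[int(node_id[i, j])].append(int(node_id[ni, nj]))
    # METIS balances the number of cells per part while minimizing the
    # edge cut, i.e., the number of tightly coupled inter-submodel faces.
    _, membership = pymetis.part_graph(nparts, adjacency=adjacency)
    return np.asarray(membership)

# Example: split a fully active 100 x 100 mask over 8 processor cores.
# parts = partition_cells(np.ones((100, 100), dtype=bool), 8)
```

The second, area-based strategy would instead partition HydroBASINS catchments rather than individual cells, weighting graph nodes by catchment area. For reference, the reported speed-up of 138 on 224 cores corresponds to a parallel efficiency of roughly 62 % (138/224).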
We conducted a limited evaluation of the model output using NWIS head observations for the contiguous United States, showing that increasing the resolution from 5ʹ to 30ʺ yields a significant improvement with the GLOBGM for the steady-state simulation compared to the 5ʹ PCR-GLOBWB groundwater model. Results for the transient simulation, however, are quite similar, and there is much room for improvement. For future versions of the GLOBGM, such improvement requires a more detailed hydrogeological schematization and better information on the locations, depths, and pumping regimes of abstraction wells.