GLOBGM v1.0: a parallel implementation of a 30 arcsec PCR-GLOBWB-MODFLOW global-scale groundwater model
Abstract. We discuss the various performance aspects of parallelizing our global-scale groundwater model at 30ʺ resolution (30 arcseconds; ~1 km at the equator) on large distributed memory parallel clusters. This model, here referred to as the GLOBGM, is the successor of our 5ʹ (5 arcminutes; ~10 km at the equator) PCR-GLOBWB 2 groundwater model, which is based on MODFLOW and has two model layers. The current version of the GLOBGM (v1.0) used in this study also has two model layers, is uncalibrated, and uses available 30ʺ PCR-GLOBWB data. Increasing the model resolution from 5ʹ to 30ʺ poses challenges in runtime, memory usage and data storage that go beyond the capabilities of a single computer. We show that our parallelization tackles these problems with relatively modest parallel hardware requirements, suited to average users/modelers who do not have exclusive access to hundreds or thousands of nodes within a supercomputer.
For our simulation we use unstructured grids and a prototype version of MODFLOW 6 that we have parallelized using the Message Passing Interface (MPI). We construct an unstructured grid having a total of 278 million active cells to cancel all redundant sea and land cells, while satisfying all necessary boundary conditions, and distribute independent (sub)grids over three continental-scale models (Afro-Eurasia, 168 M; Americas, 77 M; and Australia, 16 M) and one remainder model for the smaller islands (17 M). Each of the four groundwater models is partitioned into multiple non-overlapping submodels that are tightly coupled within the MODFLOW linear solver, where each submodel is uniquely assigned to one processor core and associated submodel data is written in parallel during the pre-processing using data tiles. For balancing the parallel workload in advance, we apply the widely used METIS graph partitioner in two ways: straightforwardly applied to all model grid cells, and area-based applied to HydroBASINS catchments that are assigned to submodels for pre-sorting to a future coupling with surface water. We consider an experiment for simulating 1958–2015 with daily timesteps and monthly input, including a 20-year spin-up, on the Dutch national supercomputer Snellius. Given that the serial simulation would require ~4.5 months of runtime, we set a hypothetical target of a maximum of 16 hours of simulation runtime. We show that 12 nodes (32 cores per node, 384 cores in total) are sufficient to achieve this target, resulting in a speed-up of 138 for the largest Afro-Eurasia model using 7 nodes (224 cores) in parallel.
A limited evaluation of the model output using NWIS head observations for the contiguous United States was conducted, showing that increasing the resolution from 5ʹ to 30ʺ results in a significant improvement with GLOBGM for the steady-state simulation compared to the 5ʹ PCR-GLOBWB groundwater model. However, results for the transient simulation are quite similar and there is much room for improvement. For future versions of the GLOBGM, further improvements require a more detailed hydrogeological schematization and better information on the locations, depths and regime of abstraction wells.
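As a rough check on the hardware figures quoted in the abstract, the following minimal Python sketch recomputes the core counts, the share of active cells per model, and a parallel efficiency for the Afro-Eurasia run. It assumes the common definition of parallel efficiency as speed-up divided by core count, which the abstract itself does not state; the function and variable names are hypothetical and chosen only for illustration.

```python
# Figures taken from the abstract; everything else is an illustrative assumption.
CORES_PER_NODE = 32

# Active cells per model, in millions, as listed in the abstract.
ACTIVE_CELLS_M = {
    "Afro-Eurasia": 168,
    "Americas": 77,
    "Australia": 16,
    "Islands": 17,
}


def total_cores(nodes, cores_per_node=CORES_PER_NODE):
    """Number of cores available on a given number of nodes."""
    return nodes * cores_per_node


def parallel_efficiency(speedup, cores):
    """Assumed definition: speed-up divided by the number of cores used."""
    return speedup / cores


def cell_shares(cells):
    """Fraction of all active cells held by each model."""
    total = sum(cells.values())
    return {name: count / total for name, count in cells.items()}


if __name__ == "__main__":
    print("total cores on 12 nodes:", total_cores(12))
    afro_cores = total_cores(7)
    print("Afro-Eurasia cores:", afro_cores)
    print("Afro-Eurasia efficiency:", round(parallel_efficiency(138, afro_cores), 2))
    for name, share in cell_shares(ACTIVE_CELLS_M).items():
        print(f"{name}: {share:.1%} of active cells")
```

Under this assumed definition, a speed-up of 138 on 224 cores corresponds to an efficiency of roughly 0.62, and the four per-model cell counts add up to the stated total of 278 million.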
Jarno Verkaik et al.
Status: final response (author comments only)
RC1: 'Comment on gmd-2022-226', Anonymous Referee #1, 01 Apr 2023
- AC1: 'Reply on RC1', Jarno Verkaik, 07 Apr 2023
RC2: 'Comment on gmd-2022-226', Anonymous Referee #2, 08 Apr 2023
- AC2: 'Reply on RC2', Jarno Verkaik, 14 Apr 2023
- AC3: 'Comment on gmd-2022-226', Jarno Verkaik, 18 Apr 2023
The paper is well written and interesting from a computational perspective. I do not see any problems with the technical steps. I am nevertheless not supportive of the paper and the underlying philosophy. These papers do great harm to the credibility of our community, as they suggest that useful results can be extracted with the current state of global-scale models, which is not the case. While the authors do mention that there is a lot of room for improvement and propose a few ways to advance the model, the uncertainties of the model are so large that none of these results can be used for exploring hydrological processes or water resources management, no matter what scale you look at. As such, I believe the authors should be clear on what the goal of the model actually is. With its current capabilities, the model can only serve as a test case of the parallel implementation to explore runtimes or the efficiency of parallelization. It should be obvious to every reader that the results cannot and should not be used in any other way. This point does not come across at all, and therefore I believe the paper is implicitly but greatly overselling the results.
Below I list some of the reasons why the robustness of the model is so poor that the results are not useful for any kind of application. For example, the steady state comparison with Fan, CMG GGM in Figure 12: The comparison shows that all these results are inconsistent. The authors mention an improvement to GGM, but all of these products are highly uncertain themselves. Even though it is a lot of work, it is essential to compare the simulations to real data. Such data are available, see for example https://www.nature.com/articles/s41586-021-03311-x. I can see on page 14, section 2.4.2, that measured heads are somehow used for the transient simulations. But the section 426-430 is very hard to read. It is a very long list of how the authors worked around data gaps, with no explanation of why they did so, and there is no assessment of the extent to which the chosen steps are reliable. To be scientifically sound, every step has to be explained, and it has to be demonstrated how, where and when the assumptions hold up. It is crucial, and a basic scientific principle, to root any model in reality. Model intercomparison is interesting at most, but insufficient. For continents where no transient head data are available, a comparison with changes of the water table obtained through GRACE could be informative in the context of Figure 14. Such a comparison will clearly show that the model cannot reproduce the decline of the water table on a global scale.
The authors acknowledge many areas of improvement but, to highlight this point again, fall short of clearly declaring that none of the results should be used in any context other than a software development framework. The list of issues is incomplete at best; here are some examples to add. The only reliable data source is topography, but through the rough discretization a lot of information is lost, especially in steep terrain. The global geological products are speculative at best and a significant source of uncertainty. There are countless conceptual problems. Groundwater abstraction or rivers cannot be reliably simulated at these spatial resolutions in the MODFLOW conceptualization. The wells and rivers are cell-centered, making a robust simulation of drawdown cones or mounds extremely uncertain. The associated temporal dynamics of the water table decline cannot be captured at these resolutions. As the authors are presenting a transient model, this is a fundamental problem. Also, a large part of the world is karstic, which is not reflected in the global geological model and the conceptualization of MODFLOW. These areas should be fully blanked out; global maps of karst are available. Note that fractured systems are also treated the same as porous aquifers, and there is no mention of the conceptual incompatibility of this approach. Moreover, all areas with permafrost or snow cannot be simulated either with the current conceptualization and should be blanked out as well. The list could be endlessly expanded, and this assessment should have been done before the submission of the paper.
We can see that these issues fully undermine the robustness of the model by simply looking at some of the results. The model only seems to report a groundwater decline. This is not always the case. Another obvious indication that the model cannot produce results which are even remotely similar to reality is that in Figure 14a there are no hydraulic heads above the surface. But this is the case for many confined aquifers; take the Great Artesian Basin in Australia or the Nubian systems, for example. The large depth to groundwater in all mountainous regions further shows that the model results have nothing to do with reality. You can easily see this by consulting the measured hydraulic heads as I suggested above, or by taking a healthy hike in the mountains and appreciating the countless small rivers and streams emerging at high altitudes. Again, this list could be endlessly expanded.
Little is said about the water balances of the catchments, which is one of the basic requirements for assessing a model's performance.
Given that there is no solid verification of the results and no clear statement that the model cannot be used for any kind of application, I do not support the publication of this work. To make it publishable, the following steps should be done: