GSFLOW – GRASS v 1 . 0 . 0 : GIS-enabled hydrologic modeling of coupled groundwater – surface-water systems

The importance of water moving between the atmosphere and aquifers has led to efforts to develop and maintain coupled models of surface water and groundwater. However, developing inputs to these models is usually time-consuming and requires extensive knowledge of software engineering, often prohibiting their use by many researchers and water managers, and thus reducing these models’ potential to promote science-driven decision-making in an era of global change and increasing water5 resource stress. In response to this need, we have developed GSFLOW–GRASS, a bundled set of open-source tools that develops inputs for, executes, and graphically displays the results of GSFLOW, the U.S. Geological Survey’s coupled groundwater and surface-water flow model. In order to create a robust tool that can be widely implemented over diverse hydro(geo)logic settings, we built a series of GRASS GIS extensions that automatically discretizes a topological surface-water flow network that is linked with an underlying gridded groundwater domain. As inputs, GSFLOW–GRASS requires at a minimum a digital 10 elevation model, a precipitation and temperature record, and estimates of channel parameters and hydraulic conductivity. We demonstrate the broad applicability of the toolbox by successfully testing it in environments with varying degrees of drainage integration, landscape relief, and grid resolution, as well as the presence of irregular coastal boundaries. These examples also show how GSFLOW–GRASS can be implemented to examine the role of groundwater–surface-water interactions in a diverse range of water resources and land management applications. 15


Introduction
Predicting and understanding the hydrologic impacts of climate, land use, and other natural and anthropogenic change is a scientific endeavor that is increasingly necessary to manage water resources.Addressing this need requires streamlined access to models that integrate surface and subsurface processes across a watershed.This integrated approach is required because traditional hydrologic models that focus only on a single component within a watershed cannot properly predict the effects of changing conditions and feedbacks across their boundaries.The widespread use of integrated models is stymied, however, by labor-intensive requirements for creating consistent sets of extensive model inputs, including the challenges of generating computationally robust surface and subsurface model domains.
Driven by the growing recognition of tightly coupled groundwater and surface-water dynamics and the need to evaluate and manage the two as a single resource (Winter et al., 1998), the U.S. Geological Survey (USGS) developed and released GSFLOW.This integrated hydrologic model couples the groundwater flow model MOD-FLOW with the rainfall-runoff model PRMS (Precipitation Runoff Modeling System) (Markstrom et al., 2008).Both Published by Copernicus Publications on behalf of the European Geosciences Union.
The process of implementing GSFLOW includes many hurdles that require significant time and computational knowledge to overcome.GSFLOW is not "fully integrated" in the sense that it does not simultaneously solve surface and subsurface flow equations; instead it consists of an iterative coupling between MODFLOW and PRMS that requires nearly all the individual input files for each of the two original models as well as an additional GSFLOWspecific linkage file.While a fully integrated model may have all the input information streamlined into a small number of internally consistent and efficiently organized files, to run GSFLOW, the user bears the burden of generating a multitude of diversely formatted ASCII files and ensuring that they contain inputs that correctly correspond with each other and can produce convergent coupled simulations.Freely available USGS GUIs -ModelMuse (Winston, 2009) and the PRMS GUI (Markstrom et al., 2015) -and proprietary GUIs (mostly for MODFLOW) can help users separately develop inputs to the two individual base models, but they do not offer support for creating the GSFLOW linkage file.The company Earthfx (http://www.earthfx.com/lastaccess: 22 November 2018) provides full GSFLOW support as part of their VIEWLOG package designed for the environmental consulting industry.More openly accessible software endeavors have also improved the usability of integrated hydrologic models (Bhatt et al., 2014;Tian et al., 2016;Gardner et al., 2018), including the USGS's new input data-processing tool for GSFLOW (Gardner et al., 2018), but the community still lacks a free and complete package spanning preprocessing to post-processing for heterogeneous surface and subsurface domains.This gap motivates our present work, which we anticipate will enable more widespread hydrologic modeling.
Our overarching goal is to develop a bundled package -GSFLOW-GRASS -to handle the complexity of the coupled GSFLOW model, thus tackling the grand challenge of accessibility plaguing many integrated modeling systems.We develop an integrated toolbox featuring fully automated, robust, and open-source codes that cover the entire model implementation process within a consistent and efficient framework, from building topologically linked hydrologic domains and assembling model input parameters to visualizing model outputs.Our use of only free and open-source programming languages and software is a key feature of the toolbox's accessibility.Python scripts generate model input files and model output graphics, and extensions using the open-source GRASS GIS platform build topographically defined sub-watersheds linked to subsurface grid cells.Opensource software facilitates the implementation of GSFLOW-GRASS by diverse academic, government, and individual entities, enables further community development of GSFLOW-GRASS, and aligns with the USGS's goal to make its resources publicly accessible.
Developing a fully automated toolbox that can be readily executed for diverse physical settings raises the key technical obstacle of how to robustly build stream networks and subbasins linked to subsurface computational domains without labor-intensive user intervention.Whereas overland flow routing and the calculation of drainage basins from topography are standard GIS capabilities, our tool improves upon these by automatically building topologically structured vectorized drainage networks without manual corrections using a least-cost path approach (Metz et al., 2011), while also including information on adjacency and routing pathways through the network that is required by integrated hydrologic models.The main technical advancement of GSFLOW-GRASS is the development of streamlined GRASS GIS extensions that have passed a diverse range of stress tests, including steep to low-relief topographies, large and intricate to small and simple drainage systems, incomplete to full topographic drainage integration, and inland to coastal watersheds.These new capabilities enable rapid, automated delineation of surface-water drainage networks linked to subsurface domains across any generalized landscape and computationally feasible resolution within the range of scales permissible by GSFLOW.By doing this all within a framework that also includes open-source model input and post-processing tools, GSFLOW-GRASS presents a solution toward more accessible integrated hydrologic modeling.
The release of GSFLOW-GRASS coincides with the USGS's new development of GSFLOW-Arcpy, an input data-processing tool similarly aimed at facilitating the use of GSFLOW (Gardner et al., 2018).The two software packages solve the problem of generating linked surfacewater and groundwater model domains using complementary technical formulations and approaches.The major distinction is GSFLOW-GRASS's use of a topographically determined surface-water discretization, while GSFLOW-Arcpy employs a regular rectangular domain.Each domain type has its strengths: regular grids allow for a flexible representation of spatially heterogeneous properties, while topographically based units can efficiently cover steep and complex terrain.The availability of these two packages will facilitate rigorous testing of the comparative merits of the different domain types, which remains an open question in hydrologic modeling (see model intercomparison tests that include representatives of each; e.g., Reed et al., 2004;Maxwell et al., 2014).Together, these software toolboxes can also reach more users with different needs and resources.GSFLOW-Arcpy's implementation is well suited for ArcGIS users who can separately leverage existing MODFLOW tools and are primarily seeking to create surface-water input components for GSFLOW.GSFLOW-GRASS aims to provide a fully automated toolbox that can be accessible to new hydrologic modelers who lack access to commercial ArcGIS software and would benefit from a complete suite of preprocessing and post-processing tools to execute GSFLOW.Offering different model implementation options affords greater flexibility depending on a user's software preferences, data formats, and application requirements, which we anticipate will help grow a diverse community of integrated hydrologic model users.

GSFLOW
GSFLOW simulates spatially distributed surface to subsurface water flow in a watershed using modified model codes from PRMS and MODFLOW.It is designed for simulations of watersheds with areas of a few square kilometers to several thousand square kilometers (Markstrom et al., 2008, p. 2).Although GSFLOW can run in modes equivalent to the stand-alone PRMS-IV model and the stand-alone MOD-FLOW model, only the "integrated" version is described here.Near-surface watershed processes within the shallow soil zone, including evapotranspiration, infiltration, runoff, and interflow, are represented by the PRMS subcomponent of GSFLOW.Groundwater flow below the "soil zone", including vertical soil water movement in the deeper unsaturated zone and saturated flow through horizontal aquifer layers, is represented by the MODFLOW subcomponent.Streamflow and exchange between streams and underlying groundwater systems are also represented by the MODFLOW subcomponent.We describe here the key features of GSFLOW in order to guide new users in implementing it and interpreting its results; Markstrom et al. (2008) document the full details of the model.

Domain discretization
GSFLOW adopts a hybrid spatial domain discretization approach (Fig. 1) to establish its computational units.Stream segments are links in a river network that are used in both the PRMS and MODFLOW subcomponents of GS-FLOW (Fig. 1a).Horizontally, the PRMS subcomponent uses hydrological response units (HRUs) of any shape as its fundamental discretized unit (Fig. 1b).These are used for calculations of the upper soil zone and the part of the surface not covered by the stream network.The MOD-FLOW subcomponent uses rectangular grid cells for the deeper subsurface (Fig. 1c) and to further discretize the stream network into reaches (Fig. 1d).Establishing reaches as the fundamental unit of computation for the stream network instead of segments makes it possible to resolve fine-spatial-resolution groundwater-surface exchanges.Like (a) Each segment is one link in the network.At each node, two tributary segments combine to flow into a single segment.Each is numbered.They need not be in any particular order, as indicated, but a downstream-increasing numbering scheme is required for updated inflows to all segments to be computed during the same iteration.(b) Flow in each of the subbasin HRUs is routed directly to a corresponding stream segment.The arrow on the upper left indicates that flow from outside of the representative tributary junction may also be part of the drainage network.Our topological approach to defining HRUs allows HRUs to be numbered the same as the stream segments that they enclose.Our code is written in such a way that future developments can relax this symmetry.(c) MOD-FLOW operates on a grid that underlies the PRMS-based stream network and HRUs; each cell has a unique ID that is sequentially numbered.(d) "Gravity reservoirs" are defined by the intersection of the PRMS HRUs and the MODFLOW grid."Reaches" are defined as the section of each PRMS stream segment that lies within a single MODFLOW grid cell and are numbered sequentially downstream as shown.
MODFLOW grid cells, HRUs can be set to rectangles (Gardner et al., 2018), but they are also commonly defined topologically to correspond to subbasins, as they are in our approach (Fig. 1).In general, gridded domains are easier to construct and extend readily to parallelized computational systems, and they allow flexible spatial specification of soil and land cover heterogeneity.In contrast, ungridded domains, such as the triangulated irregular networks (TINs) used in models including tRIBS (Vivoni et al., 2004) and PIHM (Qu and Duffy, 2007), can conform more efficiently to complex terrain.In the case of PIHM (Qu and Duffy, 2007), TINs were also implemented for better water balance performance through the mass-conserving finite-volume method (LeVeque, 2002); further, nested TINs can provide efficient solutions when higher resolution is desired for certain target areas (Wang et al., 2018).Other hydrological models with un-

Preferential flow reservoir
Preferential flow threshold Saturation threshold

Field capacity threshold
Wilting threshold

Capillary reservoir To MODFLOW
. (Adapted from Markstrom et al., 2008, Fig. 12.) gridded domains use topographically defined subbasins as efficient computational units, including SWAT (Arnold and Fohrer, 2005), SAC-SMA (Ajami et al., 2004), HEC-HMS (Feldman, 2000), and TOPNET (Bandaragoda et al., 2004).Vertically, the PRMS subcomponent of GSFLOW is discretized into conceptual shallow soil zone reservoirs, which do not correspond directly to physical locations within the soil column but are instead based on user-specified conceptual thresholds.Specifically, within an HRU, the soil zone is subdivided into three reservoir types -the capillary reservoir, gravity reservoir, and preferential flow reservoir, which are filled in order of increasing water storage using efficient water-accounting calculations (Sect.2.1.2) (Fig. 2).Underlying the PRMS soil zone are MODFLOW grid cells representing the deeper unsaturated zone and the saturated zone.While grid cells have uniform horizontal discretization, vertical layer thicknesses can be variable in order to accommodate different hydrostratigraphy.To link the PRMS and MOD-FLOW grids, the user must define gravity reservoirs at each different intersection of an HRU and a grid cell (Fig. 1d).The MODFLOW component of GSFLOW also relies on a userspecified stream network; stream segments represent tributaries, and the intersection of a stream segment with MOD-FLOW grid cells defines stream reaches (Fig. 1a, d).
GSFLOW uses a daily computational time step for both the PRMS component and MODFLOW component.Flows are exchanged between each component at each time step.Multiple MODFLOW "stress periods" can be invoked to represent different subsurface boundary conditions within a simulation period, but their lengths must be integer days.

Process description
This section includes a brief description of the main hydrologic processes represented in GSFLOW, with select parameters listed in Table 1.Full details can be found in the GSFLOW manual (Markstrom et al., 2008).In particular, Table 1 from Markstrom et al. (2008) summarizes all the surface-water processes captured by PRMS modules, groundwater processes captured by MODFLOW stress packages, and model coupling procedures captured by GSFLOW.
The PRMS component of GSFLOW includes modules that can convert commonly available climate data into complete forcing inputs needed for model simulations.These include methods for determining potential solar radiation, potential evapotranspiration, and snow accumulation or depletion; they also include different algorithms for spatially distributing data from one or a few observations points over the entire watershed.
For unsaturated zone flow, PRMS does not implement the Richards equation but instead applies computationally fast soil-water routing calculations to determine inputs and outputs for each HRU as well as exchanges among the three conceptual reservoir types within an HRU (GSFLOW manual, Fig. 19, Table 9).The "capillary zone" reservoir represents water held by capillary forces; it receives water through infiltration (based on parameter pref_flow_den) and loses water through evaporation and transpiration (based on parameters soil_moist_max, soil_rechr_max, and soil_type).After reaching field capacity (parameter soil_moist_max), water transfers from the capillary zone to "gravity reservoirs", where water can flow horizontally as slow interflow (based on parameters slowcoef_lin and slowcoef_sq) or drain vertically into the deeper subsurface domain that is handled by MODFLOW (based on parameters ssr2gw_rate, ssr2gw_exp, and ssrmax_coef).Gravity reservoirs can also receive groundwater discharge from the MODFLOW component when hydraulic head values exceed the lower limit of the soil zone.A fraction of gravity reservoir storage moves to the "preferential flow reservoir" (based on parameters pref_flow_den and sat_threshold), where fast interflow occurs (based on parameters fastcoef_lin and fastcoef_sq).If the preferential flow reservoir becomes full (based on parameter sat_threshold), then water exits the soil zone as Dunnian (saturation excess) runoff.Hortonian (infiltration excess) runoff calculations apply for impervious fractions of HRUs (set by parameter hru_percent_imperv).Surface runoff and interflow are routed between HRUs using a cascading flow scheme that follows user-specified indexing of linked HRUs and eventually reaches the stream network.
The MODFLOW component of GSFLOW computes water flow in the deeper unsaturated zone (UZF stress package), streams (SFR package), and saturated groundwater units (BCF, LPF, or UPW flow packages).Unsaturated zone flow is calculated using a kinematic wave approach, which assumes that capillary (pressure gradient) flow is negligi- Streamflow, as calculated by the MODFLOW component, includes inputs from upstream reaches, surface runoff and interflow from the PRMS component, base flow from the saturated zone discharge, and flows from possible underlying unsaturated areas.Outputs include flow to downstream reaches, leakage to groundwater, and flows to possible underlying unsaturated areas.Discharge across the streambed follows Darcy's law with specified streambed hydraulic properties.Five different options exist for stream discharge and head computations (parameter ICALC).The user can specify stream depths for each reach, apply Manning's equation to an assumed wide rectangular channel, apply Manning's equation for an eight-point-based channel and floodplain geometry, apply at-a-station power-law relationships between discharge, flow width, and flow depth (Leopold and Maddock, 1953), or specify an input lookup table of hydraulic geometries for each segment.Streamflow can be simulated as either steady-state flow (parameter IRTFLG = 0), whereby outflow to the next stream reach balances inputs, or as transient flow (parameter IRTFLG > 0) using a kinematic wave formulation for surface-water routing in channels, which applies the assumption that the water surface slope approximates the friction slope and therefore negates backwater effects.Some modifications were made to the original stand-alone PRMS and MODFLOW codes for their use in GSFLOW.Notably, the soil zone structure of PRMS was significantly altered to facilitate its linkage with a MODFLOW subsurface domain.Other modifications are noted in the GSFLOW manual (Markstrom et al., 2008, see sections on "Changes to PRMS" and "Changes to MODFLOW-2005").An additional feature starting in version 1.2.0 that is not described in the original manual is the inclusion of MODFLOW-NWT (Niswonger et al., 2011), a more numerically robust update to MODFLOW-2005(Harbaugh, 2005) for groundwater flow.

GRASS GIS
GRASS GIS is an open-source, multipurpose, and crossplatform geographic information system (Neteler and Mitasova, 2008;Neteler et al., 2008Neteler et al., , 2012) that supports utilities for efficient raster and vector computations (Shapiro and Westervelt, 1994;Mitasova et al., 1995;Šúri and Hofierka, 2004;Hofierka et al., 2009).It includes both graphical and command-line interfaces and may be driven by shell or Python scripts.It supports both 2-D and 3-D raster and vector data and includes SQL-based attribute table database management.GSFLOW-GRASS utilities are written for the most recent stable release version of GRASS GIS, v7.4.This supports Python scripting for both high-level built-in commands and for low-level access to database entries and vector geometries (Zambelli et al., 2013).We take advantage of these capabilities to develop an automated workflow to build GS-FLOW inputs through GRASS GIS.
We chose GRASS GIS as the interface to develop inputs because ( 1) it is open-source and cross-platform; (2) it enforces rigid vector topology, which is critical for building stream networks; (3) its broad library of built-in hydrologic tools include those for vectorized drainage network development with downstream-increasing indexing www.geosci-model-dev.net/11/4755/2018/Geosci.Model Dev., 11, 4755-4777, 2018 (Jasiewicz and Metz, 2011), which is essential for setting flow paths and adjacencies; (4) its generic Python scripting library and PyGRASS application programming interface (API) make it easy to develop new extensions; (5) these extensions may be added to the official subversion (svn) repository, from which they can be automatically downloaded and installed on user computers using the g.extension command; and ( 6) it provides a GUI and command-line interface (CLI) that are consistent with one another.The GUI and CLI interfaces are not required for GSFLOW-GRASS because the GRASS GIS component is handled mostly behind the scenes by a batch-processing Python script (buildDomainGRASS.py,Sect.3.2); however, they allow end users to rerun certain portions of the process and/or produce their own workflows using the GSFLOW-GRASS extensions as building blocks.The open-source aspect of the present work is in part motivated by the need for water assessment and planning tools in the developing world (Pal et al., 2007), and these extensions, combined with the interchangeable and consistent GUI and CLI, can help users to generate their own advanced customizations of GSFLOW-GRASS.

Methods
We adopt a heterogeneous surface and subsurface computational domain for GSFLOW-GRASS that employs subbasin surface HRUs that are linked to subsurface grid cells.In addition to the computational efficiency of discretizing complex terrain into subbasins with complex shapes rather than using a gridded surface domain at the resolution required to resolve the HRUs, the use of subbasin HRUs that route surface runoff directly to stream segments also eliminates the need for establishing a cascading network (Sect.2.1.2).Because of GSFLOW's conceptual (rather than gradient-based) surfacewater routing scheme, numerical differences between subbasin and gridded HRU's are difficult to predict, but the automated GSFLOW-GRASS toolbox can help enable future testing to rigorously interrogate their respective performances.GSFLOW-GRASS strikes a balance between generating a ready-to-go GSFLOW implementation and providing flexibility to customize applications.With a newly developed set of automated and robust GIS domain builder tools, GSFLOW-GRASS can be applied to any digital elevation model (DEM) to produce GSFLOW model simulations.Only a few steps are required to set up a GSFLOW model on the user's computer system.For further model tuning, all scripts in the toolbox are open-source and commented to allow changes to any parameter as well as the development of optional GSFLOW capabilities not included in the default GSFLOW-GRASS implementation.Many popular hydrologic model implementation programs have GUIs, including ModelMuse (Winston, 2009) Hydrogeologic Inc., 2011), Hydrus (Simunek et al., 2009), ArcSWAT (Neitsch et al., 2002), and MIKE-SHE (Butts and Graham, 2005).While these are easiest for novice model users, GUIs can be challenging to develop for cross-platform implementations and generally support less flexibility for customization.Thus, we chose a mostly command-line approach, which has been designed and tested for use on Linux and Windows operating systems.

User-specified settings and model inputs
To seamlessly unify the different GSFLOW-GRASS functionalities, including the automated GRASS GIS domain builder, GSFLOW input file builder, and visualization components (Fig. 3), users specify model inputs and configurations using a settings text file.All inputs from the settings file are read in and processed by the ReadSettings.pyscript.GSFLOW requires a daunting number of different model inputs (nearly 200 parameters for the PRMS subcomponent alone).For ease of use, only a handful of applicationspecific and commonly adjusted inputs may be assigned using the settings file, and default parameter values are applied elsewhere.While the default (and simplest) approach to GSFLOW-GRASS is to modify only the settings file, other parameters (including those mentioned in Sect.2.1.2) may be readily changed in the input file builder by searching for the parameter names defined in the GSFLOW manual and changing their values.The open-source nature of our toolbox also allows users to add parameters to the settings files for future extensions of GSFLOW-GRASS.Specifying and including spatially variable properties is a major challenge to distributed modeling.The settings file accommodates the use of variable aquifer hydraulic conductivity, channel width, and Manning's n parameters, which are described further in Sect.3.3.3.Universal solutions are beyond the scope of the default toolbox, but we do provide a generalized GRASS GIS extension called v.gsflow.mapdata to facilitate the generation of heterogeneous model inputs; v.gsflow.mapdata,further described in Sect.3.2.4,can take any spatially variable data in a raster or vector GIS format and map it to one of the GS-FLOW discretization structures: subbasin HRUs for PRMS surface-water processes, regular grid cells for MODFLOW groundwater processes, gravity reservoirs that link the HRUs and MODFLOW grid cells, or stream segments or reaches for MODFLOW streamflow processes.This allows users to add data from any source -e.g., meteorological forcing, soil properties, hydrogeologic stratigraphy, or vegetation and land cover -to the GSFLOW-GRASS data structures.Other software tools have facilitated hydrologic modeling by automating the connection with established databases (Viger and Leavesley, 2007;Leonard and Duffy, 2013;Bhatt et al., 2014;Gardner et al., 2018).The USGS's GIS Weasel tool (Viger and Leavesley, 2007) may be used for deriving PRMS parameters from physical data sets such as STATSGO, which can then be mapped to the appropriate GSFLOW data structure using v.gsflow.mapdata.The current GSFLOW-GRASS release aims to provide a general set of tools and does not directly link with any specific databases, which are typically only available in observation-rich regions and countries.
The settings file is divided into subsections, each of which drives a portion of the model setup and organization.The "paths" section defines the computer directory structure for the project and GSFLOW executable, as well as the project name and GSFLOW version.Three GRASS GIS sections, GRASS_core, GRASS_drainage, and GRASS_hydraulics, set the GIS location and path to the DEM, the surface and subsurface flow discretization parameters, and open-channel flow geometry and resistance, respectively.The "run_mode" section allows the user to execute GSFLOW in either spinup or restart mode (Regan et al., 2015).Spin-up simulations start with a preliminary MODFLOW steady-state execution using a specified infiltration rate (see below) to calculate reasonable initial groundwater head conditions for the subsequent transient simulation that includes both the surface and subsurface domains; the steady-state step can be essential for obtaining numerically convergent groundwater results and more realistic solutions for the entire coupled system.At the end of a spin-up run, final PRMS and MODFLOW state variables are saved in files that can be specified in the run_mode section to initiate restart coupled runs without the preliminary groundwater steady-state period.The "time" section is used to specify the temporal window of the simulation.The "climate inputs" section sets input parameters for the PRMS climate_hru option, which is the standard climate implementation supported by GSFLOW-GRASS (see Sect. 3.3.1) .Finally, the "hydrogeologic_inputs" section defines the preliminary steady-state MODFLOW infiltration rate, used for spin-up runs and either a layered or fully distributed subsurface hydraulic conductivity structure.The ReadSettings.pyscript uses these inputs to create a directory structure and organize all GIS and simulations files.This imposed directory structure supports easy exchange between the different tool kit modules and allows the use of relative directory names, which facilitates the sharing of model files across computer systems and between users.

GRASS GIS domain builder
A critical challenge for any distributed hydrologic model is the fully automated development of a reproducible, topologically correct, and interlinked data structure that describes water flow through a catchment in a computationally efficient manner.Semiautomated approaches to building surface flow networks are common (e.g., Luzio et al., 2006;Arnold et al., 2012), but the development of a fully automated approach has been impeded by the mathematical and logistical difficulties of building a topologically ideal drainage network (i.e., one whose fundamental unit is a tributary junction).Many standard GIS tools encounter problems when handling complex digital topography (represented using a DEM) that may contain natural or artificial depressions and whose grid cells are often much larger than real topographic features.Further complications arise when incorporating surface flow networks into integrated hydrologic models because each link within the network must then be tagged with sufficient information to identify drainage pathways through the whole network, and the stream network must also be linked with sometimes different geometries and resolutions for surface water and the groundwater flow grids.
We addressed this challenge by creating 11 new GRASS GIS "extensions", also called "add-ons", that work alongside core GRASS GIS commands to transform user inputs (including a single DEM) into a set of GSFLOW inputs.This workflow creates an automatically generated network of streams and HRUs that is spatially linked to a MODFLOW grid.The domain-building procedure is automated through the buildDomainGRASS.pyscript, which takes inputs from the settings text file, implements the domain-building workflow, and produces ASCII files used by GSFLOW-GRASS's Python input builder scripts.

Surface-water network
In the first step of the fully automated domain-building workflow, GRASS GIS imports a user-provided DEM to define the drainage network and HRUs.After hydrologically correcting the DEM by filling pits and removing cells that have flow inputs from outside the map area (GSFLOW-GRASS requires the full topographical catchment to be included in the model domain), a Hortonian drainage network is constructed using the r/v.stream.* tool kit (Jasiewicz and Metz, 2011) that relies on a single flow direction implementation of the r.watershed flow-routing algorithm (Metz et al., 2011).Subbasins associated with each stream segment are designated as HRUs in order to follow both the natural discretization of the landscape and the architecture of PRMS (Markstrom et al., 2015).River headwaters are defined based on a threshold drainage area that may be weighted by the user to represent, for example, nonuniform precipitation or snowmelt inputs.Such weights permit a more realistic representation of drainage density and, as a result, increased model resolution in areas that contribute more water to the catchment.
The GRASS GIS drainage network creation algorithm, r.watershed (Metz et al., 2011), is both efficient and accurate.For computations that can take place entirely within memory, its speed exceeds that of both Terraflow and the D8 routing used by ArcGIS (Jenson and Domingue, 1988;Maidment and Morehouse, 2002;Arge et al., 2003;Magalhães et al., 2014).This speed results from its sorting algorithm and priority queue, and a standard desktop workstation today can process DEMs in memory with tens of thousands of cells on each side.The least-cost path approach taken by r.watershed does not require any pitfilling step, but we do include it in order to create a more consistent DEM with downslope-routed flow for the remainder of the analysis.The flow-routing component of the more recent Fastscape algorithm by Braun and Willett (2013) could be faster than r.watershed, but these have not been benchmarked, and Fastscape is not yet integrated into the GRASS GIS toolchain, which is necessary for all of the subsequent steps.Kinner et al. (2005) demonstrated that r.watershed is more accurate than Terraflow (Arge et al., 2003), especially in low-relief areas and those in which tree canopy elevations are mistakenly interpreted as groundsurface elevations; this latter issue is pervasive across many digital elevation models, including the widely used Shuttle Radar Topography Mission (SRTM) DEM (Farr et al., 2007;Miliaresis and Delikaraoglou, 2009).
In spite of these advantages, r.watershed has not been used before to build flow networks for integrated hydrologic models.Other integrated hydrologic model domain-building tools use local drainage direction information (Bhatt et al., 2014;Maxwell et al., 2017;Gardner et al., 2018).While not an integrated hydrologic model due to its limited subsurface modeling capabilities, the Soil & Water Assessment Tool (SWAT) (Srinivasan and Arnold, 1994) incorporated GRASS GIS version 4 with an earlier and much slower version of r.watershed.Beyond this, r.watershed is typically discussed in the drainage algorithm literature (e.g., Barnes et al., 2014;Magalhães et al., 2014;Schwanghart and Scherler, 2014;Sangireddy et al., 2016), directly applied to flow-routing and cost-path calculations (e.g., Wickert, 2016;Bird et al., 2016), or included as a component of an assessment tool (e.g., Bhowmik et al., 2015;Rossi and Reichenbach, 2016).By integrating r.watershed into GSFLOW-GRASS via the r/v.stream.* tool kit for Hortonian drainage network analysis (Jasiewicz and Metz, 2011), we are able to harness the capabilities and efficiency of the hydrologic computational engine within GRASS GIS for integrated hydrological modeling.
Following drainage network construction, the next step in the automated workflow is to map the connections between each segment in the tributary network.To do this, we developed an extension called v.stream.network,which builds atop the upstream to downstream stream segment and HRU indexing in the existing r/v.stream.* tool kit (Jasiewicz and Metz, 2011).This index is a unique positive integer identifier applied to each segment and its overlapping HRU and is called a "category" in GRASS GIS.For each segment and overlapping HRU in the drainage network (Fig. 1a, b), v.stream.networkwrites the category value of the immediately downstream stream segment to the "tostream" column in its associated attribute table row.Any stream segment exiting the map area is given a "tostream" value of 0. This links the stream segments and HRUs in the watershed as a directed graph (e.g., Czuba and Foufoula-Georgiou, 2014;Heckmann et al., 2014;Tejedor et al., 2015).
At this point, the user may optionally break out of the automated workflow and edit the vector geometries that define the streams and subbasins.While we expect that many users will find the fully automated approach to be a major advantage over those that require manual intervention -these add a source of subjectivity and laborious processing time - Gardner et al. (2018) note that human-developed drainage structures may cause a discrepancy between topographically routed flow and actual flow paths that require manual adjustments.This manual adjustment is not standard and requires the addition of a break point in buildDomainGRASS.py,as well as for the user to manually adjust the category numbers (indices) and "tostream" network topology values in the attribute tables for the segments and HRUs if the changes are substantial enough to change the flow network.
After this, the study area is limited to a single drainage basin using the new v.stream.inbasinextension, completing the development of the drainage network geometry and topology.This step is included because the goal of many hydrologic studies is to understand a single watershed basin.If this is not the case, buildDomainGRASS.pymay be edited to skip this step and to analyze all complete drainage networks within the domain.
Each stream segment is then supplied with attribute values required for GSFLOW through the v.gsflow.segmentsextension.This numbers each segment for GSFLOW (Fig. 1a) and populates the associated database table with hydraulic geometry, channel roughness (constant or spatially distributed), and channel and floodplain width (constant or spatially distributed).Additional less commonly used options are also available, including additional input discharge for the upstream-most stream segments (e.g., from human intervention), input diffuse runoff, and direct precipitation on the stream.

Groundwater flow grid
Following the completion of the surface-water domain, the next step is to build the groundwater domain.MODFLOW-NWT uses a rectangular finite-difference grid structure (Harbaugh, 2005;Niswonger et al., 2011).The cell size for this grid is selected by the user in the settings file.It is often necessary to discretize the MODFLOW groundwater domain on a grid that is coarser than the DEM used for surface flow routing in order to increase computational efficiency while still allowing GSFLOW-GRASS to generate a complex surface-water network; the proper grid cell size depends on the size of the HRUs and the strength of the surfacewater-groundwater coupling; v.gsflow.gridbuilds the MODFLOW grid as a set of GIS vector areas (Fig. 1c) using the built-in v.mkgrid command.The resolution of this grid is approximately that desired by the user and meets two criteria designed to most accurately and efficiently capture the topographic information represented in the fine-scale flowrouting DEM.First, the grid size must allow for MODFLOW grid cell edges to align perfectly with flow-routing DEM cell edges.Second, the grid size must be able to generate a domain that fits exactly within a tight-fitting bounding box snapped to the closest flow-routing DEM cell edges outside of the study watershed.As a final step, this tight-fitted domain is padded by three cells on each side to ensure appropriate boundary condition handling.
The r.gsflow.hydrodemfunction then performs a hydrologically correct resampling of the original flowrouting DEM to the resolution of the MODFLOW grid cells.This resampling is required when users desire a MODFLOW grid that is coarser than the flow-routing DEM for computational feasibility.Without a hydrologically correct resampling, MODFLOW cells would be assigned the overall mean elevation from the corresponding region of the flowrouting DEM.In this case, MODFLOW grid cell elevations may average across valley floors and valley walls, creating a bumpy river longitudinal profile that contains artificial dams (Fig. 4a).With the hydrologic correction, MODFLOW cells that do not contain stream segments remain unchanged, but cells containing stream segments are assigned the mean el-evation of only the river channel cells in the flow-routing DEM.This enforces decreasing elevations down the drainage network, and Fig. 4b demonstrates the resulting continuous flow through the catchment.

Surface-water-groundwater coupling
The final step in developing the GSFLOW domain is to link the surface-water geospatial data structures (HRUs and segments) with the MODFLOW rectangular grid; v.gsflow.reachesand v.gsflow.gravresconstruct the reaches and gravity reservoirs (Sect.3.1), which are the intersection of segments and HRUs, respectively, with each MODFLOW grid cell (Fig. 1d).The database table for the reaches includes values for the thickness of the streambed sediment (defaults to 1 m) and its hydraulic conductivity (defaults to 5 m day −1 , characteristic of sand and gravel).

Accessing additional GSFLOW functionality
GSFLOW supports more input options than we have defined for our GRASS GIS v.gsflow.* commands, though we have included many of the most common options.These are sufficient to set up and run a GSFLOW simulation.However, they may not encompass all of the variables that some users may consider to be important.
Therefore, GSFLOW-GRASS includes the v.gsflow.mapdatatool for users to add other attributes to database tables, with a focus on spatial distributions.These attributes can include spatially variable precipitation and temperature, parameter choices for model spin-up, and fully distributed maps of hydraulic conductivity, specific yield, streambed hydraulic properties, soil texture, vegetation type, and evapotranspiration parameters (current input configurations described in Sect.3.3).The core capability of v.gsflow.mapdata is the use of averaging and nearestneighbor methods to connect input data from raster grids, vector areas (polygons), or vector points to the attribute tables of the HRUs, segments, gravity reservoirs, reaches, and/or MODFLOW grid cells.As these are custom additions, calls to v.gsflow.mapdatamust be added by end users to buildDomainGRASS.py.Once added, the end user can follow our template code in the input file builder to add these to the GSFLOW input files; v.gsflow.mapdatatherefore adds user-driven flexibility, whereby input data can be supported by GSFLOW-GRASS, and a starting point for users who may want to expand on its capabilities.

Geospatial data export
In the final step, the generated attributes and geometries are exported.This information is stored in GRASS GIS as raster grids and vector geometries associated with SQL database tables; buildDomainGRASS.pyexports a rasterized "basin mask" (1 in the basin, 0 outside) and the hydrologically corrected DEM at the MODFLOW grid resolution, as well as  vectorized GIS data (shapefile format) for the HRUs, gravity reservoirs, MODFLOW grid, stream segments, stream reaches, pour point, full study basin area, and downstream boundary condition cells.Then, v.gsflow.outputexports the database tables associated with the vectorized GIS data in comma-separated variable (CSV) files that can be read in by the input file builder scripts (Sect.3.3) for use in GSFLOW.These exported data are then ready to be parsed into GSFLOW inputs using the Python input file builder scripts (Sect.3.3) and/or to be used for data visualization (Sect.3.5).
This separation between the GIS and ASCII input file components is intentional.The GRASS domain builder component typically requires several minutes to run and often only needs to be executed once for a watershed.The ASCII files, on the other hand, can form an effective basis for ensembles of runs.These can be used to calibrate parameters or explore hydrologic sensitivity to variable forcing scenarios.

GSFLOW input file builder
GSFLOW-GRASS includes a set of input file builder scripts that are streamlined to incorporate the model domain discretization constructed by the GRASS GIS workflow and generate corresponding model inputs for the GSFLOW control file, PRMS-type input files, and MODFLOW-type input files.Most of the new features in GSFLOW that are not in stand-alone PRMS or MODFLOW follow the same modular modeling system input data file format (Leavesley et al., 1996) as PRMS, which includes the use of a control file as the main interface file, modules for different computational options, and the PRMS input file syntax.In contrast, MOD-FLOW uses a name file as its main interface file, imple-ments packages for computational options, and follows its own file syntax.The following builder scripts handle these different formats and are automatically executed through the tool kit's run file (Sect.3.4).The builder scripts may also be customized for extensions beyond the default implementation.

GSFLOW control file
The GSFLOW control file is the highest-level input file and is created by the printGSFLOWControlfile.py script in the GSFLOW-GRASS tool kit.The tool kit is streamlined for configuring the integrated mode of GSFLOW (set through the model_mode parameter).
Inputs for the control file parameters are organized under six numbered sections in printGSFLOWControlfile.py.The script sets parameters related to climate forcing, time domain, and run mode based on what the user specifies in the settings file; all other parameters are preset to default values.Further customization of control file parameters (stored in the list variable con_par_name) requires simply changing default values (in the corresponding list variable con_par_values) in the script; spatially variable entries can be generated with the aid of the v.gsflow.mapdatatool.The first two sections are required and include details about the simulation execution and module choices.The third section establishes spin-up versus restart run modes based on settings file entries.Sections 4 and 5 contain customizable lists of output variables to be printed, which can be used by visualization scripts in GSFLOW-GRASS (Sect.3.5).The last optional section is for running the model in a debugging mode.
Note that the default implementation of this tool kit uses the climate_hru module for precipitation and mini-mum and maximum daily temperature; this means that the model will employ preexisting files containing data already specified by HRU.The PRMS component of GSFLOW does include other modules for distributing data from one or a handful of weather stations, but these typically require application-specific empirical parameters that are difficult to incorporate in a generic tool kit.Use of the cli-mate_hru module provides flexibility for the user to implement their own spatial interpolation or extrapolation methods, which can then be transferred to the GSFLOW domain with the v.gsflow.mapdatatool.GSFLOW-GRASS's default implementation also uses the Priestley-Taylor formulation for potential evapotranspiration calculations (Markstrom et al., 2008).This module was chosen because of its reliance on only air temperature and solar radiation (calculated by the PRMS component of GSFLOW) and because of the relative ease of accounting for different vegetation properties through the parameter pt_alpha (in the PRMS parameter file, Sect.3.3.2).
After the six parameter input sections in printGSFLOWControlfile.py, the script builds the control file and then generates an executable file (shell script for Linux or batch file for Windows) for running GSFLOW with the control file.After all other input files are created, this executable is called by the tool kit's automated run file (Sect.3.4).The executable can also be used to run GSFLOW outside of the GSFLOW-GRASS tool kit.

PRMS-type input files
Input files required for the PRMS component of GSFLOW are the parameter file (param_file in the control file), which includes empirical surface and soil zone properties, and the data file (data_file in the control file), which includes climate observations for the spatial interpolation and extrapolation algorithms.If the climate_hru module is selected, as it is in the tool kit's default implementation (Sect.3.3.1),then individual input files with HRU-distributed climate variables must also be specified.For a quick setup of GSFLOW-GRASS, the script printClimatehru.pytakes daily observations from a single file and distributes them uniformly over all HRUs.The tool kit handles the minimum required climate variables -daily precipitation, maximum temperature, and minimum temperature -and it is set up to be readily extended to also include humidity, solar radiation, and/or wind speed.A spatially uniform approach may be acceptable when the size of a rainstorm is typically greater than the size of a catchment and climatic variables vary only weakly with slope and aspect.Larger and higher-relief catchments require spatially distributed climate inputs for realistic outputs; these require custom inputs from the end user, which can be ported from any discretization to the HRU domain with the aid of the v.gsflow.mapdatatool.
The parameter file is created by the script printPRMSparamfile.py.The script includes sections for domain dimensions and for parameter inputs, both of which are streamlined to take values parsed from the GRASS GIS domain builder outputs (as indicated in the comments in the script).Because of PRMS's conceptual soil moisture regimes, the parameter file requires a substantial number of parameter inputs related to the soil and vegetation that cannot easily be specified without calibration.As a default to help the user get GSFLOW up and running, most parameter values in printPRMSparamfile.pyare preset, mostly using calibrated values from the Sagehen watershed example that was distributed with the GSFLOW model version 1.2.1.We have indicated with the comment "# * * * CHANGE FOR SPECIFIC SITE" those parameters that could also be altered based on known characteristics of one's watershed site.This includes various soil and land cover inputs, such as soil_type (sand, loam, or clay), cov_type (bare soil, grasses, shrubs, or trees), transp_end (end month of transpiration, for phenology), and pt_alpha (Priestley-Taylor parameter α, which can be based on literature values).In addition to these highlighted parameters, users can review all parameters to determine whether others could be particularly important for their specific application.These may include some of the parameters mentioned in Sect.2.1.2that determine exchanges between different soil zone reservoirs.Spatially variable information can be transferred to the HRU domain using the v.gsflow.mapdatatool.Rigorous calibration of PRMS parameters can eventually be carried out with inverse codes such as PEST (Doherty, 1994) or UCODE (Poeter andHill, 1998, 1999).

MODFLOW-type input files
GSFLOW requires input files for each MODFLOW package utilized, which can include any of the packages listed in Table 1 of the GSFLOW manual (Markstrom et al., 2008, Appendix 1, 176-226, provides details).Our tool kit by default creates a relatively general MODFLOW setup, which includes required input files and omits most optional ones, such as the well package.The tool kit creates four basic package input files (name file, basic package file, discretization file, and the optional output control file for customizing output files), two different groundwater flow package options (the layer-property flow (LPF) from MODFLOW-2005 and the upstream weighting package (UPW) from MODFLOW-NWT), the numerical solver package (preconditioned conjugate gradient (PCG) for LPF and Newtonian (NWT) input file for UPW), the streamflow-routing package (SFR), and the unsaturated-zone flow package (UZF).
The script printMODFLOWInputs.pycreates a set of internally consistent input files that incorporate the domains constructed by the GRASS GIS workflow (Sect.Input files created outside of our tool kit for a stand-alone MODFLOW model implementation of identical discretization will for the most part be usable with the integrated GS-FLOW model.However, as indicated in Table 1 of the GS-FLOW manual, some MODFLOW packages were modified for their use in GSFLOW.Advantages of implementing our tool kit over using pre-created MODFLOW input files are that it already incorporates these GSFLOW modifications, it automatically uses the GRASS GIS builder results for the domain, and it guarantees a directory structure that is consistent with the rest of the input files and the visualization scripts. The GSFLOW-GRASS tool kit also offers an optional script createSpatialHydCond.py for generating spatially distributed hydraulic conductivity fields for the upper layer based on elevation and/or distance from the stream network, with the assumption that lower elevations and/or riparian corridors have higher hydraulic conductivity properties.Because application-specific entries cannot easily be generalized for input through the settings file, users should directly customize elevation and stream distance thresholds, as well as corresponding hydraulic conductivity values, at the top of the createSpatialHydCond.py script.This script will automatically import domain information from the settings file and export results to the file location specified by the settings file; createSpatialHydCond.py serves as a ready-to-go tool for creating physically plausible hydraulic conductivity patterns, and it provides an example for how users can create their own scripts to customize spatially distributed inputs.A similar type of script could create spatially distributed infiltration fields for the preliminary MODFLOW steady-state simulation in spin-up runs (e.g., finf entry in the settings file).These tools can provide preliminary inputs to jump-start GSFLOW model implementations.However, realistic construction of hydrogeologic frameworks relies on data from sources such as well logs, geologic maps, geophysical measurements, and pumping tests (Reilly, 2001;Reilly and Harbaugh, 2004).For these, we recommend that users import the appropriate data sources into GRASS GIS and use the v.gsflow.mapdataextension to map these parameters onto the appropriate GS-FLOW objects (e.g., HRUs, MODFLOW cells).Properties for stream segments and reaches -such as streambed hydraulic conductivity and unsaturated hydraulic properties below the streambed -are set to default values that can be changed through the GRASS GIS extensions.By default, the streamflow calculation is set to use Manning's equation by assuming a wide rectangular channel (ICALC = 1).Spatially variable stream widths and/or Manning's n values may be set through the settings file based on either gridded or pointbased (e.g., survey) data, and v.gsflow.segmentsalso supports the delineation of both channel and floodplain geometries and roughness parameters.

GSFLOW run file
For the user's convenience, the GSFLOW-GRASS tool kit includes an executable run file, which is a shell script for Linux, goGSFLOW.sh,and a batch file for Windows, goGSFLOW.bat.The run file collects input from a specified settings file and then runs all of the above input file builder scripts: the script runGSFLOW.py,which launches the GSFLOW simulation, and the runtime visualization script plotGSFLOWTimeSeries_Runtime.py,further described below.If the runtime visualization is not desired, the user can comment out the corresponding execution line in the run file.As long as the user does not wish to use more features than are exposed in the settings file, no direct interface with the code is required to run GSFLOW-GRASS.This permits a "quick-start" implementation of GSFLOW, which can substantially lower the barrier to entry for using this model.
The run file may be implemented only after the model domain is generated through buildDomainGRASS.py.The GSFLOW-GRASS tool kit separates the GRASS domain builder module from the run file because users will typically only need to construct their domain once, but will perform multiple runs of the model with variable parameter inputs, for example, for model calibration or to simulate different time periods.
After preliminary quick-start simulation tests, users can further customize their runs by taking advantage of the modular structure of the tool kit, which has a separate script for each input file.For example, to target specific aspects of the model, such as the surface runoff properties, corresponding parameters may be adjusted in the PRMS parameter file by editing and rerunning printPRMSparamfile.py.Select input file builder scripts can be run either within Python or by editing the executable run file.

Visualization tools
Our tool kit includes post-processing Python scripts that employ the Matplotlib plotting library (Hunter, 2007) for visualizing the domain discretization, key MODFLOW inputs, and model output results.The model discretization for the PRMS component of GSFLOW is exported from GRASS GIS as a set of standard vector GIS files (shapefiles).Our Python plotting scripts use these shapefiles to create figures of the surface HRU and stream segment discretization (plotBasin.py) and to generate movies of HRU-distributed and streamsegment-distributed variables (plotHRUvars.pyand plotSegmentDischarge.py).These output variables (e.g., evapotranspiration and streamflow) are set through an-iOutVar_names in the GSFLOW control file (see Section 3.3.1).The exported shapefiles may also be used to visualize model results with standard GIS packages (e.g., QGIS: QGIS Development Team, 2013) outside of GSFLOW-GRASS.
For the MODFLOW component of GSFLOW, the tool kit's script plotMODFLOW.pyplots spatially distributed layer elevations, hydraulic conductivity, and a map of active computational grid cells.The script also plots spatially distributed MODFLOW simulations results over time, including for hydraulic head, change in head, water table depth, and recharge from the unsaturated zone.For storage efficiency, the tool kit creates and reads in head and unsaturated zone output files in binary format.
For basin total GSFLOW results, the Python script plotGSFLOWTimeSeries.py generates time series lines for user-selected variables from the main GSFLOW CSV output file.The names of all variables, along with their descriptions and units, are listed in GSFLOWcsvTable.py,which is imported into plotGSFLOWTimeSeries.py to ensure consistency in figure labels and axes.Our toolbox also includes the runtime visualization script plotGSFLOWTimeSeries_Runtime.py that is by default called by the run file (but can be commented out if desired) and displays a continuously updated time series plot of basin total precipitation and discharge.Tracking simulation progress with runtime plots can be very useful for complex integrated models, which can have lengthy simulation times.
The visualization scripts can be run using a command-line parser and/or by editing plot options that appear near the top of each script.More advanced users may modify the bodies of the scripts to change features such as axis intervals or color schemes.For users who want to adjust the scripts, we suggest running them in the iPython interactive programming console (Pérez and Granger, 2007), which is also incorporated into the Spyder integrated development environment (IDE).Although this visualization approach requires some familiarity with Python and/or command-line argument parsing, it accommodates a wide range of plotting preferences.All plots and videos may be displayed as on-screen figures (in raster or vector formats using the interactive Matplotlib window) and may be saved as images (interactively) or videos ( * .mp4format) as defined by inputs to the plotting script.
Other existing no-fee USGS GUI programs for MOD-FLOW also provide visualization capabilities, and using these with the input and output files produced with GSFLOW-GRASS is possible.In particular, GW Chart (Winston, 2000) can be directly implemented for plotting basin-level time series results.Additionally, Model Viewer (Hsieh and Winston, 2002) and ModelMuse (Winston, 2009) are able to read in and plot spatially variable head results from binary files with the extension .bhd,but this does require manual post-processing steps.For Model Viewer, the user needs to copy all MODFLOW input and output files to a new folder inside the Model Viewer project directory and select the namefile when prompted.For ModelMuse, the user must first delete the line that starts with IWRT from the name file in order to load the project into the program.Once the project settings are loaded into ModelMuse, the user can use the "import model results" tool to select the binary head file.

Examples
Three example implementations for the study sites shown in Fig. 5 (characteristics listed in Table 2) demonstrate (1) the variety of hydrological processes and environments that can be explored using GSFLOW-GRASS and (2) how the tool kit's GIS domain builder can handle diverse topographic settings, including those prone to problems with standard GIS stream network tools (Table 3).Towards the first point, the specific examples chosen represent a range of practical applications for water and land management.Towards the second point, each simulation presents a unique set of technical challenges in developing a topographically based model domain that can properly route rainfall through a network of stream segments and subbasins as well as a connected groundwater flow grid.It is important to note, however, that no calibration effort was made to match field observations for these test cases.The simulation results thus serve as purely schematic examples based on certain settings and do not aim to capture actual conditions at the corresponding sites.
The three examples all contain complex hydrology with interactions between surface water and groundwater and are exemplars of practical management concerns.Together they span a range of environments: high to low elevations, steep to low-gradient catchments, coastal to inland settings, tectonically active to cratonal, and with partially to fully integrated drainage.Their catchment areas range from 12.5 to 3723.0 km 2 , covering the range of scales that GSFLOW was developed to simulate.These applications show that even before any parameter adjustments, the GSFLOW-GRASS tool kit can readily generate GSFLOW model domains and parameter inputs that   produce numerically convergent simulations in a variety of topographies and hydroclimatic conditions.Preliminary simulations with the default GSFLOW-GRASS provide a valuable springboard for the next step of performing the calibration needed to generate realistic model outputs for specific sites.The GSFLOW-GRASS toolbox can be customized to quickly generate additional model runs with varying input values to expedite the parameter calibration.It can also facilitate the implementation of sensitivity or other Monte Carlo-type analyses that are critical for identifying issues such as equifinality and over-parameterization and for determining uncertainty estimates (Beven, 2006;Gal-lagher and Doherty, 2007;Razavi and Gupta, 2015;Song et al., 2015).

Cannon River, Minnesota, USA
The Cannon River is a tributary to the upper Mississippi River in Minnesota, USA.Its headwaters cross low-relief uplands that are capped by low-hydraulic-conductivity glacial deposits (Patterson and Hobbs, 1995) and are intensively farmed (Kreiling and Houser, 2016).Its lower reaches pass through a valley cut into fractured carbonate bedrock that is popular for recreation.This combination of agricultural and recreational uses leads to a suite of management concerns re-lated to agricultural nutrients and fine sediments (Czuba and Foufoula-Georgiou, 2014) impacting both surface water and the underlying bedrock aquifers (Tipping, 2006;Steenberg et al., 2013), thus motivating the need for integrated hydrologic modeling tools.Furthermore, its large basin and highresolution topography make the Cannon River drainage basin an appropriate test site to evaluate the accuracy and computational efficiency of GSFLOW-GRASS across a range of HRU and MODFLOW grid cell areas.
We implemented GSFLOW-GRASS for the Cannon River watershed (Fig. 6) using the Minnesota statewide 1 m lidar data set (http://www.mngeo.state.mn.us/chouse/ elevation/lidar.html,last access: 22 November 2018), which we resampled to 15 m resolution.Meteorological data from nearby Zumbrota, Minnesota, were obtained from the Midwestern Regional Climate Center (Adresen et al., 2014).The flexible GSFLOW-GRASS input builder allows for easy implementation of two MODFLOW layers to represent an upper glacial till unit (low hydraulic conductivity) and the underlying fractured carbonate bedrock (higher hydraulic conductivity), which produced somewhat steeper hydraulic head gradients in the upper layer compared to the lower layer (Fig. 6c, d).
In this low-relief watershed (Table 2), Pleistocene glaciation produced nonintegrated upland drainage that presents computational challenges that are very different from those in steep watersheds.Much of the Cannon River watershed's postglacial topography is characterized by small localized hills and enclosed basins that have not yet been organized (i.e., integrated) by fluvial erosion into a linked valley network, in which water flows directly to a stream without encountering an enclosed depression (such as a lake, wetland, or dry basin).In such settings that lack integrated drainage, downslope flow-routing and pit-filling algorithms that are typically used to build hydrologic model domains (e.g., Bhatt et al., 2014;Maxwell et al., 2017;Gardner et al., 2018) can fail or produce spurious results by inappropriately modifying the real topography.As described in Sect.3.2.1,GSFLOW-GRASS determines surface-water flow using the GRASS GIS's efficient and accurate r.watershed extension, which implements a least-cost path algorithm designed to produce drainage networks that route flow in the longrange path of steepest descent regardless of the degree of local drainage integration.By using r.watershed alongside a set of new GRASS GIS extensions that integrate it into the GSFLOW framework, GSFLOW-GRASS is able to automatically create a topologically correct and linked drainage network for hydrologic model simulations in settings that lack integrated drainage.
The Cannon River watershed is by far the largest of the three model implementations and thus greatly benefits from the coarsened surface and subsurface domains (Table 2).To test the robustness and efficiency of GSFLOW-GRASS over different resolution configurations, we compared the accuracy and compute time of the Cannon River example case across 1 order of magnitude in threshold surface drainage area (for the HRU delineation) and 2 orders of magnitude in MODFLOW subsurface grid cell area, starting from the base case resolution shown in Fig. 7.Note that the threshold surface drainage area increase was limited by the total watershed size.Figure 7a shows that coarsening the irregular HRU resolution results in little (at finer MODFLOW resolution) to negligible (at coarser MODFLOW resolution) increase in error compared to coarsening the rectangular MODFLOW grid.This demonstrates that the accuracy of the GSFLOW-GRASS topographically based surface discretization is well www.geosci-model-dev.net/11/4755/2018/Geosci.Model Dev., 11, 4755-4777, 2018 maintained even with large-sized cells.Over the 2 orders of magnitude increase in rectangular MODFLOW grid cell sizes, errors steadily grow to about 35 %, but the GSFLOW-GRASS hydrological correction step (Sect.3.2.2) helped prevent even greater errors.The trade-off for accuracy is compute time: GSFLOW runtime is much more sensitive to the MODFLOW resolution than to the surface domain resolution (Fig. 7b).However, the domain builder algorithm -which requires longer compute times than the 5.5-year GSFLOW simulation for the Cannon River -is sensitive to both the surface drainage resolution and the MODFLOW grid cell area, and it is even more sensitive to the surface drainage resolution (Fig. 7c).GSFLOW-GRASS's fully automated surface drainage delineation thus allows users to overcome one of the most time-consuming obstacles to implementing integrated hydrologic models by constructing efficient and accurate irregular HRUs.The GSFLOW-GRASS tool kit makes it easy to carry out systematic GSFLOW configuration tests like this to assess model performance.
Comparisons between the simulated streamflow at the watershed outlet and corresponding observations at Welch, MN, reveal that without any parameter calibrations, the model produces realistic discharge during nonpeak flood times and during one of the observed peaks during July 1942 (Fig. 6e).The severely over-simulated discharge in July 1943 may be evidence for a local convective summer storm system passing over the Zumbrota weather station, which is located outside of the watershed boundary.Recurring failure of the model to capture April discharge indicates that snowmelt-related parameters require adjustment.The automated GSFLOW-GRASS tool kit can facilitate the application of parameter estimation approaches (e.g., Doherty, 1994;Poeter andHill, 1998, 1999).A calibrated model can then be used to evaluate the flushing of shallow groundwater -which is susceptible to leaching from overlying agricultural plots -into the river channels during major storms, as shown in Fig. 6b.
4.2 Water Canyon, Santa Rosa Island, California, USA Santa Rosa Island is part of the Channel Islands National Park in California, USA, and is characterized by mountainous topography (Clark et al., 1990).Hydrologic modeling of Santa Rosa Island has previously been performed by Jazwa et al. (2016), who applied the PIHM hydrologic model (Qu and Duffy, 2007) to the island in order to understand the relationship between prehistoric human settlement patterns and surface-water availability.They reported streamflow characteristics modeled for the 19 major drainages around the island during hypothetical climate regimes.Unlike PIHM, GSFLOW-GRASS employs a regular threedimensional groundwater grid that does not align with the irregular surface domain; this makes integrated domain building more complicated but allows for a flexible representation of the surface-water and aquifer systems.
Here we apply the GSFLOW-GRASS toolbox to model Water Canyon (Tables 2 and 3), one of the island's many drainages.We generated the surface flow-routing system with topography derived from a 3 arcsec SRTM DEM (Farr et al., 2007) projected to a UTM coordinate system at 90 m resolution, which was coarsened for both the surface and subsurface domains (Table 2).We drove simulations shown in Fig. 8 using weather data from the Western Regional Climate Center (http://wrcc.dri.edu, last access: 22 November 2018).
Water Canyon is unique among the three example sites in that its outflow drains to the ocean.It therefore requires GSFLOW-GRASS to accommodate irregular boundaries (coastlines) by properly assigning boundary conditions and routing flow through them.Users identify ocean pixels by assigning NULL values to them; this causes flow routing from r.watershed to stop at the shoreline.To allow flow out of a pour point at the mouth of the river, the immediately downgradient MODFLOW cell can be set as a constant head boundary, but this cell must be chosen carefully.The finite-difference scheme in MODFLOW dictates that the constant head boundary condition must be supplied along one of the four cardinal directions of the pour point.Therefore, if the river flows diagonally to the sea, its constant head boundary must be moved to the closest non-diagonal cell; v.gsflow.gridfinds the proper constant head boundary cell to set for the coastal case and for any inland drainage case in which the pour point also requires a downgradient constant head boundary.
Losing streams such as those in the steep and semiarid Water Canyon catchment often run dry (Jazwa et al., 2016).If this causes MODFLOW cells to lose all of their water, GSFLOW will fail to numerically converge.Thus, the Water Canyon example also serves to demonstrate GSFLOW-GRASS's ability to prevent this problem by (1) incorporating MODFLOW-NWT, which uses a Newton-Raphson solver for increased stability (Niswonger et al., 2011), (2) allowing the user to specify an adequately deep MODFLOW discretization in the settings file (Sect.3.1) to supply sufficient water through the dry season, and (3) hydrologically correcting the elevations of coarsened MODFLOW cells to ensure flow through the stream network (Sect.3.2.2).
The Santa Rosa example demonstrates an application in which GSFLOW-GRASS can be used to investigate and manage erosion associated with hydrological conditions.The example model input script included in GSFLOW-GRASS (Sect.3.3.3)generated a spatially distributed hydraulic conductivity field (Fig. 8d) that increases near the river channels; this can represent the transition between poorly sorted and silt-or clay-bearing hillslope soils and fluvial sands and gravels.Figure 8e demonstrates how the post-processing tools can be used to evaluate precipitation-triggered surface runoff (Fig. 8e), which can denude the hillslopes (Schumann et al., 2016) and transport eroded sediment through the drainage network (Fig. 8b).

Shullcas River watershed, Peru
The final test case is based on the Shullcas River watershed located in the central Peruvian Andes.Precipitation is highly seasonal, and water shortages are common during the dry season from May to September.The Huaytapallana Glacier, which supplies meltwater to the Shullcas River, is rapidly retreating (López-Moreno et al., 2014), causing concern over future water resources (Somers et al., 2018).However, in glacierized watersheds of the Peruvian Andes, a large proportion of the dry season stream discharge can be composed of groundwater (Baraer et al., 2015), driving the need to better understand groundwater-surface-water interactions in the catchment.Our choice of an example in the Peruvian Andes also demonstrates how our entirely open-source modeling system may be applied to problems in regions where financial limitations faced by local environmental researchers and practitioners make it difficult to use commercial software solutions.
The simple hydrologic model based on the Shullcas watershed covers a large elevation range and uses a coars- ened discretization based on an ASTER nominal 30 m resolution DEM (Tachikawa et al., 2011) (Table 2).Meteorological data were obtained from the Peruvian Meteorological Office (SENMAHI) online database.Located in the Andes, the Shullcas River watershed serves as an apt test bed for examining the ability of GSFLOW-GRASS to represent surface-water-groundwater links in steep terrain.Representing flow in steep topography and narrow canyons calls for high-resolution computations that can be infeasible to directly incorporate into an integrated hydrologic model.GSFLOW-GRASS solved this problem by converting highresolution flow-routing information for Shullcas into a far smaller number of topographically defined computational surface cells and coarsened MODFLOW grid cells (see Table 2) -a strategy that the multiple-configuration Cannon River test demonstrated can generate an efficient and accurate discretization (Fig. 7).The major challenge to domain coarsening further presented by Shullcas is its particular susceptibility to artificial "dams" due to its mountainous topography.These numerical artifacts occur when averaging elevations across flat valley floors and adjacent steep canyon walls, which can cause cells containing streams to be higher rather than lower than the surrounding cells on the MOD-FLOW grid.GSFLOW-GRASS's hydrological correction addresses this by enforcing integrated subsurface drainage (Sect.3.2.2),thus preventing improper accumulation and subsequent lateral leakage of water behind "dammed" stream cells (Fig. 4).
The Shullcas-based simulation does not include glacier melt, but spatiotemporal results in Fig. 9 show that GSFLOW can be useful for evaluating the potential of groundwater to buffer surface-water resources in mountainous watersheds.The model shows that essentially constant baseflow supports low but reliable discharge during the dry season (Fig. 9d).The GSFLOW-GRASS post-processing visualization tools were useful for depicting the accumulation of streamflow throughout the drainage network (Fig. 9b) and water table depths that were shallowest in low and flat areas (Fig. 9c).

Conclusions
To address the need for a fully automated and freely accessible software that handles the complete workflow for implementing complex hydrologic models, we have created GSFLOW-GRASS, a bundled tool kit for the coupled surface-water and groundwater model GSFLOW, using open-source Python scripts and GRASS GIS commands.GSFLOW-GRASS allows users equipped with a DEM, precipitation and temperature data, and basic knowledge about land surface and subsurface properties to efficiently construct Geosci.Model Dev., 11, 4755-4777, 2018 www.geosci-model-dev.net/11/4755/2018/watershed-scale hydrologic simulations.In order to create a robust tool that can be widely implemented over diverse hydro(geo)logic settings, we built a set of GRASS GIS extensions that automatically discretizes a topological surfacewater flow network that is linked with an underlying gridded groundwater domain.Our fully automated and generalized toolbox advances the accessibility of complex hydrologic software and will thus broaden the reach of integrated hydrologic models and their usage in both scientific research and practical resource management.
We have demonstrated GSFLOW-GRASS using three diverse examples based on topographies and climates from the intensively farmed Upper Midwest region of the United States, Santa Rosa Island off the coast of California, USA, and the water-stressed Andes.The results show that the new and automated GRASS GIS extensions can automatically and consistently build topologically complete linked surface and subsurface flow domains in settings that are typically challenging for standard GIS tools, including lowrelief terrains that lack integrated drainage, irregular coastal boundaries, and steep topographies.Although uncalibrated, these examples further demonstrate that GSFLOW-GRASS is a flexible tool for investigating the role of groundwatersurface-water interactions in imposing possible water-quality threats in agricultural and recreational watersheds, controlling runoff in erosion-prone landscapes, and modulating dry season discharge.
We designed GSFLOW-GRASS to strike a balance between direct "out-of-the-box" functionality and full flexibility for customizing model runs.A default implementation can be launched with no programming required by the user to readily produce preliminary uncalibrated simulations that can serve as a springboard for further model parameter adjustment through the fully commented tool kit scripts.A key feature of GSFLOW-GRASS is its use of all open-source software, enabling users anywhere to apply GSFLOW.We believe that the open-source platform will facilitate future toolbox enhancements through efforts by not only the original GSFLOW-GRASS developer team, but also new model users.We envision a number of new capabilities to tackle the grand challenge of handling spatial heterogeneity in integrated hydrologic models.Higher-resolution land surface variability could be achieved by further subdividing subbasins according to vegetation, soil type, or other geographic features to produce HRUs.Obtaining spatially variable information can be facilitated by linking GSFLOW-GRASS to existing regional and international databases for meteorology, soil and geologic properties, and land cover.Further calibration of spatially distributed parameters can be carried out by directly setting up GSFLOW-GRASS with a flexible inverse modeling code (e.g., Doherty, 1994;Poeter andHill, 1998, 1999).It is our hope that with its generalized form and open access, GSFLOW-GRASS can become a community tool that continues to grow to better solve hydrologic and water resource problems of both scientific and general management concerns.

Figure 1 .
Figure 1.Major features of the GSFLOW-GRASS geometry.(a)Each segment is one link in the network.At each node, two tributary segments combine to flow into a single segment.Each is numbered.They need not be in any particular order, as indicated, but a downstream-increasing numbering scheme is required for updated inflows to all segments to be computed during the same iteration.(b) Flow in each of the subbasin HRUs is routed directly to a corresponding stream segment.The arrow on the upper left indicates that flow from outside of the representative tributary junction may also be part of the drainage network.Our topological approach to defining HRUs allows HRUs to be numbered the same as the stream segments that they enclose.Our code is written in such a way that future developments can relax this symmetry.(c) MOD-FLOW operates on a grid that underlies the PRMS-based stream network and HRUs; each cell has a unique ID that is sequentially numbered.(d) "Gravity reservoirs" are defined by the intersection of the PRMS HRUs and the MODFLOW grid."Reaches" are defined as the section of each PRMS stream segment that lies within a single MODFLOW grid cell and are numbered sequentially downstream as shown.

Figure 2 .
Figure2.Soil-water storage reservoirs in the PRMS component of GSFLOW.Within each HRU, soil-water accounting calculations are carried out for three conceptual reservoirs in the order of increasing water storage and according to user-specified parameters.Climate forcing applies to the capillary reservoir; the gravity reservoir exchanges water with the deeper unsaturated and saturated zones represented by the MODFLOW component of GSFLOW, and Dunnian runoff and fast interflow occur in the preferential flow reservoir.(Adapted fromMarkstrom et al., 2008, Fig. 12.)

Figure 4 .
Figure 4. Water table depths (a) without and (b) with hydrologic correction.Circled areas display where the hydrologic correction produced continuous shallow groundwater through the channel network.Rectangles indicate where the hydrologic correction allowed water held behind artificial dams to drain.This example is from the Shullcas River watershed (Sect.4.3).

Figure 5 .
Figure 5.Our test sites include the high Andes, a mountainous island, and a formerly glaciated Mississippi tributary.

Figure 6 .
Figure 6.Model based on Cannon River, Minnesota, USA.(a) Map with MODFLOW grid, HRU outlines, stream segments (blue), and digital elevation model.(b) Simulated discharge after an 11 cm rainfall event.(c, d) Relatively low-gradient hydraulic head distributions in two MODFLOW layers representing an upper glacial till unit (low hydraulic conductivity) and lower fractured carbonate bedrock (higher hydraulic conductivity), with elevation contours (m a.s.l.).(e) A 3-year hydrograph showing uncalibrated discharge simulations matching observations reasonably well during nonpeak flood times but poorly during many of the actual peaks.

Figure 7 .
Figure 7. Changes in model output reproducibility, model runtime, and domain-building time with changes in both HRU resolution (drainage threshold) and MODFLOW resolution (MODFLOW grid cell area).(a) Root mean square error in basin discharge between each simulation and the highest-resolution simulation.(b) Time required to set up the input parameters and run the model.(c) Time required to build the domain.All computations were run on a Project Sputnik Dell XPS13 first-generation laptop with an Intel i7-4510U 2.0 GHz CPU and 8 GB of RAM running Ubuntu 18.04.

Figure 8 .
Figure 8. Model based on Water Canyon, Santa Rosa Island, California, USA.(a) Map with MODFLOW grid, HRU outlines, stream segments (blue), and digital elevation model.(b) Streamflow accumulation through the drainage network.(c) The modeled water table distribution with elevation contours (m a.s.l.).(d) Spatially variable hydraulic conductivity structure, with hydraulic conductivity increasing near the channel to represent alluvium and colluvium.(e) Simulated surface runoff contributions to catchment-wide discharge compared with precipitation.

Figure 9 .
Figure 9. Model based on Río Shullcas, Peru.(a) Map with MODFLOW grid, HRU outlines, stream segments (blue), and digital elevation model.(b) Streamflow accumulation through the mountainous drainage network.(c) The modeled water table distribution with elevation contours (m a.s.l.).(d) Seasonally variable precipitation and streamflow.

Domain builder (GRASS) Input file builder Run GSFLOW Visualization
, Visual Modflow (Waterloo

Table 2 .
Catchment and hydrological characteristics of the GSFLOW-GRASS example sites.Precipitation statistics from 12 May 1938 to 5 November 1943 (Cannon); 23 April 1990 to 27 September 2017 (Water Canyon); 26 August 2013 to 29 September 2016 (Shullcas)."Flow-routingcell size" is the original DEM resolution used to construct the stream network and irregular HRU cells, which are ultimately coarser sized ("Min.HRU area").CV: coefficient of variation.