Articles | Volume 16, issue 15
https://doi.org/10.5194/gmd-16-4481-2023
Model description paper | 08 Aug 2023

DynQual v1.0: a high-resolution global surface water quality model

Edward R. Jones, Marc F. P. Bierkens, Niko Wanders, Edwin H. Sutanudjaja, Ludovicus P. H. van Beek, and Michelle T. H. van Vliet
Abstract

Maintaining good surface water quality is crucial for protecting ecosystem health and safeguarding human water use activities. However, our quantitative understanding of surface water quality is mostly predicated upon observations at monitoring stations that are highly limited in space and fragmented across time. Physical models based upon pollutant emissions and subsequent routing through the hydrological network provide opportunities to overcome these shortcomings. To this end, we have developed the dynamical surface water quality model (DynQual) for simulating water temperature (Tw) and concentrations of total dissolved solids (TDS), biological oxygen demand (BOD) and fecal coliform (FC) with a daily time step and at 5 arcmin (∼10 km) spatial resolution. Here, we describe the main components of this new global surface water quality model and evaluate model performance against in situ water quality observations. Furthermore, we describe both the spatial patterns and temporal trends in TDS, BOD and FC concentrations for the period 1980–2019, and we also attribute the dominant contributing sectors to surface water pollution. Modelled output indicates that multi-pollutant hotspots are especially prevalent across northern India and eastern China but that surface water quality issues exist across all world regions. Trends towards water quality deterioration have been most profound in the developing world, particularly sub-Saharan Africa and South Asia. The model code is available open source (https://doi.org/10.5281/zenodo.7932317, Jones et al., 2023), and we provide global datasets of simulated hydrology, Tw, TDS, BOD and FC at 5 arcmin resolution with a monthly time step (https://doi.org/10.5281/zenodo.7139222, Jones et al., 2022b). These data have the potential to inform assessments in a broad range of fields, including ecological, human health and water scarcity studies.

1 Introduction

Maintaining good surface water quality is important for protecting ecosystem health and ensuring human access to safe water resources for a diverse range of sectoral needs (Van Vliet et al., 2021; Jones et al., 2022a). For example, high organic pollution can reduce oxygen availability and can lead to the suffocation of aquatic organisms (Sirota et al., 2013), while pathogen pollution represents a potential health risk for people exposed to this water. The consumption of contaminated drinking water can lead to the transmission of diseases such as cholera, dysentery and polio, which cause an estimated 485 000 deaths annually (Prüss-Ustün et al., 2019). Another example is salinization of water resources, which can both limit irrigation water use (Thorslund et al., 2022) and threaten freshwater biodiversity (Velasco et al., 2019) where species cannot tolerate elevated salinity concentrations. Similarly, increased water temperatures can disrupt energy production (Van Vliet et al., 2016), and also provide more favourable conditions for cyanobacterial blooms that can lead to hypoxia, which can disrupt freshwater habitats (Smucker et al., 2021).

Human activity, both directly and indirectly, causes changes in surface water quality relative to ambient (“pristine”) conditions. Indirectly, altered precipitation patterns and the increased frequency of hydro-meteorological extremes that result from human-induced climate change can lead to fundamental changes in the hydrological regime (Wanders and Wada, 2015; Gudmundsson et al., 2021). Lower water levels due to altered seasonality patterns or droughts reduce the stream dilution capacity, which can increase the proportion of streamflow originating from (polluted) point sources (Wright et al., 2014; Luthy et al., 2015; Ehalt Macedo et al., 2022). Both of these factors increase river water contamination, threatening both the safe usability of water and environmental health. Climate change is also altering the thermal regime of rivers (Van Vliet et al., 2013), with higher temperatures also causing dissolved oxygen depletion (Ozaki et al., 2003).

More directly, sectoral activities generate return flows: water that is extracted for a specific purpose but is not consumed (evaporated) in the process but which has changed in composition as a result of the water use activity (Sutanudjaja et al., 2018; Jones et al., 2021). For example, the composition of domestic wastewater will reflect the various household water uses, including organic and fecal contamination from human waste (WWAP, 2017) and elevated nutrient concentrations from household chemicals and laundry detergents (Van Puijenbroek et al., 2019). The reintroduction of these flows back to the environment represents a significant source of pollutant loadings that degrade river water quality (Jones et al., 2022a). Collection and treatment of these flows before their reintroduction into the environment can help to minimize the impact on surface water quality (Jones et al., 2022a). However, these processes can be economically expensive to establish and operate, and hence collection and treatment infrastructure is not ubiquitous worldwide (Jones et al., 2021, 2022a).

Water quality is an integral part of the Sustainable Development Agenda, cross-cutting almost all Sustainable Development Goals (SDGs). Despite widespread recognition of its importance, water quality monitoring data are still severely lacking in several world regions – particularly Africa and central Asia (Damania et al., 2019). Furthermore, in regions where observation data are available, data are often sparse in both space and time. Water quality models offer opportunities to overcome these limitations (Hofstra et al., 2013; Beusen et al., 2015; UNEP, 2016; Van Vliet et al., 2021). As opposed to statistical models, which heavily rely on observed water quality data, physical models simulate the emission and transport of pollutant loadings along the river network directly based on climatic, hydrological and socio-economic input data. This makes physically based model approaches especially advantageous when simulating water quality in ungauged catchments and for projecting water quality under future (uncertain) climatic and socio-economic developments (Wanders et al., 2019).

A spatially and temporally detailed assessment of multiple water quality constituents at the global scale is lacking. Furthermore, only a few studies have quantitatively evaluated temporal dynamics and trends in water quality over extended time periods, particularly considering changes in factors that drive higher pollutant emissions (e.g. population growth, industrialization) relative to factors that abate pollutant emissions (e.g. wastewater treatment). Lastly, few studies have assessed the spatio-temporal patterns in the specific sectoral activities that are driving patterns in surface water quality worldwide.

Here, we present a high-spatio-temporal-resolution surface water quality model (henceforth DynQual), which can currently be used to simulate water temperature (Tw); concentrations of total dissolved solids (TDS) to represent salinity pollution; biological oxygen demand (BOD) to represent organic pollution; and fecal coliform (FC) as a coarse indicator for pathogen pollution. All simulations are provided at a daily time step with a spatial resolution of 5×5 arcmin (approx. 10 km at the Equator). DynQual considers a wide range of hydro-climatic and socio-economic drivers, spanning the major contributing pollutant sources. The high spatio-temporal resolution of DynQual, combined with these features, allows the model to address scientific questions that cannot currently be addressed using existing surface water quality models. For example, while previous work has compared pollutant loads (masses) originating from different sources at aggregated spatial scales (i.e. basin or subbasin level), the impact on in-stream concentrations – which is also dependent upon spatio-temporal variability in dilution capacity and in-stream decay processes – has not been assessed.

The objectives of this study are to (1) introduce a new open-source global surface water quality model and evaluate model performance; (2) assess spatial patterns and trends in surface water quality, focussing on total dissolved solids (TDS), biological oxygen demand (BOD), and fecal coliform (FC) concentrations for the period 1980–2019; and (3) demonstrate additional model capabilities by assessing the sector-specific contributions towards surface water pollution across both space and time.

2 Model description

2.1 General overview

The newly developed DynQual model builds on the modelling framework of DynWat, a global water temperature model that solves the energy–water balance to simulate daily water temperature (Tw) and ice thickness (Van Beek et al., 2012; Wanders et al., 2019). A full model description including the energy balance equations and the representation of ice cover, floodplains, channel roughness and lakes and reservoirs within DynWat is available in published literature (Wanders et al., 2019). DynQual further includes the impact of heat dumps produced in thermo-electric powerplants (Van Vliet et al., 2012a, 2021) on water temperature. In addition to water temperature, DynQual simulates daily in-stream concentrations of three water quality constituents, namely, total dissolved solids (TDS), biological oxygen demand (BOD) and fecal coliform (FC), which are of key social and environmental relevance (Van Vliet et al., 2021) (Fig. 1).

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f01

Figure 1. Overview of the required input data for running DynQual in different model configurations. Runs coupled with PCR-GLOBWB2 require socio-economic (arrow 1) and climatic forcing (3, 4) data as standard, with options to either (1) estimate loads based on additional socio-economic (2) and simulated hydrological (6) data or (2) provide pollutant loadings directly as input data (8). Offline runs require both hydrological (5) and pollutant loading (8) input data to be provided directly.


We also offer two options for running DynQual: (1) in a stand-alone configuration with specific discharge (i.e. baseflow, interflow and direct runoff in m d−1) fed from any land surface or hydrological model or (2) coupled with the global hydrological and water resources model PCR-GLOBWB2 (Sutanudjaja et al., 2018). The routine for surface water (and pollutant) routing follows an eight-point steepest-gradient algorithm across the terrain surface (local drainage direction) in a convergent drainage network with the lowermost cell connected to either the ocean or an endorheic basin as per PCR-GLOBWB2 (Sutanudjaja et al., 2018) and DynWat (Van Beek et al., 2012; Wanders et al., 2019). Routing within DynQual uses the kinematic wave approximation of the Saint-Venant equations with flow described by Manning's equation, solved using a time-explicit variable sub-time-stepping scheme based on the minimum Courant number (Sutanudjaja et al., 2018). In the coupled configuration, surface waters are subject to water withdrawals and return flows from the domestic, industrial, livestock and irrigation sectors calculated within the water use module of PCR-GLOBWB2. A complete model description of PCR-GLOBWB2 including detailed information on the model structure, individual modules (meteorology, land surface, groundwater, surface water routing and water use) and validation of hydrological output is available in published literature (Sutanudjaja et al., 2018). In both configurations of DynQual, pollutant loadings can be prescribed directly (akin to a forcing). Alternatively, when running DynQual coupled with PCR-GLOBWB2 pollutant loadings can be simulated within the model runs by providing only simple input data (Sect. S1 in the Supplement). An overview of DynQual, which details the input data required for the different model configurations, is displayed (Fig. 1). 
These options provide flexibility. Imposing pollutant loadings directly on the model lets users estimate loadings with their preferred methodology and assumptions, whereas estimating pollutant loadings within the model run lets users simulate water quality without any pre-processing while still choosing their preferred input datasets. Parameter values related to pollutant emissions can be adjusted by the user as desired. When simulating pollutant loadings within model runs, it is also possible to quantify the contribution and relative importance of different water use sectors to the spatial patterns and temporal trends in surface water quality.

As per PCR-GLOBWB2 (Sutanudjaja et al., 2018) and DynWat (Wanders et al., 2019), DynQual is written in Python 3 and is run using an initialization (.ini) file in which key aspects of the model run are defined (e.g. spatial extent, simulation period, paths to parameter and forcing files). Most input files required and all output files are in NetCDF format. Global 5 arcmin DynQual runs that are coupled with PCR-GLOBWB2 have a wall-clock time of approximately 6 h yr−1 when run with parallelization, owing to the requirement to use the kinematic wave routing option for higher-accuracy discharge and water temperature simulations. This is approximately equivalent to the PCR-GLOBWB2 run times given by Sutanudjaja et al. (2018). DynQual runs performed in the stand-alone configuration are faster (∼20 %).

2.2 Water quality equations

2.2.1 Water temperature (Tw)

Water temperature (Tw) is simulated by solving the surface water energy balance using the DynWat model as basis (Van Beek et al., 2012; Wanders et al., 2019). In addition to solving the surface water energy balance, DynWat also accounts for surface water abstraction, reservoirs, riverine flooding and the formation of ice (Wanders et al., 2019). Here, we further develop DynWat to include advected heat flows from thermo-electric powerplants, as per the method described in van Vliet et al. (2012b, 2016). The modelling equations for Tw incorporated into DynQual are shown in Eq. (1) and are fully elaborated on in previous work (Van Beek et al., 2012; Van Vliet et al., 2012a, 2016; Wanders et al., 2019):

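$$
\begin{aligned}
\rho_w C_p \frac{\partial (h T_w)}{\partial t} &= \rho_w C_p \frac{\partial (v T_w)}{\partial x} + H_{\mathrm{tot}} + \rho_w C_p \int_{x=0}^{dx} q_s T_s + \frac{Tw_{\mathrm{pow},n}}{h\,w\,x}, \\
H_{\mathrm{tot}} &= S_{\mathrm{in}} \left(1 - a_w\right) + L_{\mathrm{in}} - L_{\mathrm{out}} - H - LE, \\
Tw_{\mathrm{pow},n} &= \rho_w C_p \, RF_{\mathrm{pow},n} \, \Delta T_{\mathrm{pow\_rf}},
\end{aligned}
\tag{1}
$$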

where t is time, x is location along the drainage network, Tw is water temperature (K), Cp is the specific heat capacity of water (4190 J kg−1 K−1), ρw is the density of fresh water (1000 kg m−3), h is the stream water depth (m), v is the velocity of water (m s−1), Htot is the heat flux at the air–water interface (J m−2 s−1), Sin is the incoming shortwave radiation (J m−2 s−1), aw is the albedo of the water surface such that Sin(1−aw) is the shortwave radiation that is not reflected (J m−2 s−1), Lin is the incoming longwave radiation (J m−2 s−1), Lout is the outgoing longwave radiation (J m−2 s−1), H is the sensible heat flux (J m−2 s−1), LE is the latent heat flux (J m−2 s−1), qs is the lateral water flux from land to stream (m s−1), Ts is the temperature of lateral water fluxes (K), Twpow,n is the heat dump from thermo-electric powerplants (J s−1), RFpow,n is the return flow of cooling water from thermo-electric powerplants (m3 s−1), ΔTpow_rf is the difference in water temperature between the return flows and ambient river water (K), w is the stream width (m), and dx is the distance between grid cell n and the upstream grid cell n−1 (m).
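The two algebraic parts of Eq. (1), the surface heat flux Htot and the powerplant heat dump, can be illustrated with a minimal Python sketch. The function names and all input values below are hypothetical and chosen only for illustration; the constants for ρw and Cp are those quoted in the text. This is not the DynQual implementation.

```python
RHO_W = 1000.0  # density of fresh water (kg m-3), as quoted in the text
C_P = 4190.0    # specific heat capacity of water (J kg-1 K-1), as quoted in the text


def net_surface_heat_flux(s_in, albedo, l_in, l_out, h_sens, le):
    """H_tot = S_in (1 - a_w) + L_in - L_out - H - LE, all terms in J m-2 s-1."""
    return s_in * (1.0 - albedo) + l_in - l_out - h_sens - le


def powerplant_heat_dump(return_flow, delta_t):
    """Heat dump (J s-1) = rho_w * C_p * RF_pow (m3 s-1) * dT_pow_rf (K)."""
    return RHO_W * C_P * return_flow * delta_t


# Made-up inputs for a single grid cell (assumptions, not model data).
h_tot = net_surface_heat_flux(
    s_in=200.0, albedo=0.1, l_in=300.0, l_out=380.0, h_sens=20.0, le=60.0
)
heat_dump = powerplant_heat_dump(return_flow=2.0, delta_t=3.0)
print(f"H_tot = {h_tot:.1f} J m-2 s-1, heat dump = {heat_dump:.3e} J s-1")
```

With these made-up inputs, a cooling-water return flow of 2 m3 s−1 that is 3 K warmer than the river adds roughly 25 MJ s−1 of heat to the grid cell.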

2.2.2 Conservative (TDS) and non-conservative (BOD, FC) substances

Our modelling strategy for total dissolved solids (TDS), biological oxygen demand (BOD) and fecal coliform (FC) is a mass balance approach assuming transport by advection only, whereby sector-specific loadings (i.e. masses of pollutants generated from a particular human activity in a given time period) are accumulated from all contributing sectors and routed through the global stream network until outflow to the ocean or an endorheic basin (Thomann and Mueller, 1987; Chapra et al., 2008; Voß et al., 2012; UNEP, 2016; Van Vliet et al., 2021).

TDS is modelled as a conservative substance, while BOD and FC are modelled as non-conservative substances that include first-order decay processes (Voß et al., 2012; Reder et al., 2015; UNEP, 2016; Van Vliet et al., 2021). Our approach for both the conservative and non-conservative substances assumes instantaneous and full mixing of all streamflow and return flows in each grid cell. As per most water quality models, DynQual simulates water quality per individual grid cell over a consecutive series of discrete time periods (Loucks and Beek, 2017). Each grid cell represents a volume element, which is in steady-state conditions within each time period and also contains a (fully mixed) pollutant mass (Fig. 2). In each consecutive time step, there is an associated volume of water and mass of pollutant that flows into the grid cell from upstream and that flows out of the grid cell to the downstream grid cell. For non-conservative substances, there are also grid-cell-specific in-stream decay processes that influence the total mass of pollutant in each sub-time interval. DynQual simulates these transport and decay processes with a sub-daily interval (Δt in seconds), the length of which is determined with respect to channel characteristics and discharge (Sect. S2 and Eq. S9 in the Supplement).

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f02

Figure 2. Schematic overview of DynQual, including a translation of the local hydrological and socio-economic situation into a local drain direction (LDD) map that includes hydrological and pollutant fluxes and a representation of the grid-cell-based processes (pollutant emission calculation, routing procedure and computation of pollutant concentrations) in an individual DynQual grid cell. $C_{p,n}$ is the concentration of pollutant p (e.g. mg L−1), while $M_{p,n}$ is the total mass of pollutant p (e.g. g) and $V_n$ is the channel storage (m3), all of which are in grid cell n. $V_n^{t=0}$ is the volume of channel storage from the previous time step (m3), while $Q_{n-1}^{n}$ and $Q_{n}^{n+1}$ are the discharge (m3 s−1) into and out of grid cell n, respectively, per time step Δt. $M_{p,n}^{t=0}$ is the mass of pollutant p from the previous time step, while $RL_{p,n-1}^{n}$ and $RL_{p,n}^{n+1}$ are the loadings of pollutant p (e.g. g s−1) that are routed into and out of grid cell n, respectively, per time step Δt. $L_{p,n}$ are the combined local loadings of pollutant p (e.g. g d−1) in grid cell n, which is the sum of loadings from all contributing sectors and urban surface runoff. $k_{p,n}$ represents a decay coefficient, which depends upon pollutant p (–). D is the length of a day in seconds (i.e. 86 400 s d−1), while Δt is the length of the sub-time step (s), which is linked to the internal routing regime within DynQual and PCR-GLOBWB2. $P_n$ is precipitation (m3 d−1), and $E_n$ is evapotranspiration (m3 d−1), with these terms included as an example of grid-cell-specific hydrological fluxes. For a more detailed overview of the hydrological fluxes within a grid cell we refer to the PCR-GLOBWB2 documentation (Sutanudjaja et al., 2018).


The pollutant concentration at each subsequent time interval ($t+\Delta t$) is calculated following Eq. (2). It should be noted that, while we simulate the terms of this equation with a sub-daily time step interval, DynQual only reports concentrations in the final sub-daily interval of each day. This is because sub-diurnal input data are lacking, because daily output allows efficient data storage, and because such high-resolution output has little relevance for our large-scale modelling approach.

$$
C_{p,n}^{t+\Delta t} = \frac{M_{p,n}^{t+\Delta t}}{V_n^{t+\Delta t}} + BG_{p,n},
\tag{2}
$$

where $C_{p,n}^{t+\Delta t}$ and $M_{p,n}^{t+\Delta t}$ are the concentration and mass, respectively, of pollutant p in grid cell n at the consecutive time interval ($t+\Delta t$), whereas $V_n^{t+\Delta t}$ is the volumetric channel storage (m3) in this grid cell in the same interval. $V_n^{t+\Delta t}$ is simulated directly within PCR-GLOBWB2, accounting for the initial storage, discharge into and out of grid cell n over the time interval Δt, and grid-cell-specific hydrological fluxes including precipitation and evapotranspiration (Sutanudjaja et al., 2018). $M_{p,n}^{t+\Delta t}$ is simulated by solving the mass balance equation for pollutant p and accounting for in-stream decay processes following Eq. (3). $BG_{p,n}$ represents the background concentration of pollutant p in grid cell n. For TDS, background concentrations are estimated from the minimum observed electrical conductivity (EC), converted to TDS (Walton, 1989), in a new global salinity dataset (Thorslund and Van Vliet, 2020) and are applied as constant values. Conversely, $BG_{\mathrm{BOD},n}$ and $BG_{\mathrm{FC},n}$ are assumed to be negligible relative to the mass of pollution produced by anthropogenic activities.
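As a small worked illustration of Eq. (2), the sketch below converts a pollutant mass and a channel storage into a concentration and adds a constant background. The function name, the input values and the handling of an empty channel (returning NaN) are assumptions made for this example and are not taken from the DynQual source code.

```python
import math


def concentration(mass_g, volume_m3, background_mg_per_l=0.0):
    """Eq. (2): C = M / V + BG.

    A mass in g divided by a volume in m3 gives g m-3, which equals mg L-1.
    Returning NaN for an empty channel is an assumption of this sketch.
    """
    if volume_m3 <= 0.0:
        return math.nan
    return mass_g / volume_m3 + background_mg_per_l


# Made-up TDS example: 5 t of dissolved solids in 20 000 m3 of channel storage,
# on top of a hypothetical background concentration of 50 mg L-1.
tds = concentration(mass_g=5.0e6, volume_m3=2.0e4, background_mg_per_l=50.0)
print(f"TDS = {tds:.0f} mg L-1")
```

The same mass in half the channel storage would double the anthropogenic part of the concentration, which is why dilution capacity matters as much as the loading itself.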

$$
M_{p,n}^{t+\Delta t} = \left[ M_{p,n}^{t=0} + \left( RL_{p,n-1}^{n} - RL_{p,n}^{n+1} + \frac{L_{p,n}}{D} \right) \Delta t \right] e^{-k_{p,n} \left( \frac{\Delta t}{D} \right)},
\tag{3}
$$

where at the subsequent time step interval ($t+\Delta t$) each grid cell n contains the mass of pollutant p from the previous time step ($M_{p,n}^{t=0}$) plus the pollutant load (mass s−1) that has been transported from the immediately adjacent upstream grid cell(s) ($RL_{p,n-1}^{n}$) and minus the pollutant load (mass s−1) that has been transported downstream ($RL_{p,n}^{n+1}$) in the time interval Δt (s). $L_{p,n}$ represents the daily influx of pollutant loadings into grid cell n (mass d−1), which is added to the stream in equal increments per sub-daily time step Δt (s) relative to the total length of a day D in seconds (i.e. 86 400 s d−1). Adding local pollutant loadings in equal increments per sub-daily time step is necessary as we lack information regarding the (sub-diurnal) timing at which pollution enters the stream network.
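The sub-daily mass update described above can be sketched as follows. This is one reading of Eq. (3), assuming that routed and local loads are first added to the stored mass and that first-order decay is then applied to the result; the function and variable names are hypothetical and the numbers are made up.

```python
import math

SECONDS_PER_DAY = 86_400.0  # D in the text (s d-1)


def update_mass(mass_prev, load_in, load_out, local_load_per_day, k_per_day, dt):
    """Advance the pollutant mass in one grid cell by one sub-time step dt (s).

    load_in and load_out are routed loads (mass s-1); local_load_per_day is the
    local loading (mass d-1), spread in equal increments over the day.
    Applying decay to the updated mass is an assumption of this sketch.
    """
    added = (load_in - load_out + local_load_per_day / SECONDS_PER_DAY) * dt
    decay = math.exp(-k_per_day * dt / SECONDS_PER_DAY)
    return (mass_prev + added) * decay


# One hourly step for a conservative substance (k = 0) and a decaying one.
conservative = update_mass(1000.0, 2.0, 1.0, 86_400.0, k_per_day=0.0, dt=3600.0)
decaying = update_mass(1000.0, 2.0, 1.0, 86_400.0, k_per_day=0.35, dt=3600.0)

# Twenty-four hourly steps with no routing and no decay add exactly the daily load.
mass = 0.0
for _ in range(24):
    mass = update_mass(mass, 0.0, 0.0, 86_400.0, k_per_day=0.0, dt=3600.0)
print(conservative, decaying, mass)
```

Setting k to zero reproduces the conservative (TDS) case, so a single update function can serve all three constituents in this sketch.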

The variable $k_{p,n}$ represents the decay rate (d−1), which is specific to pollutant p and grid cell n. While we model TDS as a conservative substance (i.e. $k_{\mathrm{TDS},n}=0$), we determine the first-order degradation rate of BOD ($k_{\mathrm{BOD},n}$) as a function of water temperature (Eq. 4) and of FC ($k_{\mathrm{FC},n}$) as a function of water temperature, solar radiation and sedimentation (Eq. 5). Decay is implemented directly into DynQual by assuming that decay occurs at an equal rate over the course of a day ($\Delta t / D$). This assumption is necessary because we do not have sub-daily input data for some terms of the decay equations, such as water temperature (Tw) and incoming solar radiation (Io).

$$
k_{\mathrm{BOD},n} = k_{20} \, \Theta^{\left( T_{w,n} - 20 \right)},
\tag{4}
$$

where $k_{20}$ is the first-order degradation rate coefficient at 20 °C (d−1), assumed to be 0.35 (Van Vliet et al., 2021), $T_{w,n}$ is the water temperature (°C) in grid cell n and Θ is a temperature correction coefficient assumed to be 1.047 as per previous assessments (Wen et al., 2017; Van Vliet et al., 2021).
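A minimal sketch of Eq. (4), using the coefficient values quoted above (0.35 d−1 at 20 °C and a temperature correction of 1.047), shows how the BOD degradation rate responds to water temperature. The function name and the example temperatures are illustrative only.

```python
def k_bod(tw_celsius, k20=0.35, theta=1.047):
    """Eq. (4): first-order BOD degradation rate (d-1) at water temperature Tw (deg C)."""
    return k20 * theta ** (tw_celsius - 20.0)


# Degradation rates for a cold, a reference and a warm river (made-up temperatures).
for tw in (10.0, 20.0, 30.0):
    print(f"Tw = {tw:4.1f} degC -> k_BOD = {k_bod(tw):.3f} d-1")
```

Because the temperature dependence is exponential, a 10 °C warming raises the rate by a factor of 1.047 to the power 10, which is roughly 1.6.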

$$
k_{\mathrm{FC},n} = k_d \, \Theta^{\left( T_{w,n} - 20 \right)} + k_s \frac{I_o}{k_e H} \left( 1 - e^{-k_e H} \right) + \frac{v}{H},
\tag{5}
$$

where $k_d$ is the dark inactivation rate (d−1), Θ is a temperature correction coefficient, $T_{w,n}$ is the water temperature (°C) in grid cell n, $k_s$ is the sunlight inactivation coefficient (m2 W−1), $I_o$ is the surface solar radiation (W m−2), $k_e$ is an attenuation coefficient (m−1), H is stream depth (m) and v is the settling velocity (m d−1). Parameter values (Table 1) and mean basin average total suspended solids (Beusen et al., 2005) are based on previous fecal coliform modelling studies (Reder et al., 2015). Parameter values, including decay coefficients, can alternatively be defined by the user directly in the source code.
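The three additive terms of Eq. (5) (dark inactivation, sunlight inactivation and settling) can be made explicit in a short sketch. The parameter values passed in the example call are placeholders invented for illustration; they are not the Table 1 values, which are not reproduced in this text. The function name is likewise hypothetical.

```python
import math


def k_fc(tw_celsius, solar_w_m2, depth_m, k_d, theta, k_s, k_e, v_settle):
    """Eq. (5): fecal coliform decay rate as the sum of three loss terms."""
    dark = k_d * theta ** (tw_celsius - 20.0)
    # Depth-averaged light term: radiation is attenuated exponentially with depth.
    light = k_s * solar_w_m2 / (k_e * depth_m) * (1.0 - math.exp(-k_e * depth_m))
    settling = v_settle / depth_m
    return dark + light + settling


# Placeholder parameters (NOT the Table 1 values) for a 2 m deep river at 20 degC.
params = dict(k_d=0.8, theta=1.07, k_s=0.001, k_e=0.5, v_settle=0.2)
shaded = k_fc(20.0, solar_w_m2=0.0, depth_m=2.0, **params)
sunny = k_fc(20.0, solar_w_m2=250.0, depth_m=2.0, **params)
print(f"k_FC shaded = {shaded:.3f}, sunny = {sunny:.3f}")
```

Deeper water lowers both the light term and the settling term, so for a given temperature the decay rate in this formulation is highest in shallow, sunlit reaches.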

Table 1. Assumed parameter values for fecal coliform modelling.


2.3 Pollutant loadings

In both model configurations (stand-alone or coupled to PCR-GLOBWB2), user-defined pollutant loadings can be directly imposed on the model (akin to a forcing). Users can estimate pollutant loadings using their preferred methodology, and subsequently route these through the global stream network, account for in-stream decay processes and calculate in-stream pollutant concentrations using the DynQual model framework. Pollutant loadings that are prescribed to DynQual directly should have a daily temporal resolution (e.g. g d−1 or 10^6 cfu d−1, where “cfu” indicates “colony forming units”).

Alternatively, when running DynQual coupled with PCR-GLOBWB2, pollutant loadings (with a daily temporal resolution) can be simulated within the model runs, requiring only simple input data (Fig. 1 and Sect. S1). This option is beneficial for users that do not have pre-calculated pollutant loadings. Furthermore, this option may be useful for those interested in scenario modelling, as input files related to different scenarios can be altered to reflect alternative climate and socio-economic conditions.

In this set-up, DynQual estimates and routes pollutant loadings individually and combined for the main water use sectors (domestic, manufacturing, livestock and irrigation) and from urban surface runoff at 5 arcmin spatial resolution. Loadings from the domestic sector are estimated by multiplying the gridded population with region-specific per capita excretion rates (Sect. S1.1, Table S1 in the Supplement). For the manufacturing sector, a mean effluent concentration is multiplied by location-specific gridded estimates of return flows from the manufacturing sector (Sect. S1.2, Table S2). Urban surface return flows are approximated by multiplying surface runoff (simulated by PCR-GLOBWB2) with the gridded urban fraction, which are multiplied by a region-specific mean urban surface runoff effluent concentration (Sect. S1.3; Table S3). The livestock sector is sub-divided into “intensive” and “extensive” production systems based on livestock densities to better account for differences in the paths by which waste enters the stream network (Sect. S1.4, Table S4). Gridded livestock numbers for buffalo, chickens, cows, ducks, goats, horses, pigs and sheep are multiplied by pollutant excretion rates per livestock type and by region (Sect. S1.4, Tables S5–S7). TDS loadings from the irrigation sector are estimated by multiplying irrigation return flows simulated by PCR-GLOBWB2 with spatially explicit mean irrigation drainage concentrations based on salinity (as indicated by electrical conductivity) over the topsoil and sub-soil (Sect. S1.5). Thermal effluents (heat dumps) from thermoelectric powerplants are included as point sources of advected heat by considering the temperature difference between the flows and ambient surface water temperature conditions (Sect. S1.6). Pollutant loadings from the domestic, manufacturing and intensive livestock sectors and from urban surface runoff are abated based on grid-cell-specific wastewater practices. 
The proportion of pollutant loadings removed by wastewater treatment practices is estimated by multiplying the fraction of each treatment level occurring in a grid cell by the pollutant removal efficiency associated with that treatment level, as described in detail in previous work (Jones et al., 2021, 2022a).
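The domestic loading and wastewater abatement steps described above reduce to simple products and weighted sums, sketched below. The treatment level names, removal efficiencies, population and per capita excretion rate are invented for illustration and are not the values from Sect. S1 or Tables S1–S7; the function names are hypothetical.

```python
def removal_fraction(treatment_fractions, removal_efficiencies):
    """Share of the load removed: sum over treatment levels of fraction x efficiency."""
    return sum(
        fraction * removal_efficiencies[level]
        for level, fraction in treatment_fractions.items()
    )


def domestic_load(population, excretion_per_capita, removed):
    """Daily load reaching the stream: population x per capita rate x (1 - removed)."""
    return population * excretion_per_capita * (1.0 - removed)


# Hypothetical grid cell: shares of wastewater per treatment level (sum to 1)
# and made-up BOD removal efficiencies per level.
fractions = {"none": 0.3, "primary": 0.2, "secondary": 0.5}
efficiencies = {"none": 0.0, "primary": 0.25, "secondary": 0.85}

removed = removal_fraction(fractions, efficiencies)
load = domestic_load(population=10_000, excretion_per_capita=40.0, removed=removed)
print(f"removed = {removed:.3f}, BOD load = {load:.0f} g d-1")
```

Keeping the treatment shares and efficiencies as separate inputs mirrors the scenario use described in the text: a change in treatment coverage can be explored by editing the fractions alone.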

A detailed explanation of how pollutant loadings are estimated within DynQual is provided in Sect. S1, including equations (Eqs. S1–S8), data sources and all parameter estimates (Tables S1–S7).

3 Model demonstration

3.1 Model run setup

DynQual is run for the time period 1980–2019 using W5E5 forcing data (Cucchi et al., 2020; Stefan et al., 2021) in the configuration coupled with PCR-GLOBWB2. We used the standard parameterization of PCR-GLOBWB2 for hydrological simulations, as described in previous work (Sutanudjaja et al., 2018). The focus of our model demonstration is on TDS, BOD and FC, as results for Tw have been displayed extensively in previous work (Wanders et al., 2019). Pollutant loadings of TDS, BOD and FC are estimated within the model run at the daily time step using input data summarized in Table 2 and as detailed in Sects. 2.3 and S1. Both the meteorological forcing data and input data used for simulating pollutant loadings used in this study are accessible through links provided. We also provide the model code and full input data required for running an example catchment (Rhine basin) in the “Code and data availability statement”.

Table 2. Summary of key input data used for the estimation of pollutant loadings in the presented model application.


As for PCR-GLOBWB2 (Sutanudjaja et al., 2018) and the original water temperature model DynWat (Wanders et al., 2019), no calibration was performed. The process-based nature and global scale of DynQual, combined with strong spatial biases in observations (Fig. S2) and the large number of parameters that need to be estimated, complicate meaningful calibration. In addition, uncalibrated physical models can theoretically be applied in ungauged basins without loss of performance and are preferable for global change assessments with different climatic and socio-economic scenarios (Hrachowitz et al., 2013; Wanders et al., 2019).

3.2 Model evaluation

Model simulations were compared to observations from surface water quality monitoring stations worldwide at daily temporal resolution. Observed data were obtained from various state-of-the-art databases (Sect. S3.1). Water quality monitoring data cover the entire modelled time period (1980–2019) and include a far greater number of observations than in previous surface water quality modelling validation procedures (Table S8). However, monitoring stations are unevenly distributed across space, with a strong bias towards North America and western Europe for all water quality constituents (Fig. S2). Furthermore, observations at monitoring stations are highly fragmented across time, particularly for BOD and FC.

The overarching purpose and applications of a model, including large-scale water quality models (Beusen et al., 2015; UNEP, 2016), must be considered both for determining suitable metrics for model evaluation and for judging model performance. Given the approximations in the model, uncertainties in input data and the overall complexity in the drivers of pollutant loadings, the purpose of global water quality models is not to compute daily concentrations exactly (UNEP, 2016). The modelling strategy is thus to focus on the main spatial and temporal drivers of pollution in river networks globally to facilitate first-order approximations of in-stream concentrations. A key reason for implementing DynQual at 5 arcmin spatial resolution is the marked improvement in the performance of both PCR-GLOBWB2 (e.g. discharge) (Sutanudjaja et al., 2018) and DynWat (e.g. water temperature) (Wanders et al., 2019) at finer spatial resolution. Discharge and water temperature have an important influence on simulated in-stream concentrations due to dilution and in-stream decay processes, respectively.

Given these factors, combined with limitations in the observational records of surface water quality (Sect. S3.1), global water quality models have typically not been evaluated with metrics commonly used for hydrological modelling such as coefficients of determination, Nash–Sutcliffe efficiency (NSE) and Kling–Gupta efficiency (KGE) (Voß et al., 2012; Beusen et al., 2015; UNEP, 2016; Wen et al., 2017; Van Vliet et al., 2021), with the exception of water temperature simulations (Van Vliet et al., 2012b; Wanders et al., 2019). The model evaluation approach adopted for DynQual combines methods applied for the evaluation of other global water quality modelling efforts. Simulated TDS, BOD and FC concentrations are evaluated with respect to pollutant classes linked to key sectoral water quality thresholds (UNEP, 2016; Wen et al., 2017) (Sect. S3.2.1; Table S9) and statistically using normalized root-mean-square error (nRMSE) (Beusen et al., 2015; Van Vliet et al., 2021) (Sect. S3.2.2; Eq. S11). This provides an indication of prediction errors across the different water quality constituents comparable with previous large-scale water quality assessments. Conversely, the quality of water temperature simulations is evaluated using KGE (Sect. S3.2.2; Eq. S10). All four water quality constituents are also evaluated by considering long-term time series and multi-year annual cycles at individual monitoring stations (Sect. S3.2.3), which we present for the station with the most data availability across all four constituents (see Fig. 5 for a station in the Mattaponi River in the USA) and for a selection of additional monitoring stations per water quality constituent (Figs. S5–S8).
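For readers unfamiliar with the statistic, a normalized root-mean-square error can be computed as in the sketch below. The exact normalization used for DynQual is defined in Eq. S11 of the Supplement, which is not reproduced here, so dividing by the mean of the observations is an assumption of this example. The function name and the data are illustrative.

```python
import math


def nrmse(simulated, observed):
    """RMSE divided by the mean of the observations (normalization is an assumption)."""
    n = len(observed)
    rmse = math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / n)
    return rmse / (sum(observed) / n)


# Made-up paired concentrations (e.g. mg L-1) at one monitoring station.
sim = [2.0, 4.0, 6.0]
obs = [1.0, 5.0, 6.0]
score = nrmse(sim, obs)
print(f"nRMSE = {score:.3f}")
```

Normalizing makes errors comparable across constituents whose concentrations differ by orders of magnitude, which is the stated motivation for using this metric alongside the class-based comparison.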

Overall, a strong correspondence between simulated and observed concentration classes is found, indicating that the model is (largely) able to simulate concentrations within the correct concentration range (Fig. 3). The simulated concentration class matches the observed concentration class exactly in 69 %, 51 % and 44 % of instances for TDS, BOD and FC, respectively. When considering ±1 pollutant class, these percentages rise to 92 %, 79 % and 79 %. Of the mismatches in simulated and observed concentration classes, DynQual tends to under-estimate TDS and BOD concentrations relative to observed in-stream concentrations (i.e. difference in classification level ≤ −1). This occurs for 75 % of mismatches in simulated TDS classes and 69 % of mismatches in BOD classes. Conversely, FC mismatches are split more evenly between under-estimates (57 % of cases) and over-estimates (43 % of cases).
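The class-based comparison can be illustrated with a short Python sketch. The class boundaries below are hypothetical placeholders (the thresholds actually used are listed in Table S9 of the Supplement), and the sign convention follows the one described for Fig. 3: simulated class minus observed class, so negative values indicate under-estimation.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical class boundaries (mg/L), for illustration only.
CLASS_BOUNDS = [5.0, 10.0, 30.0]


def pollutant_class(concentration, bounds=CLASS_BOUNDS):
    """Return the 0-based class index of a concentration."""
    return bisect_right(bounds, concentration)


def class_differences(sim, obs, bounds=CLASS_BOUNDS):
    """Simulated minus observed class; negative means under-estimation."""
    return [pollutant_class(s, bounds) - pollutant_class(o, bounds)
            for s, o in zip(sim, obs)]


# Illustrative paired values only.
obs = [2.0, 7.0, 12.0, 40.0, 3.0]
sim = [3.0, 4.0, 11.0, 12.0, 8.0]
diffs = class_differences(sim, obs)
counts = Counter(diffs)
exact_share = counts[0] / len(diffs)
within_one_share = sum(n for d, n in counts.items() if abs(d) <= 1) / len(diffs)
```

Here `exact_share` corresponds to the share of exact class matches and `within_one_share` to the share of matches within ±1 class.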

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f03

Figure 3. Differences in observed vs. simulated pollutant classes for (a) total dissolved solids (TDS), (b) biological oxygen demand (BOD) and (c) fecal coliform (FC). Pollutant classes are defined based on water use and ecological limitations, as stated by governmental and international organizations. A difference in classification level of “0” indicates the simulated pollutant class matches the observed pollutant class, while negative differences indicate that observed concentrations exceeded simulated concentrations and vice versa for positive differences.


Statistical evaluation of the water temperature simulations using the KGE coefficient demonstrates the strong performance of DynQual (Fig. 4a) across all world regions (Fig. S3). Across all observation stations, a median KGE of 0.72 is found (25th percentile is 0.52, 75th percentile is 0.83), with 32 % of stations with KGE > 0.8, 83 % of stations with KGE > 0.4 and 99 % of stations with KGE values exceeding the performance threshold of −0.41 (Knoben et al., 2019). Detailed time series of individual rivers also demonstrate the ability of DynQual to closely replicate observed water temperature at the daily time step, in addition to seasonal patterns, across different world regions (Figs. 5a, S5). A detailed evaluation of water temperature simulations is available in previous work (Wanders et al., 2019).
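For readers unfamiliar with the −0.41 threshold, the short derivation below shows where it comes from. It assumes the standard 2009 form of the KGE (the form used in the paper is given in Eq. S10 of the Supplement, not reproduced here) and follows the reasoning of Knoben et al. (2019): using the mean of the observations as the prediction gives no correlation (r taken as 0), no variability (α = 0) and no bias (β = 1).

```latex
\[
\mathrm{KGE} = 1 - \sqrt{(r-1)^2 + (\alpha-1)^2 + (\beta-1)^2},
\qquad
\alpha = \frac{\sigma_{\mathrm{sim}}}{\sigma_{\mathrm{obs}}},
\quad
\beta = \frac{\mu_{\mathrm{sim}}}{\mu_{\mathrm{obs}}}
\]
% Mean-of-observations benchmark: r = 0, alpha = 0, beta = 1
\[
\mathrm{KGE}_{\mathrm{mean}} = 1 - \sqrt{(0-1)^2 + (0-1)^2 + (1-1)^2}
= 1 - \sqrt{2} \approx -0.41
\]
```

Stations with KGE above this value therefore perform better than a constant prediction equal to the observed mean.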

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f04

Figure 4. Evaluation of model performance using the Kling–Gupta efficiency (KGE) coefficient for (a) water temperature (Tw) and normalized root-mean-square error (nRMSE) for (b) total dissolved solids (TDS), (c) biological oxygen demand (BOD) and (d) fecal coliform (FC) simulations. Spatial patterns in KGE for Tw (Fig. S3) and nRMSE for TDS, BOD and FC (Fig. S4) are displayed in Sect. S3.2.2.


The distribution of nRMSE values, sub-divided by annual average river discharge, for TDS, BOD and FC is displayed in Fig. 4b–d. Statistical evaluation of the simulations using nRMSE shows mixed results. A median nRMSE value of 0.76 is found for TDS across all observation stations, with a 25th percentile of 0.79 and a 75th percentile of 1.83 (Fig. 4b). For BOD simulations, a median nRMSE of 0.98, 25th percentile of 0.76 and 75th percentile of 1.25 is found (Fig. 4c). A large spread is found for nRMSE values for FC simulations, with a median of 1.89, a 25th percentile of 1.16 and a 75th percentile of 3.53 (Fig. 4d). Simulated TDS concentrations are typically lower than observations in many locations that are proximate to the coastline, presumably due to a combination of background TDS concentrations based upon minimum observations (and applied constantly) and DynQual not accounting for the influence of saltwater intrusion. This may partly explain the long tail (nRMSE > 10) in the histogram for TDS (Fig. 4b) and the disproportionate tendency of DynQual to simulate TDS concentrations that are lower than observed concentrations (Fig. 3). Overall, no strong spatial patterns are found in the distribution of nRMSE values of BOD (Fig. S4b) and FC (Fig. S4c). For these water quality constituents, model simulations tend to represent the observed data better in larger streams (> 100 m³ s⁻¹). This is likely due to the influence of spatial mismatches between monitoring station locations and model simulations being especially important in smaller streams, where concentrations are more sensitive to natural dilution capacity (i.e. water availability) and variabilities in pollutant source contributions.

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f05

Figure 5. Time series (left) and average annual cycles (right) of observed vs. simulated surface water quality as indicated by (a) water temperature (Tw; °C), (b) total dissolved solids (TDS; mg L⁻¹) concentrations, (c) biological oxygen demand (BOD; mg L⁻¹) concentrations, and (d) fecal coliform (FC; cfu 100 mL⁻¹) concentrations at an example water quality monitoring station. In the time series plots, observations are indicated by blue crosses, daily simulations are indicated by grey lines and 30 d running averages are indicated by red lines. In the average annual cycle plots, blue and red lines indicate the median observed and simulated values, respectively, while the shading represents the range in values as indicated by the 10th and 90th percentiles. More examples for Tw (Fig. S5), TDS (Fig. S6), BOD (Fig. S7) and FC (Fig. S8) across different world regions are displayed in Sect. S3.2.3.


Long-term time series and average annual cycle plots for TDS (Figs. 5b, S6), BOD (Figs. 5c, S7) and FC (Figs. 5d, S8) show that DynQual can generally simulate in-stream concentrations within the correct range (e.g. min–max daily concentrations, 10th and 90th percentile average annual cycles). Simulated concentrations at the example monitoring station (Fig. 5) show that TDS, BOD and FC concentrations are largely simulated within plausible limits with strong overlaps in the average annual cycles, but the exact correspondence between observed and simulated concentrations at the daily time step is relatively poor. For this observation station, simulated peaks in daily TDS, BOD and FC concentrations tend to exceed those in the observational record. However, given the incomplete nature of the observed records, it is difficult to conclude whether these concentrations are plausible but unrecorded or whether DynQual is simulating unrealistic peak concentrations. For example, while DynQual captures some of the peaks in observed daily BOD concentrations, simulated BOD concentrations exceed those in the observational record while simultaneously under-predicting average annual cycles in BOD concentrations (Fig. 5). This pattern is also observable in TDS concentrations in the Mersey River (Fig. S6) and FC concentrations in the Exe River (Fig. S8).

While strong seasonality is present in the Tw observations, which is well captured by DynQual (Figs. 5a, S5), and in TDS concentrations to a lesser extent (e.g. Mersey and Komati rivers in Fig. S6), there is an overall lack of strong seasonal patterns in the observed records of BOD and FC concentrations. This, combined with large variability in the observed concentrations, results in large uncertainty in average annual cycles of observed concentrations across all months, as indicated by 10th and 90th percentiles (Figs. 5c–d, S7–S8). Annual average cycles in observed and simulated concentrations tend to strongly overlap for both BOD and FC. However, seasonal patterns are more evident in BOD simulations than observations (e.g. Mersey, Periyar in Fig. S7), and the large variability in observed FC concentrations is not replicated by DynQual daily simulations (e.g. Cauvery, Rhine in Fig. S8). In the case of FC concentrations, for example, this could suggest that DynQual misses or under-represents the importance of pulse disturbances (e.g. high rainfall events causing sewer overflows) on the transport of pollutants to surface waters.

3.3 Spatial patterns

The spatial patterns in TDS (Fig. 6), BOD (Fig. 7) and FC (Fig. 8) concentrations show substantial variations both within and across world regions, driven by different sectoral activities (Fig. 9). The dilution capacity of rivers is also a major determinant of in-stream concentrations. Averaged at the annual timescale, this is particularly evident for BOD and FC, where the large dilution capacity of some major rivers is sufficient to dilute concentrations to relatively low levels, despite often being fed by more polluted tributaries. However, it should also be noted that both river discharges and in-stream concentrations can exhibit substantial intra-annual variability; pollutant hotspots and the magnitude of pollutant levels must therefore also be considered at finer temporal scales than presented here. Intra-annual variability can occur in the model due to temporal variations in (1) pollutant loadings, (2) water availability (i.e. dilution capacity) and (3) in-stream decay processes.
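The interplay of loadings, dilution and decay can be made concrete with a simple mass-balance sketch in Python. This is an illustration under strong assumptions (instantaneous full mixing and first-order decay with a fixed rate) and is not DynQual's routing scheme; the function names, load and decay rate are hypothetical.

```python
import math


def instream_concentration(load_g_per_day, discharge_m3_per_s):
    """Fully mixed concentration in mg/L from a load and a discharge."""
    volume_m3_per_day = discharge_m3_per_s * 86400.0
    # g per m3 is numerically equal to mg per L.
    return load_g_per_day / volume_m3_per_day


def first_order_decay(concentration, decay_rate_per_day, travel_time_days):
    """Concentration remaining after first-order decay over the travel time."""
    return concentration * math.exp(-decay_rate_per_day * travel_time_days)


load = 8.64e6                                      # g/day, hypothetical
low_flow = instream_concentration(load, 10.0)      # e.g. dry season
high_flow = instream_concentration(load, 100.0)    # e.g. wet season
decayed = first_order_decay(low_flow, 0.35, 2.0)   # hypothetical decay rate
```

With the same loading, a tenfold increase in discharge lowers the concentration tenfold, which is why annual averages can mask seasonal hotspots.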

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f06

Figure 6. Annual average total dissolved solids (TDS) concentrations for the period 2010–2019 plotted for rivers with > 10 m³ s⁻¹ annual average discharge.

TDS concentrations show strongly regional patterns, with key hotspots of salinity pollution located in South Asia (Pakistan and northern India) and eastern China and to a lesser degree across the United States and Europe (Fig. 6). High TDS concentrations in South Asia are predominantly driven by the irrigation sector and the presence of saline soils (Fig. 9a). While the irrigation sector is also an important driver of TDS pollution in eastern China, the contribution from manufacturing activities is also substantial (Fig. 9a). The manufacturing sector is the dominant contributor of TDS pollution across most of North America and western Europe, accounting for > 75 % of in-stream pollutant loadings in almost all major river segments in these regions (Fig. 9a). Aside from the lower Nile, where salinity pollution is predominantly from the manufacturing sector, the domestic sector is the key source of (non-natural) TDS loadings in Africa. However, it should be noted that, aside from in the lower Nile, TDS concentrations are simulated to be relatively low across most of Africa (Fig. 6).

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f07

Figure 7. Annual average biological oxygen demand (BOD) concentrations for the period 2010–2019 plotted for rivers with > 10 m³ s⁻¹ annual average discharge.

While BOD concentrations show considerable diversity across the major world regions, a substantial proportion of river segments across populated areas of all continents experience moderate-to-high organic pollution (Fig. 7). There are clear spatial patterns in the dominant sectoral activities contributing BOD loadings worldwide, and it is also evident that BOD pollution in most world regions is driven by a combination of multiple sectors as opposed to an individual dominant activity (Fig. 9b). Across Europe in particular, which sector is dominant varies both spatially and temporally, and the contribution from the dominant sector is typically < 50 % (Fig. 9b). The manufacturing sector is the most significant source of BOD pollution across rivers in the United States; however, the relative contribution commonly falls in the 20 %–50 % or 50 %–75 % categories (Fig. 9b). In the most polluted world regions, South Asia and South-East Asia, the domestic sector is typically dominant. However, there are also significant contributions from manufacturing and extensive livestock activities (Figs. 7, 9b). Lastly, while its influence is highly localized, urban surface runoff can also represent an important source of BOD pollution in heavily urbanized grid cells across all world regions.

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f08

Figure 8. Annual average fecal coliform (FC) concentrations for the period 2010–2019 plotted for rivers with > 10 m³ s⁻¹ annual average discharge.

FC pollution is particularly high across South Asia and South-East Asia, with more localized hotspots found in parts of western Latin America, southern Europe, the Middle East and eastern Africa (Fig. 8). Similar to BOD pollution, a large proportion of stream segments in South Asia and South-East Asia are heavily polluted, with typically only rivers with extremely high dilution capacities appearing in the lower concentration classes. In these regions, the domestic sector is predominantly responsible for FC pollution (commonly > 75 %), attributed to large urban populations coupled with a large proportion of domestic wastewater being inadequately treated (Fig. 9c). In countries with high municipal wastewater collection and treatment rates, such as in Europe, the relative influence of livestock activities tends to be larger. While manufacturing activities remain the dominant source of FC pollution in North America, despite relatively high wastewater treatment rates, the percentage contribution is typically < 50 % and livestock activities also represent an important source of FC loadings (Fig. 9c). Despite variable municipal wastewater collection and treatment rates across Latin America, livestock activities appear to dominate FC loadings outside of the Amazon basin (Fig. 9c). This can be attributed to very high livestock numbers (particularly cattle), combined with the fact that most of the large urban settlements (and thus domestic FC pollutant loadings) in South America are located in the coastal zone. As such, pollution from the domestic and manufacturing sectors typically enters the river network at downstream locations, causing localized pollution before outflow to the ocean.

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f09

Figure 9. Dominant sectoral activity contributing towards (a) total dissolved solids (TDS), (b) biological oxygen demand (BOD) and (c) fecal coliform (FC) pollution averaged over 2010–2019 plotted for rivers with > 10 m³ s⁻¹ annual average discharge.

3.4 Trends

Long-term trends in TDS, BOD and FC concentrations over the simulated period (1980–2019) are also presented (Fig. 10). TDS concentrations in most world regions are either relatively constant or show relatively gradual upward trends (Fig. 10a). Typically, where TDS concentrations are increasing, the trend has been driven mainly by expansions in manufacturing or irrigation activities. Comparatively, trends in BOD (Fig. 10b) and FC (Fig. 10c) concentrations are larger in magnitude and exhibit substantially more spatial variation across the major world regions. Regionally, the strongest increases in BOD and FC concentrations are found in sub-Saharan Africa, where wastewater treatment rates are low, and South Asia, where the rates of population growth and economic development have significantly outstripped the expansion of wastewater treatment infrastructure. Strong increasing trends are also found across most of Latin America, where a significant proportion of collected wastewater does not undergo wastewater treatment (UNEP, 2016; Jones et al., 2021). BOD and FC concentrations across North American rivers have typically remained relatively constant or exhibit small decreasing trends. Strong decreasing trends are found across Europe, including the Danube and Rhine basins. In all world regions, the influence of reservoirs on BOD and FC concentrations is also evident, with increased water volumes (i.e. dilution) coupled with longer residence times (i.e. greater decay) reducing BOD and FC concentrations at these specific locations.
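One plausible way to express a long-term trend as an average annual percentage change is sketched below in Python. This is an assumption for illustration, not necessarily the procedure used for Fig. 10: it fits an ordinary least-squares line to annual mean concentrations and expresses the slope as a percentage of the period mean. The function name and the sample values are hypothetical.

```python
def annual_percent_change(years, concentrations):
    """Least-squares slope expressed as a percentage of the period mean."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(concentrations) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(years, concentrations))
    sxx = sum((x - mean_x) ** 2 for x in years)
    slope = sxy / sxx               # concentration units per year
    return 100.0 * slope / mean_y   # percent of the period mean per year


# Illustrative series only: concentration rises by 0.5 units each year.
years = list(range(1980, 1985))
conc = [10.0, 10.5, 11.0, 11.5, 12.0]
trend = annual_percent_change(years, conc)
```

Normalizing by the period mean makes trends comparable between rivers with very different absolute concentrations, but it also means that a large percentage change does not by itself imply a high level of pollution.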

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f10

Figure 10. Average annual percentage changes in (a) total dissolved solids (TDS), (b) biological oxygen demand (BOD) and (c) fecal coliform (FC) concentrations for the period 1980–2019 plotted only for rivers with > 10 m³ s⁻¹ annual average discharge.

Complementary to the spatial analysis, we considered the proportion of the population that inhabits grid cells exhibiting different trends in pollutant concentrations, aggregated by geographical region and economic classification (Fig. 11). It should be noted that trends (Figs. 10 and 11) are not indicative of the degree of pollution directly and thus should also be considered with respect to in-stream concentrations (Figs. 6–8). Changes in TDS concentrations in the most populated areas worldwide are typically low, with increases of 0 %–1 % most common across all geographical regions (Fig. 11a). Conversely, strong regional patterns are evident for BOD (Fig. 11b) and FC (Fig. 11c) concentrations. Particularly in sub-Saharan Africa and South Asia, BOD and FC concentrations in populated locations have been almost exclusively increasing. Over half of the population of sub-Saharan Africa lives in areas where BOD and FC concentrations have increased (on average) by > 2 % yr⁻¹ from 1980 to 2019. Conversely, in western Europe, trends in BOD and FC have been negative for areas where 60 % of the population lives.
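The population-based aggregation described above can be sketched as follows in Python. The bin edges, grid-cell trends and population counts are hypothetical and chosen only to illustrate the bookkeeping; the bins used in Fig. 11 may differ.

```python
def population_share_by_trend(trends, populations, edges=(0.0, 1.0, 2.0)):
    """Percentage of total population in each trend bin.

    Bins are (-inf, e0), [e0, e1), ..., [e_last, inf).
    """
    totals = [0.0] * (len(edges) + 1)
    for trend, pop in zip(trends, populations):
        # The bin index is the number of edges the trend has reached.
        index = sum(trend >= edge for edge in edges)
        totals[index] += pop
    total_pop = sum(totals)
    return [100.0 * t / total_pop for t in totals]


trends = [-0.5, 0.4, 1.5, 2.6, 3.1]       # % per year, one value per grid cell
populations = [100, 300, 200, 250, 150]   # inhabitants per grid cell
shares = population_share_by_trend(trends, populations)
```

In this toy example the last element of `shares` is the percentage of people living in cells where concentrations rose by more than 2 % per year.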

When aggregating trends by country-specific economic classifications, trends in TDS, BOD and FC pollutant concentrations all display a clear correlation with level of economic development (Fig. 11). For the water quality constituents considered, the strongest and most widespread decreases in pollutant concentrations have been experienced by “high-income” countries, while “low-income” countries have experienced the greatest and most widespread degree of water quality degradation. These patterns are particularly clear for FC, where approximately 60 % of the population in high-income countries live in grid cells displaying negative trends in FC concentrations, compared to 50 %, 25 % and 10 % in “upper-middle-income”, “lower-middle-income” and low-income countries, respectively. Furthermore, in the low-income countries, 50 % of the population lives in areas where FC concentrations have increased (on average) by > 2 % each year from 1980 to 2019.

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f11

Figure 11. Average annual percentage changes in (a) total dissolved solids (TDS), (b) biological oxygen demand (BOD) and (c) fecal coliform (FC) concentrations for the period 1980–2019. Results are displayed for the proportion of population (%) inhabiting grid cells exhibiting different trends in pollutant concentrations, aggregated by geographical region (left) and economic classification (right).


Lastly, we present time series of in-stream TDS, BOD and FC concentrations delineated by sector-specific contributions at three selected locations (Fig. 12) for which validation plots are also presented (Figs. S6–S9). While it is not our intention to explain the patterns in concentrations and sectoral drivers for the Mersey, Cauvery and Kiso rivers specifically, these plots are illustrative of the capabilities of DynQual. For example, these plots demonstrate the relative importance of different water use activities on in-stream concentrations dynamically, and also display changes over longer time periods. This is particularly evident in FC concentrations in the Mersey River, where decreasing loadings from the domestic and manufacturing sectors, primarily due to increases in wastewater treatment capacities, have driven an overall trend towards water quality improvements. Conversely, the manufacturing sector is simulated to have had an increasing influence on TDS concentrations in the Kiso River since ∼ 2004, replacing the irrigation sector as the dominant driver of salinity pollution.

https://gmd.copernicus.org/articles/16/4481/2023/gmd-16-4481-2023-f12

Figure 12. Simulated in-stream total dissolved solids (TDS), biological oxygen demand (BOD) and fecal coliform (FC) concentrations in selected rivers, disaggregated by contributing water use sectors and including linear decadal trends.


4 Discussion, conclusions and future work

To conclude, we have developed and evaluated a new global surface water quality model for simulating TDS, BOD and FC concentrations as indicators of salinity, organic and pathogen pollution, respectively. Building upon the water temperature model DynWat and utilizing approaches developed in previous water quality modelling efforts, the open-source code is structured in a way that allows for flexibility in both hydrological and pollutant loading inputs. Output data from DynQual have the potential to inform assessments in a broad range of fields, including ecological, human health and water scarcity studies. Such work is not only relevant to the hydrological and water quality modelling communities but also has applications for the broader scientific community in addition to informing policy regarding water resources management.

DynQual is ambitious in its aim to model global surface water quality (1) using a consistent approach, (2) dynamically, (3) considering multiple water quality constituents and (4) at a high spatio-temporal (i.e. 5 arcmin and daily time step) resolution. Any model must consider the trade-offs between model complexity and availability of input datasets and data to parameterize process descriptions of the model (Weaver and Zwiers, 2000; Wen et al., 2017) and the impact of this on model scope. Being a global model, DynQual is inherently unable to accurately represent all aspects relevant to the local context. Rather, the modelling strategy is to focus on the main spatial and temporal drivers of pollution in river networks globally to facilitate first-order approximations of in-stream concentrations at high spatial (5 arcmin) and temporal (daily) resolution with global coverage. As such, DynQual allows for the investigation of research questions that only large-scale modelling efforts can address. These include, as presented in the model application section, global pollution hot- and bright-spot identification (Figs. 6–8), the relative importance of different contributing sectors to water quality status across the globe (Fig. 9), and meta-trends in surface water quality dynamics (Figs. 10–11). The dynamic nature of DynQual can also facilitate analysis of intra- and inter-annual trends in surface water quality and help to further enhance the understanding of the main drivers of pollution via (dynamic) sectoral attribution (Fig. 12). Furthermore, this approach has particular value for simulating surface water quality in ungauged catchments, and our use of globally consistent input data facilitates meaningful comparisons across different world regions. Given severe limitations in observational records of surface water quality, both in terms of spatial coverage and the number of observations per water quality monitoring station (Sect. S3.1), these are key strengths of DynQual. However, poor data availability is a severe limitation for both the development of global water quality models and their evaluation.

Uncertainties in surface water quality simulations arise from a combination of uncertainties associated with quantifications of pollutant loadings (e.g. pollutant excretion, emission rates and sector-specific return flows), the quality of hydrological simulations (e.g. discharge and velocities) and the representation of in-stream processes (e.g. decay coefficients). These uncertainties are especially prevalent when modelling at large spatial extents. In-stream pollutant concentrations are sensitive to dilution capacity and thus the quality of the hydrological simulations. This issue contributes to uncertainties in simulated concentrations particularly in headwater streams. Fixed estimates of decay coefficients are assumed, which contributes to uncertainties in simulations of reactive constituents such as BOD and FC. In addition, the representation of lakes and reservoirs in DynQual is rudimentary, with total (routed) loadings instantaneously averaged over the volume of the waterbody assuming full mixing.

With respect to pollutant loading quantifications, spatial mismatches between the generation of pollutant loadings and the location of entry to the stream network (return flows) can result in the simulation of unrealistic concentrations, particularly in grid cells with very low water availability (i.e. headwater streams). This can occur where the drivers of point source pollutant emissions (e.g. population) do not directly coincide with the location of wastewater treatment plant outlets. A lack of temporally explicit input data can hinder proper representation of sectors with strong intra- or inter-annual variability. For instance, notable limitations for the livestock sector are the simplified assumptions made for livestock population numbers (assumed to be constant across days of the year), changes to livestock numbers across multi-year periods (applied annually and based on regional averages) and transportation pathways to the stream network (assumed to be a function of surface runoff excluding the representation of processes that affect pollutant retention in soils). Locally relevant sources of pollution may also be entirely excluded, such as the lack of information on TDS emissions from mining activities and road deicing. Similarly, pulses of pollutant loadings occurring during extreme rainfall or flood events are also overlooked, such as those associated with sewer overflows or from inundated industrial areas.

Despite these uncertainties, DynQual has been demonstrated to perform reasonably well, especially given the approximations of the model. Water temperature simulations closely match observations at daily resolution as indicated by KGE coefficients (Fig. 4a), which are high across all world regions (Fig. S3). Furthermore, time series and average annual plots (Figs. 5a, S5) demonstrate that seasonal regimes present in observed water temperatures are well captured by the model. Simulated TDS, BOD and FC concentrations are largely within the correct concentration classes (Fig. 3) with nRMSE values (Fig. 4b–d) deemed reasonable considering the challenges of comparing individual (instantaneous) observed daily TDS, BOD and FC concentrations against simulated daily concentrations. Long-term time series and average annual cycle plots for TDS (Figs. 5b, S6), BOD (Figs. 5c, S7) and FC (Figs. 5d, S8) show that DynQual can generally simulate in-stream concentrations within the correct range (e.g. min–max daily concentrations, 10th and 90th percentile average annual cycles), but simulated in-stream concentration time series at a daily time step show relatively poor agreement with the observed time series. Observed data records also tend to display large variability in concentrations but little (systematic) seasonality, especially for BOD (Fig. S7) and FC (Fig. S8) concentrations. These factors have a strong influence on metrics including nRMSE but especially the other commonly used evaluation metrics in hydrology such as the Nash–Sutcliffe efficiency (NSE) and Kling–Gupta efficiency (KGE), and hence support our decision not to evaluate model performance using these metrics. Challenges related to the observational records themselves should also be acknowledged. These can relate to, for example, artefacts in observational records (Fig. S9a), issues related to instrument detection limits and/or reporting accuracies (Fig. S9b) and large variability in the observation records (Fig. S9c). Lastly, given the approximations of the model, the overall complexity in the drivers of pollutant loadings and input data limitations, we reiterate that the current set-up of DynQual is not suited to simulate daily TDS, BOD and FC concentrations that correspond exactly with in situ observational measurements.

With few comparable studies in the current literature, it is difficult to quantitatively assess the performance of DynQual relative to other large-scale surface water quality models. Overall, our modelled spatial patterns in surface water quality match well with previous regional and global assessments – showing multi-pollutant hotspots (e.g. TDS, BOD, FC) to be located across northern India and eastern China in particular (UNEP, 2016; Wen et al., 2017; Van Vliet et al., 2021). Consistent with a recent data-driven (machine learning) approach (Desbureaux et al., 2022), albeit for some different water quality constituents (e.g. total phosphorus), we find a general trend towards surface water quality improvement in developed countries and deterioration in developing countries. Water temperature (Tw) simulations closely match those of the global water temperature models upon which DynQual is based (Van Vliet et al., 2012b; Wanders et al., 2019; Van Vliet et al., 2021). For total dissolved solids (TDS) and biological oxygen demand (BOD) concentrations, values of (and patterns in) normalized root-mean-square errors (nRMSEs) are similar to previous work (Van Vliet et al., 2021), with reasonable model performance (< 1 nRMSE) exhibited at monitoring locations across all continents. Other large-scale surface water quality models have validated simulated concentrations with respect to concentration classes linked to sectoral water use and environmental health limits. Following this approach, Wen et al. (2017) reported BOD concentrations simulated within the same classification in 94 % of instances; however, this is based on only 760 measurements, of which 91 % are modelled in the lowest pollutant class (0–5 mg L⁻¹). More comparable to our simulations, UNEP (2016) compared modelled and observed pollutant classes for TDS, BOD and fecal coliform (FC) concentrations across Latin America, Africa and Asia, achieving largely comparable model performance. Comparing our simulations to output from other global water quality models modelling Tw, BOD, TDS and FC, when available, will provide further insights into model performance.

Meaningful comparisons to other surface water quality models are challenging due to the high diversity in terms of (1) spatial extent (e.g. lumped vs. distributed), (2) temporal resolution (e.g. daily vs. monthly vs. annual vs. decadal), and (3) water quality constituent and reporting form (e.g. loads vs. concentrations). Similarly, watershed-scale surface water quality models are constructed for different purposes than large-scale (continental to global) surface water quality models. These watershed models can better incorporate locally relevant input data and processes, are parameterized for local conditions, and typically have data of good quality and record length for calibration and validation – which facilitates higher precision and accuracy in both hydrological and water quality simulations. However, these models are reliant upon detailed local knowledge, which is severely lacking for many (particularly ungauged) catchments worldwide (e.g. large parts of Africa).

Despite their limitations, process-based large-scale water quality models can facilitate first-order assessments of global water quality dynamics that are consistent across both space and time, such as those demonstrated in Sect. 3. Future applications of DynQual may include (1) expanding the number of modelled water quality constituents, (2) further spatio-temporal analysis of surface water quality, and (3) investigating the impact of uncertain climatic and socio-economic change on future surface water quality.

Code and data availability

DynQual v1.0 is open source and distributed under the terms of the GNU General Public License version 3, or any later version, as published by the Free Software Foundation. The full model code, configuration INI files and a user manual are provided through a GitHub repository: https://github.com/UU-Hydro/DYNQUAL (last access: 31 May 2023). The model code presented in this paper is archived at https://doi.org/10.5281/zenodo.7932317 (Jones et al., 2023).

A full set-up with all required input datasets for running DynQual for the Rhine–Meuse basin is provided as an example (https://doi.org/10.5281/zenodo.7027242, Jones, 2022). Monthly water temperature (Tw) and salinity (TDS) and organic (BOD) and pathogen (FC) concentrations are available directly via https://doi.org/10.5281/zenodo.7139222 (Jones et al., 2022b). Here, we also provide the output hydrological data (discharge and channel storage) simulated within the model run.

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/gmd-16-4481-2023-supplement.

Author contributions

The research was designed by ERJ, MFPB and MTHvV. The surface water quality model was developed by ERJ, with assistance from NW and EHS. Output data analysis and presentation of results were led by ERJ, with guidance and feedback from MFPB, NW, LPHvB and MTHvV. All authors contributed to and approved the manuscript.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

We acknowledge the NWO for the grant that enabled us to use the national supercomputer Snellius (project no. EINF-3999).

Review statement

This paper was edited by Wolfgang Kurtz and reviewed by two anonymous referees.

References

Batjes, N. H.: ISRIC-WISE global data set of derived soil properties on a 0.5 by 0.5 degree grid (Version 3.0), World Soil Information, Wageningen, 24, d9eca770-29a4-4d95-bf93-f32e1ab419c3, 2005. 

Beusen, A. H. W., Dekkers, A. L. M., Bouwman, A. F., Ludwig, W., and Harrison, J.: Estimation of global river transport of sediments and associated particulate C, N, and P, Global Biogeochem. Cy., 19, GB4S05, https://doi.org/10.1029/2005gb002453, 2005. 

Beusen, A. H. W., Van Beek, L. P. H., Bouwman, A. F., Mogollón, J. M., and Middelburg, J. J.: Coupling global models for hydrology and nutrient loading to simulate nitrogen and phosphorus retention in surface water – description of IMAGE–GNM and analysis of performance, Geosci. Model Dev., 8, 4045–4067, https://doi.org/10.5194/gmd-8-4045-2015, 2015. 

Chapra, S. C., Pelletier, G. J., and Tao, H.: QUAL2K: A Modeling Framework for Simulating River and Stream Water Quality, Version 2.11: Documentation and Users Manual, Civil and Environmental Engineering Dept., Tufts University, Medford, MA, 2008. 

Cucchi, M., Weedon, G. P., Amici, A., Bellouin, N., Lange, S., Müller Schmied, H., Hersbach, H., and Buontempo, C.: WFDE5: bias-adjusted ERA5 reanalysis data for impact studies, Earth Syst. Sci. Data, 12, 2097–2120, https://doi.org/10.5194/essd-12-2097-2020, 2020. 

Damania, R., Desbureaux, S., Rodella, A.-S., Russ, J., and Zaveri, E.: Quality Unknown: The Invisible Water Crises, World Bank Group, Washington, DC, https://doi.org/10.1596/978-1-4648-1459-4, 2019. 

Desbureaux, S., Mortier, F., Zaveri, E., van Vliet, M. T. H., Russ, J., Rodella, A. S., and Damania, R.: Mapping global hotspots and trends of water quality (1992–2010): a data driven approach, Environ. Res. Lett., 17, 114048, https://doi.org/10.1088/1748-9326/ac9cf6, 2022. 

Ehalt Macedo, H., Lehner, B., Nicell, J., Grill, G., Li, J., Limtong, A., and Shakya, R.: Distribution and characteristics of wastewater treatment plants within the global river network, Earth Syst. Sci. Data, 14, 559–577, https://doi.org/10.5194/essd-14-559-2022, 2022. 

Gilbert, M., Nicolas, G., Cinardi, G., Van Boeckel, T. P., Vanwambeke, S. O., Wint, G. R. W., and Robinson, T. P.: Global distribution data for cattle, buffaloes, horses, sheep, goats, pigs, chickens and ducks in 2010, Sci. Data, 5, 180227, https://doi.org/10.1038/sdata.2018.227, 2018. 

Gudmundsson, L., Boulange, J., Do, H. X., Gosling, S. N., Grillakis, M. G., Koutroulis, A. G., Leonard, M., Liu, J., Müller Schmied, H., Papadimitriou, L., Pokhrel, Y., Seneviratne, S. I., Satoh, Y., Thiery, W., Westra, S., Zhang, X., and Zhao, F.: Globally observed trends in mean and extreme river flow attributed to climate change, Science, 371, 1159–1162, https://doi.org/10.1126/science.aba3996, 2021. 

Hofstra, N., Bouwman, A. F., Beusen, A. H. W., and Medema, G. J.: Exploring global Cryptosporidium emissions to surface water, Sci. Total Environ., 442, 10–19, https://doi.org/10.1016/j.scitotenv.2012.10.013, 2013. 

Hrachowitz, M., Savenije, H. H. G., Blöschl, G., McDonnell, J. J., Sivapalan, M., Pomeroy, J. W., Arheimer, B., Blume, T., Clark, M. P., Ehret, U., Fenicia, F., Freer, J. E., Gelfan, A., Gupta, H. V., Hughes, D. A., Hut, R. W., Montanari, A., Pande, S., Tetzlaff, D., Troch, P. A., Uhlenbrook, S., Wagener, T., Winsemius, H. C., Woods, R. A., Zehe, E., and Cudennec, C.: A decade of Predictions in Ungauged Basins (PUB) – a review, Hydrolog. Sci. J., 58, 1198–1255, https://doi.org/10.1080/02626667.2013.803183, 2013. 

Jones, E. R.: DynQual input example: Rhine basin, Zenodo [data set], https://doi.org/10.5281/zenodo.7027242, 2022. 

Jones, E. R., van Vliet, M. T. H., Qadir, M., and Bierkens, M. F. P.: Country-level and gridded estimates of wastewater production, collection, treatment and reuse, Earth Syst. Sci. Data, 13, 237–254, https://doi.org/10.5194/essd-13-237-2021, 2021. 

Jones, E. R., Bierkens, M. F. P., Wanders, N., Sutanudjaja, E. H., van Beek, L. P. H., and van Vliet, M. T. H.: Current wastewater treatment targets are insufficient to protect surface water quality, Commun. Earth Environ., 3, 221, https://doi.org/10.1038/s43247-022-00554-y, 2022a. 

Jones, E. R., Bierkens, M. F. P., Wanders, N., Sutanudjaja, E. H., van Beek, L. P. H., and van Vliet, M. T. H.: Global monthly hydrology and water quality datasets, derived from the dynamical surface water quality model (DynQual) at 10 km spatial resolution, Zenodo [data set], https://doi.org/10.5281/zenodo.7139222, 2022b. 

Jones, E. R., Bierkens, M. F. P., Wanders, N., Sutanudjaja, E. H., van Beek, L. P. H., and van Vliet, M. T. H.: UU-Hydro/DYNQUAL: DynQual v1.0, Zenodo [code], https://doi.org/10.5281/zenodo.7932317, 2023. 

Knoben, W. J. M., Freer, J. E., and Woods, R. A.: Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores, Hydrol. Earth Syst. Sci., 23, 4323–4331, https://doi.org/10.5194/hess-23-4323-2019, 2019. 

Lange, S. and Geiger, T.: ISIMIP3a population input data (1.0), ISIMIP Repository [data set], https://doi.org/10.48364/ISIMIP.822480, 2020. 

Lohrmann, A., Farfan, J., Caldera, U., Lohrmann, C., and Breyer, C.: Global scenarios for significant water use reduction in thermal power plants based on cooling water demand estimation using satellite imagery, Nat. Energ., 4, 1040–1048, https://doi.org/10.1038/s41560-019-0501-4, 2019. 

Loucks, D. P. and Beek, E. V.: Water quality modeling and prediction, in: Water resource systems planning and management, Springer, 417–467, https://doi.org/10.1007/978-3-319-44234-1_10, 2017. 

Luthy, R. G., Sedlak, D. L., Plumlee, M. H., Austin, D., and Resh, V. H.: Wastewater-effluent-dominated streams as ecosystem-management tools in a drier climate, Front. Ecol. Environ., 13, 477–485, https://doi.org/10.1890/150038, 2015. 

Ozaki, N., Fukushima, T., Harasawa, H., Kojiri, T., Kawashima, K., and Ono, M.: Statistical analyses on the effects of air temperature fluctuations on river water qualities, Hydrol. Process., 17, 2837–2853, https://doi.org/10.1002/hyp.1437, 2003. 

Prüss-Ustün, A., Wolf, J., Bartram, J., Clasen, T., Cumming, O., Freeman, M. C., Gordon, B., Hunter, P. R., Medlicott, K., and Johnston, R.: Burden of disease from inadequate water, sanitation and hygiene for selected adverse health outcomes: An updated analysis with a focus on low- and middle-income countries, Int. J. Hyg. Envir. Heal., 222, 765–777, https://doi.org/10.1016/j.ijheh.2019.05.004, 2019. 

Reder, K., Flörke, M., and Alcamo, J.: Modeling historical fecal coliform loadings to large European rivers and resulting in-stream concentrations, Environ. Model. Softw., 63, 251–263, https://doi.org/10.1016/j.envsoft.2014.10.001, 2015. 

Robinson, T. P., Thornton, P. K., Franceschini, G., Kruska, R., Chiozza, F., Notenbaert, A. M. O., Cecchi, G., Herrero, M. T., Epprecht, M., and Fritz, S.: Global livestock production systems, Food and Agriculture Organization of the United Nations (FAO) and International Livestock Research Institute (ILRI), Rome, 152 pp., ISBN 978-92-5-107033-8, 2011. 

Sirota, J., Baiser, B., Gotelli, N. J., and Ellison, A. M.: Organic-matter loading determines regime shifts and alternative states in an aquatic ecosystem, P. Natl. Acad. Sci. USA, 110, 7742–7747, https://doi.org/10.1073/pnas.1221037110, 2013. 

Smucker, N. J., Beaulieu, J. J., Nietch, C. T., and Young, J. L.: Increasingly severe cyanobacterial blooms and deep water hypoxia coincide with warming water temperatures in reservoirs, Glob. Change Biol., 27, 2507–2519, https://doi.org/10.1111/gcb.15618, 2021. 

Lange, S., Menz, C., Gleixner, S., Cucchi, M., Weedon, G. P., Amici, A., Bellouin, N., Müller Schmied, H., Hersbach, H., Buontempo, C., and Cagnazzo, C.: WFDE5 over land merged with ERA5 over the ocean (W5E5 v2.0), ISIMIP Repository [data set], https://doi.org/10.48364/ISIMIP.342217, 2021. 

Sutanudjaja, E. H., van Beek, R., Wanders, N., Wada, Y., Bosmans, J. H. C., Drost, N., van der Ent, R. J., de Graaf, I. E. M., Hoch, J. M., de Jong, K., Karssenberg, D., López López, P., Peßenteiner, S., Schmitz, O., Straatsma, M. W., Vannametee, E., Wisser, D., and Bierkens, M. F. P.: PCR-GLOBWB 2: a 5 arcmin global hydrological and water resources model, Geosci. Model Dev., 11, 2429–2453, https://doi.org/10.5194/gmd-11-2429-2018, 2018. 

Thomann, R. V. and Mueller, J. A.: Principles of surface water quality modeling and control, Harper & Row Publishers, ISBN-10 0060466774, 1987. 

Thorslund, J. and van Vliet, M. T. H.: A global dataset of surface water and groundwater salinity measurements from 1980–2019, Sci. Data, 7, 231, https://doi.org/10.1038/s41597-020-0562-z, 2020. 

Thorslund, J., Bierkens, M. F. P., Scaini, A., Sutanudjaja, E. H., and van Vliet, M. T. H.: Salinity impacts on irrigation water-scarcity in food bowl regions of the US and Australia, Environ. Res. Lett., 17, 084002, https://doi.org/10.1088/1748-9326/ac7df4, 2022. 

UNEP: A Snapshot of the World's Water Quality: Towards a global assessment, United Nations Environment Programme, Nairobi, Kenya, 162 pp., 2016. 

van Beek, L., Eikelboom, T., van Vliet, M., and Bierkens, M. F. P.: A physically based model of global freshwater surface temperature, Water Resour. Res., 48, W09530, https://doi.org/10.1029/2012WR011819, 2012. 

van Puijenbroek, P. J. T. M., Beusen, A. H. W., and Bouwman, A. F.: Global nitrogen and phosphorus in urban waste water based on the Shared Socio-economic pathways, J. Environ. Manage., 231, 446–456, https://doi.org/10.1016/j.jenvman.2018.10.048, 2019. 

van Vliet, M., Franssen, W., Yearsley, J., Ludwig, F., Haddeland, I., Lettenmaier, D., and Kabat, P.: Global River Discharge and Water Temperature under Climate Change, Global Environ. Chang., 23, 450–464, https://doi.org/10.1016/j.gloenvcha.2012.11.002, 2013. 

van Vliet, M., Sheffield, J., Wiberg, D., and Wood, E.: Impacts of recent drought and warm years on water resources and electricity supply worldwide, Environ. Res. Lett., 11, 124021, https://doi.org/10.1088/1748-9326/11/12/124021, 2016. 

van Vliet, M. T. H., Yearsley, J., Ludwig, F., Vögele, S., Lettenmaier, D., and Kabat, P.: Vulnerability of US and European Electricity Supply to Climate Change, Nat. Clim. Change, 2, 676–681, https://doi.org/10.1038/nclimate1546, 2012a. 

van Vliet, M. T. H., Yearsley, J. R., Franssen, W. H. P., Ludwig, F., Haddeland, I., Lettenmaier, D. P., and Kabat, P.: Coupled daily streamflow and water temperature modelling in large river basins, Hydrol. Earth Syst. Sci., 16, 4303–4321, https://doi.org/10.5194/hess-16-4303-2012, 2012b. 

van Vliet, M. T. H., Jones, E. R., Flörke, M., Franssen, W. H. P., Hanasaki, N., Wada, Y., and Yearsley, J. R.: Global water scarcity including surface water quality and expansions of clean water technologies, Environ. Res. Lett., 16, 024020, https://doi.org/10.1088/1748-9326/abbfc3, 2021. 

Velasco, J., Gutiérrez-Cánovas, C., Botella-Cruz, M., Sánchez-Fernández, D., Arribas, P., Carbonell, J. A., Millán, A., and Pallarés, S.: Effects of salinity changes on aquatic organisms in a multiple stressor context, Philos. T. Roy. Soc. B, 374, 20180011, https://doi.org/10.1098/rstb.2018.0011, 2019. 

Vigiak, O., Grizzetti, B., Udias-Moinelo, A., Zanni, M., Dorati, C., Bouraoui, F., and Pistocchi, A.: Predicting biochemical oxygen demand in European freshwater bodies, Sci. Total Environ., 666, 1089–1105, https://doi.org/10.1016/j.scitotenv.2019.02.252, 2019. 

Voß, A., Alcamo, J., Bärlund, I., Voß, F., Kynast, E., Williams, R., and Malve, O.: Continental scale modelling of in-stream river water quality: a report on methodology, test runs, and scenario application, Hydrol. Process., 26, 2370–2384, https://doi.org/10.1002/hyp.9445, 2012. 

Walton, N. R. G.: Electrical Conductivity and Total Dissolved Solids – What is Their Precise Relationship?, Desalination, 72, 275–292, https://doi.org/10.1016/0011-9164(89)80012-8, 1989. 

Wanders, N. and Wada, Y.: Human and climate impacts on the 21st century hydrological drought, J. Hydrol., 526, 208–220, https://doi.org/10.1016/j.jhydrol.2014.10.047, 2015. 

Wanders, N., van Vliet, M. T. H., Wada, Y., Bierkens, M. F. P., and van Beek, L. P. H.: High-Resolution Global Water Temperature Modeling, Water Resour. Res., 55, 2760–2778, https://doi.org/10.1029/2018WR023250, 2019. 

Weaver, A. and Zwiers, F.: Uncertainty in climate change, Nature, 407, 571–572, https://doi.org/10.1038/35036659, 2000. 

[…] streptococci and Escherichia coli in fresh and dry cattle, horse, and sheep manure, Can. J. Microbiol., 51, 847–851, https://doi.org/10.1139/w05-071, 2005. 

Wen, Y., Schoups, G., and van de Giesen, N.: Organic pollution of rivers: Combined threats of urbanization, livestock farming and global climate change, Sci. Rep., 7, 43289, https://doi.org/10.1038/srep43289, 2017. 

Wilcock, B.: Assessing the Relative Importance of Faecal Pollution Sources in Rural Catchments, Environment Waikato, ISSN: 1172-4005, 2006. 

Wright, B., Stanford, B., Reinert, A., Routt, J., Khan, S., and Debroux, J.-F.: Managing water quality impacts from drought on drinking water supplies, Aqua, 63, 179, https://doi.org/10.2166/aqua.2013.123, 2014. 

WWAP: The United Nations World Water Development Report 2017, Wastewater: The Untapped Resource, Paris, UNESCO, ISBN 978-92-3-100201-4, 2017. 

Short summary
DynQual is a new high-resolution global water quality model for simulating total dissolved solids, biological oxygen demand and fecal coliform as indicators of salinity, organic pollution and pathogen pollution, respectively. Output data from DynQual can supplement the observational record of water quality data, which is highly fragmented across space and time, and have the potential to inform assessments in a broad range of fields including ecological, human health and water scarcity studies.