StorAge Selection (SAS) transport theory has recently emerged as a framework for representing material transport through a control volume. It can be seen as a generalization of transit time theories and lumped-parameter models to allow for arbitrary temporal variability in the rate of material flow in and out of the control volume, and in the transport dynamics. SAS is currently the state-of-the-art approach to interpreting tracer transport. Here, we present mesas.py, a Python package implementing the SAS framework. mesas.py allows SAS functions to be specified using several built-in common distributions, as a piecewise linear cumulative distribution function (CDF), or as a weighted sum of any number of such distributions. The distribution parameters and the weights used to combine them can be allowed to vary in time, permitting SAS functions of arbitrary complexity to be specified. mesas.py simulates tracer transport using a novel mass-tracking scheme and can account for first-order reactions and fractionation. We present a number of analytical solutions to the governing equations and use these to validate the code. For a benchmark problem, the time-step-averaging approach of the mesas.py implementation reduces mass balance errors by up to a factor of 15 compared with a previous implementation of SAS.

StorAge Selection (SAS) is a theoretical framework for modeling transport dynamics through spatially integrated systems (control volumes). It is applicable for any system in which it is reasonable to assume that the bulk material flowing out of the system (at rate

SAS is a generalization of the idea of a transit time distribution (TTD), which has proven useful in a wide range of disciplines, including chemical engineering

Our objective here is to provide detailed documentation of mesas.py, a Python implementation of SAS functions that is easy to use, highly flexible, and computationally accurate. This implementation is already the basis of online teaching resources

In a typical forward-modeling use-case, we wish to predict the concentration

The basic equations required to calculate

mesas.py offers an extremely flexible framework for specifying SAS functions, allowing them to be arbitrarily complex and variable in time. This includes the ability to specify SAS functions as a time-varying weighted sum of other functions

mesas.py uses a novel mass-tracking approach that estimates solute/tracer mass storage as part of the solution.

mesas.py estimates the time-step-averaged transit times and mass fluxes using a fourth-order Runge–Kutta method, and it provides superior numerical accuracy and mass balance accounting (as we shall demonstrate).

mesas.py allows for time-varying first-order reactions and time-varying solute/tracer fractionation.

mesas.py is implemented in Python and Fortran and is designed to be easy to install (through conda-forge) and user-friendly.

The governing equations of the SAS framework are given in Sect. 2 of this paper, including the novel approach to solute/tracer mass tracking. Calculating the storage and release of solutes/tracers continuously in tandem with calculation of the TTD (rather than using the convolution after the TTD has been obtained) makes incorporating reactions and fractionation into the SAS function simple and intuitive. Section 3 gives details of the code, including the numerical implementation, the method for specifying SAS functions, and procedures for running the code.

In Sect. 4 of the paper, we test the code against a number of benchmarks in the form of analytical solutions to the governing equations. These include cases of steady and unsteady flow. We compare the accuracy of mesas.py against that of tran-SAS for the unsteady-flow case.

To estimate

The bulk flow is the material that makes up the inflows to and outflows from the system, carrying tracers and other species of material with it. In hydrologic applications, the bulk flow is typically water. As is common in hydrology, we assume that the water is incompressible, so we can refer to units of volume for convenience; however, the framework is valid for any conservative bulk flow as long as fluxes, storages, and concentrations are expressed in consistent units.

The conservation equation for the bulk flow can be obtained by considering an incremental volume

Integrating

Consider a conservative solute or tracer that travels ideally with the bulk material. We can define

We can easily generalize Eq. (

We can account for this fractionation in a simple way by assuming that the concentration of the solute in outflow

Second, we can account for first-order reactions. The change in

Including fractionation and the reaction terms in the solute conservation law gives

The actual outflow concentration at time

The equations above cannot be solved on their own, as

Given a volume of age-ranked storage

By taking the derivative of the equation above and applying the chain rule, we can see that

The SAS functions needed to represent a particular system are typically obtained by first choosing a functional form from those presented below and then tuning the parameters of that functional form such that the model predictions match the tracer observations. It should be noted that there is currently an element of subjectivity and imprecision here, as multiple functional forms may produce equally acceptable fits to the available data

The SAS function must be specified so that it accurately captures how a system turns over, releasing storage as bulk outflow. At present, three continuous distributions commonly used for specifying SAS functions are available as built-in options in mesas.py: the beta, Kumaraswamy, and gamma distributions. More details on each distribution are given below.
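As a rough illustration of how such a distribution acts as an SAS function, the sketch below evaluates a Kumaraswamy CDF, F(x) = 1 - (1 - x^a)^b on [0, 1], over age-ranked storage. It assumes, purely for illustration, that the distribution is stretched over the interval from zero to a hypothetical total storage `storage`; the function names are not part of mesas.py, and the package's own parameterization may differ.

```python
def kumaraswamy_cdf(x, a, b):
    """CDF of the Kumaraswamy distribution on [0, 1]: 1 - (1 - x**a)**b."""
    x = min(max(x, 0.0), 1.0)  # clamp to the support
    return 1.0 - (1.0 - x ** a) ** b


def sas_cdf(s_t, storage, a, b):
    """Fraction of outflow drawn from the youngest s_t units of storage.

    Illustrative assumption: the distribution is stretched over [0, storage].
    """
    return kumaraswamy_cdf(s_t / storage, a, b)


storage = 200.0  # hypothetical total storage, in volume units
cases = [
    ("uniform", 1.0, 1.0),           # a = b = 1 reduces to uniform selection
    ("young-preferring", 0.5, 1.0),  # more outflow drawn from young storage
    ("old-preferring", 2.0, 1.0),    # less outflow drawn from young storage
]
for label, a, b in cases:
    # Share of outflow supplied by the youngest quarter of storage
    print(label, sas_cdf(50.0, storage, a, b))
```

With a = b = 1 the CDF is a straight line, so every unit of storage is equally likely to be selected; other parameter values shift the selection toward younger or older storage.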

These distributions each have at least two parameters: a

Briefly, the three aforementioned built-in distributions are as follows:

In addition, any distribution specified by the scipy.stats library can be used in mesas.py by converting its CDF into a piecewise linear form (see next section). This is done automatically within mesas.py. However, results obtained this way may be less accurate, because the piecewise linear form can be inaccurate where the PDF changes rapidly (as a gamma distribution does near

SAS functions can also be specified as a piecewise linear CDF with

mesas.py also allows an SAS function to be specified as a (time-varying) weighted sum of component SAS functions, each of which may be specified in any of the available ways. This approach was first suggested by

Given

The numerical implementation of the governing equations in mesas.py is reminiscent of a numerical finite-volume scheme. We will assume that time steps

First, age-step-averaged forms of the state variables are obtained by integrating in

The age-step-averaged TTD

Thus, we can express the governing equations as the set of ordinary differential equations (ODEs):

If the right-hand side of these equations were functions of the state variables

The numerical solution of these equations involves two core tasks:

estimating the rates of change

using these to estimate the state variables

Let us assume

By default, mesas.py uses the RK4 method to estimate the value of

The state variables are held in memory in arrays whose columns are times and whose rows are ages. Thus, stepping through time steps

mesas.py allows initial conditions to be specified for the age-ranked storage

mesas.py proceeds by solving all the time steps

Inputs to mesas.py come in two main forms:

parameters specifying the SAS function(s), solute properties, and other model settings and

time series of inflows, outflows, and other variables.

The parameters are specified using a nested data structure that can be stored in and read from a JSON-formatted text file or fed into a model object instance directly as a Python dictionary. The time series can be provided as a .csv text file or as a pandas data frame.

The parameter data structure consists of a dictionary of key:value pairs, where a “key” is an immutable label (typically a string) and a “value” is an object that can be retrieved from the dictionary using the associated key. The values can themselves be dictionaries, allowing for a nested structure to the data.
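The nesting and the equivalence between the JSON and dictionary forms can be illustrated with the standard library alone. In the minimal sketch below, every key name is a hypothetical placeholder chosen for illustration and is not the set of keys that mesas.py expects; the actual keys are described in the following paragraphs and in the package documentation.

```python
import json

# All key names here are hypothetical placeholders, not mesas.py keys.
params = {
    "example_top_level_key": {
        "example_flux": {
            "example_sas_function": {
                "distribution": "gamma",
                "parameters": {"shape": 0.5, "scale": 100.0},
            }
        }
    },
    "example_settings": {"time_step": 1.0},
}

# Serialize to JSON text (as it would be stored in a file) and read it back
text = json.dumps(params, indent=2)
restored = json.loads(text)

# Values are retrieved by chaining keys through the nested dictionaries
shape = restored["example_top_level_key"]["example_flux"][
    "example_sas_function"]["parameters"]["shape"]
print(shape)
```

Because the round trip through JSON returns an equal dictionary, the same specification can be kept in a text file or built programmatically.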

The top-level dictionary in the parameter specification must have a key

A basic example of the

The

The relationship between the parameters in Eqs. (

Each of the keys naming a bulk flux in

Presently, each SAS function can be specified in three different ways:

as a gamma, beta, or Kumaraswamy distribution;

using any distribution from scipy.stats;

as a piecewise linear CDF.
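To make the piecewise linear option concrete, the following sketch evaluates such a CDF by linear interpolation between breakpoints. It is a standalone illustration, not the mesas.py implementation; the function name and argument names are assumptions.

```python
from bisect import bisect_right


def piecewise_linear_cdf(s_t, breakpoints, cum_probs):
    """Evaluate a piecewise linear CDF at age-ranked storage s_t.

    breakpoints: strictly increasing age-ranked storage values
    cum_probs:   non-decreasing cumulative probabilities, from 0 to 1
    """
    if s_t <= breakpoints[0]:
        return cum_probs[0]
    if s_t >= breakpoints[-1]:
        return cum_probs[-1]
    # Index of the segment whose left breakpoint is at or below s_t
    i = bisect_right(breakpoints, s_t) - 1
    frac = (s_t - breakpoints[i]) / (breakpoints[i + 1] - breakpoints[i])
    return cum_probs[i] + frac * (cum_probs[i + 1] - cum_probs[i])


# A single segment from (0, 0) to (100, 1) is a uniform SAS function
uniform = piecewise_linear_cdf(25.0, [0.0, 100.0], [0.0, 1.0])

# Two segments: 60 % of outflow comes from the youngest 20 units of storage
two_seg = piecewise_linear_cdf(60.0, [0.0, 20.0, 100.0], [0.0, 0.6, 1.0])
print(uniform, two_seg)
```

Approximating a smooth distribution this way amounts to sampling its CDF at the chosen breakpoints, so accuracy depends on how densely the breakpoints cover regions where the CDF curves sharply.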

The gamma, beta, or Kumaraswamy distributions are coded into the core computational code, whereas the scipy.stats distributions are approximated as piecewise linear CDFs. In either case, the distribution is selected based on the value associated with

The distribution parameters are given by the dictionary associated with the key

To use a distribution from scipy.stats the key:value pair

Description of the keys that may optionally be associated with each solute in the

Alternatively, the SAS function can be specified as a piecewise linear CDF. In the example above, this option is used to specify a uniform SAS function for ET using a single linear segment. The cumulative age-ranked storage values

Solute properties are given in a dictionary associated with the top-level key

In this case, mesas.py will look for a time series of solute inflows in column

Description of the keys that may optionally be in the

Additional options can also be set in the

The time series input can be provided as a .csv file or as a Python pandas data frame. The order of the columns is not important, but the column names should be consistent with references to time series data in the SAS function specification, solute parameters, and options.
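Before running a model, it can be useful to confirm that every column referenced by the parameter specification is present in the time series. The sketch below does this for a .csv input using only the standard library; the column names and the helper `missing_columns` are hypothetical and stand in for whatever names a given specification refers to.

```python
import csv
import io

# Hypothetical time series; column order is deliberately arbitrary.
csv_text = (
    "Q,C_in,J,ET\n"
    "1.0,10.0,2.0,0.5\n"
    "1.2,11.0,0.0,0.6\n"
)


def missing_columns(csv_file, required):
    """Return the required column names absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    present = set(reader.fieldnames or [])
    return sorted(set(required) - present)


# All referenced columns exist, regardless of their order in the file
ok = missing_columns(io.StringIO(csv_text), ["J", "Q", "C_in"])

# A referenced column that is not in the file is reported by name
bad = missing_columns(io.StringIO(csv_text), ["J", "Q", "C_out_obs"])
print(ok, bad)
```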

For example, to be consistent with the specifications given in the example in Sect.

After running, the output time series would include the following new columns:

Results of the benchmark runs under steady flow. Panels

The model is set up and run by instantiating a model object provided with all the needed input data and then calling its

The time series inputs and outputs will then be available in a data frame accessible as an attribute of the model object. For example, this would allow the user to plot the input and output concentrations.

Users can access further results by employing accessor functions. These can return the values for a particular time step, age step, or input time. The latter is useful for examining how water that entered at a particular time evolves in time. If none of these are given, the entire array is returned. Both density (

More information on these functions is available in the documentation (see the “Code and data availability” section of this paper).

To validate the numerical implementation, mesas.py was tested against several analytical benchmark solutions. Six of these are analytical solutions for different SAS functions under steady-flow conditions. Additional benchmark solutions for unsteady flow are identical to those presented for tran-SAS in

Analytic solutions for the continuous and discrete TTD for a number of benchmark SAS functions with uniform, gamma, and beta distributions. The discrete form is obtained by averaging the value of

For certain SAS functions, it is possible to find a closed-form expression for the corresponding TTD under steady flow. For the six cases considered here, the details of the derivations are given in Appendix A and the mathematical results are listed in Table

The six cases (also shown in the top row of Fig.

To assess the validity of our implementation of the numerical solution against these closed-form expressions, we can either (a) use very fine time steps, and thus more closely approximate the continuous result, or (b) find an analytical form of the discrete solution. The latter is preferable, as we can compare the numerical and analytical results directly, rather than asymptotically at the limit of small time steps. We have, therefore, taken the additional step of obtaining discrete versions of each expression (rightmost column of Table

In each scenario, the flow rate was set to

Figure

Comparison of mesas.py and tran-SAS for the case with uniform sampling and

When

In the partial piston case, errors are initially very low but suddenly become much larger later in the simulation. This occurs when the age-ranked storage of the oldest water in storage (that which entered in the first time step) reaches the “spike” on the right-hand end of the distribution. That is, using the definitions in Table

The power of the SAS approach comes from its ability to handle time-variable inflows and outflows. The bottom row of Table

This benchmark was used to validate the mesas.py code for the same dataset that

Figure

Figure

Variations of mesas.py applied to the time series and SAS model from

Comparison of mesas.py with tran-SAS for a benchmark problem presented by

Closer inspection of the residuals between the model concentration predictions and the analytical benchmarks reveals differences between the performance of tran-SAS and mesas.py. Figure

These differences become larger when we consider the error in the solute mass flux, as shown in Fig.

The differences can be almost entirely attributed to the fact that tran-SAS provides estimates of the

mesas.py performs better than tran-SAS for other configurations of the problem, although the size of the difference changes. The normalized RMSEs of each implementation are shown in Fig.

We can also compare the performance of these models for a case in which the discharge SAS function is not uniform. Again, following

The time series dataset includes columns

As analytical solutions are unavailable for this more general case, the results obtained from tran-SAS and mesas.py were compared against a higher-accuracy mesas.py solution (obtained by setting

Comparison of the computational time and time series length for three mesas.py and two tran-SAS numerical computation settings. The plot shows the mean computational time and 95 % confidence interval for each case, represented by a line and corresponding colored region, respectively. The mesas.py cases include the EF (forward Euler), midpoint (midpoint method), and RK4 (fourth-order Runge–Kutta) methods, whereas the tran-SAS cases include the EF* (modified forward Euler) and ODE (MATLAB ODE solver) methods.

Computation time was compared by running all models for a benchmark configuration (with various lengths of time series) on a single-core Intel Xeon Gold Cascade Lake 6248R CPU with 192 GB DDR4 2933 MHz RAM. The benchmark configuration is the uniform sampling discharge SAS function case described above. Each configuration was run 100 times to establish confidence bounds on the run time.
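For readers unfamiliar with the three mesas.py schemes named above (forward Euler, midpoint, and RK4), the sketch below applies textbook versions of each to a toy first-order decay problem, dy/dt = -k y, with a known exact solution. It is not the mesas.py code and does not solve the SAS equations; the decay rate, step size, and function names are assumptions chosen for illustration.

```python
import math

K = 0.3  # hypothetical first-order rate constant


def rhs(t, y):
    """Right-hand side of the toy problem dy/dt = -K * y."""
    return -K * y


def step_euler(f, t, y, h):
    # One rate evaluation per step (first-order accurate)
    return y + h * f(t, y)


def step_midpoint(f, t, y, h):
    # Two rate evaluations per step (second-order accurate)
    k1 = f(t, y)
    return y + h * f(t + h / 2, y + h / 2 * k1)


def step_rk4(f, t, y, h):
    # Four rate evaluations per step (fourth-order accurate)
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(step, f, y0, h, n_steps):
    """Advance y from t = 0 over n_steps steps of size h."""
    t, y = 0.0, y0
    for _ in range(n_steps):
        y = step(f, t, y, h)
        t += h
    return y


exact = math.exp(-K * 10.0)
schemes = {"euler": step_euler, "midpoint": step_midpoint, "rk4": step_rk4}
errors = {
    name: abs(integrate(step, rhs, 1.0, 1.0, 10) - exact)
    for name, step in schemes.items()
}
for name, err in errors.items():
    print(name, err)
```

The higher-order schemes need more rate evaluations per time step, which is the source of the trade-off between run time and accuracy.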

As the results in Fig.

SAS transport theory provides a very general framework for modeling material transport through control volumes

The code is also intended to be user-friendly. A number of resources are available, including a free online course via HydroLearn. The course, entitled “JHU 570.412 Tracers and transit times in time-variable hydrologic systems: A gentle introduction to the StorAge Selection (SAS) approach” can be found at

Further work is needed to augment the code with additional tools. Three sets of tools are particularly important. The first is tools for generating ensembles of input concentration data: in hydrology, observations of input concentrations are often bulk samples that represent amount-weighted averages over multiple time steps, and these must be disaggregated to the resolution at which we want to run the model. The second is tools for parameterizing SAS functions and fitting them to data, preferably in a way that can adapt to any specification of the SAS function. The third is tools for assessing uncertainty in the disaggregated inputs, the SAS function shape, and the model predictions.

At steady state with one outflow and

Alternatively, we can use the fact that

For example, for a uniform distribution

A similar set of steps can be used to show that an exponential SAS function (which is a special case of a gamma distribution with shape parameter

To obtain a discrete form of the analytical solution, we can make two assumptions. First, that the time series of inflow concentrations and of water inflows and outflows are in fact constant within a time step

Second, we assume that the numerical estimates of the outflow concentration time series should reflect the average value of the analytical solution over each time step. That is,

In the continuous form,

To obtain the discrete outflow concentrations, we must apply the time step averaging in Eq. (

For the elementary case of steady flow and uniform sampling, this gives

mesas.py v1.0 is open source and distributed under the terms of the MIT License. The code is available on Zenodo (

The most up-to-date version of mesas.py along with its dependencies can be installed into a new environment from the command line using conda with the following command:

The model can also be installed by building from the source code (see the documentation for the steps required). A Fortran compiler is required to do so (but is not required when installing through conda). Documentation for mesas.py is available at

CJH contributed to the conceptualization, methodology, formal analysis, investigation, software development, validation and evaluation, visualization, original draft preparation, funding acquisition, project administration, and supervision. EXF contributed to the software development, validation and evaluation, visualization, original draft preparation, and review and editing.

The contact author has declared that neither of the authors has any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

The authors are grateful to Eric Hutter, Oliver Evans, and Fei Lu for their contributions to the mesas.py code. We would also like to thank Paolo Benettin and the anonymous reviewer for their helpful reviews.

This work was supported by the US National Science Foundation Directorate of Geosciences (grant no. EAR-1654194).

This paper was edited by Carlos Sierra and reviewed by Paolo Benettin and one anonymous referee.