Oceanic dissolved inorganic carbon (

The ocean absorbs about a quarter of the anthropogenic carbon dioxide (

Exchange of

Many research questions require solving the marine carbonate system from some measured or modelled pair of its parameters. Several software tools have been developed for this purpose such that most scientific software environments and programming languages have a widely accepted marine carbonate system solver

As its name suggests, PyCO2SYS originates from the existing CO2SYS family of software. The original CO2SYS program for MS-DOS

As the original CO2SYS software is so well-established in the research field, we provide a relatively brief summary of the components of PyCO2SYS that are identical to CO2SYS-MATLAB in Sect.

The components of PyCO2SYS that have been inherited directly from CO2SYS-MATLAB v2.0.5

The abundances of all solutes and total alkalinity provided as arguments to PyCO2SYS or returned from it as results are in units of micromoles per kilogram (

Temperature is in degrees Celsius (

Pressure is in decibars (dbar) and represents the hydrostatic pressure exerted by the overlying water column, consistent with typical oceanographic conductivity–temperature–depth (CTD) measurement reporting. Atmospheric pressure is not included, so pressure is effectively zero in the laboratory and at the sea surface.

The pH can be provided on the free, total, seawater, and/or NBS (now NIST) scale, with

A notable feature of all CO2SYS software is the variety of different parameterisation options to calculate the various equilibrium constants and some components' total contents from salinity, temperature, and pressure. Which parameterisations the user selects can appreciably alter the results, so these choices should always be explicitly reported.

Parameterisations of the dissociation constants of carbonic acid available in PyCO2SYS and corresponding implicit settings (Table

Parameterisations that vary depending on the case of the selected carbonic acid constants (Table

Some of these options also influence other, seemingly unrelated, parameters of other chemical systems. This is not widely appreciated because it happens internally, hidden within the code. The most influential choice is for the carbonic acid dissociation constants,

Parameterisations that (except where noted) are not influenced by the case of the selected carbonic acid constants (Table

Other internal settings are consistent across all cases (Table

In addition to the carbonic acid equilibria, the user has multiple parameterisation options for each of the following: (i) the ratio between total borate and salinity, (ii) the bisulfate dissociation constant

Equilibrium constants in PyCO2SYS are all stoichiometric rather than thermodynamic and thus denoted with

calculated on the pH scale reported in the literature as a function of temperature and salinity at zero in-water pressure;

converted to the seawater pH scale (Appendix

corrected to the in situ pressure;

converted to the pH scale indicated by the user's input (Appendix
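The four evaluation steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PyCO2SYS implementation: the function names (`free_to_scale`, `pressure_factor`, `evaluate_constant`) are hypothetical, the NBS scale is omitted, and the numerical values in the usage example are approximate placeholders of a plausible magnitude rather than values taken from PyCO2SYS.

```python
import math

R_GAS = 83.14462618  # gas constant in cm3 bar / (K mol)


def free_to_scale(scale, tso4, tf, kso4, kf):
    """Factor that converts an acid dissociation constant from the free
    pH scale to `scale`.

    tso4, tf: total sulfate and fluoride contents (mol/kg);
    kso4, kf: bisulfate and HF dissociation constants on the free scale.
    """
    if scale == "free":
        return 1.0
    if scale == "total":
        return 1.0 + tso4 / kso4
    if scale == "sws":
        return 1.0 + tso4 / kso4 + tf / kf
    raise ValueError(f"unsupported pH scale: {scale}")


def pressure_factor(temp_c, pressure_dbar, delta_v, kappa):
    """Ratio K(P) / K(0) from a molal volume change (delta_v, cm3/mol)
    and a compressibility change (kappa, cm3/(mol bar))."""
    p_bar = pressure_dbar / 10.0  # in-water pressure: dbar to bar
    rt = R_GAS * (temp_c + 273.15)
    return math.exp((-delta_v + 0.5 * kappa * p_bar) * p_bar / rt)


def evaluate_constant(k_lit, lit_scale, user_scale, temp_c, pressure_dbar,
                      delta_v, kappa, acids_p0, acids_p):
    """Apply steps 1 to 4. `acids_p0` and `acids_p` are tuples
    (tso4, tf, kso4, kf) at zero and at in situ pressure."""
    # Step 1: k_lit is given on the literature scale at zero pressure.
    # Step 2: convert to the seawater scale.
    k_sws = (k_lit * free_to_scale("sws", *acids_p0)
             / free_to_scale(lit_scale, *acids_p0))
    # Step 3: correct to the in situ pressure.
    k_sws_p = k_sws * pressure_factor(temp_c, pressure_dbar, delta_v, kappa)
    # Step 4: convert to the user's scale using pressure-corrected kso4, kf.
    return (k_sws_p * free_to_scale(user_scale, *acids_p)
            / free_to_scale("sws", *acids_p))


# Usage with placeholder values (assumptions, for illustration only).
acids = (0.0282, 6.8e-5, 0.10, 2.4e-3)
k_surface = evaluate_constant(1.42e-6, "total", "total", 25.0, 0.0,
                              -22.3, -8.9e-4, acids, acids)
k_deep = evaluate_constant(1.42e-6, "total", "total", 25.0, 1000.0,
                           -22.3, -8.9e-4, acids, acids)
```

For simplicity the usage example reuses the same `acids` tuple at both pressures; a fuller treatment would pass pressure-corrected bisulfate and HF constants in `acids_p`.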

There are some exceptions to the evaluation steps listed above. First, the pH scale conversions (steps 2 and 4) are not applied to

In PyCO2SYS, users can also specify their own values for any or all of the equilibrium constants or total salt contents. Any values specified in this way are used as-is throughout PyCO2SYS: no pH scale or pressure corrections are applied, so it is left to the user to ensure that the values are provided on the appropriate pH scale and at the relevant temperature and pressure.

A useful feature of all CO2SYS software that can nonetheless cause confusion is the ability to perform calculations at “input” and “output” conditions, where “conditions” refers to temperature and pressure. The nomenclature overlaps unhelpfully: input and output are used firstly in a programming context, for arguments that are passed into functions and results that are returned from them, and secondly in a measurement context, for the temperatures and pressures at which the known parameter pairs are provided and at which results are to be calculated. For clarity, we therefore use the terms “arguments” and “results” in the programming context, while input and output always refer to the measurement context. Thus, we provide values at both input and output conditions as arguments to PyCO2SYS, and we receive calculations at both input and output conditions as results from the program.

Input and output conditions are used when measurements were conducted at a different temperature and/or pressure from what the sample would experience in situ or to evaluate the effect of changing these conditions on the solution chemistry. All core carbonate system parameters except for

If calculations are conducted using only in situ values, for example from model output or with the temperature and pressure corrections already applied, then output-condition arguments need not be supplied. Results are then calculated only under the input conditions for computational efficiency.

We refer to the parameters from which PyCO2SYS can solve the marine carbonate system as the “core” marine carbonate system parameters. These are

Overview of the process by which PyCO2SYS and other CO2SYS implementations solve the marine carbonate system (MCS) and calculate other results. Arguments provided by the user are shown as open symbols on a yellow background, while calculations and results use filled symbols. Components under input conditions are shown in light blue, those under output conditions are in red towards the right, and components that are independent of input/output conditions are in dark blue. Any pair of the parameters in the “MCS arguments” box at the top left can be provided, noting that only one of

To calculate its results (Fig.

Other properties of interest are subsequently calculated from whichever core parameters are most convenient under both input and (if provided) output conditions. These properties include all the individual components of alkalinity (Appendix

Solving the alkalinity–pH equation is a critical component of marine carbonate system modelling. Like other implementations of CO2SYS, PyCO2SYS uses the Newton–Raphson method. The general equation is

Unlike other implementations of CO2SYS, the equations that determine the relative abundances of different chemical species as functions of pH and their total contents (Appendix

The derivative term in Eq. (

Through our approach, the effect of every component of alkalinity in the main chemical speciation equation is included in the derivative term in Eq. (

Automatic differentiation is also used to evaluate chemical buffer factors, again ensuring that the influence of every modelled equilibrium system is accurately included. The calculated buffer factors are described in more detail in Sect.

A further advantage of the automatic differentiation approach is that if the main chemical speciation function is modified in the future, for example to include additional components of alkalinity, then these changes are automatically incorporated into all the alkalinity–pH solvers without needing to modify the various solver functions. In short, our approach ensures that PyCO2SYS calculations will remain internally consistent and reflect the influence of every solute and equilibrium modelled in the main chemical speciation function, even if this function is modified in the course of future development (Sect.
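The idea of driving a Newton–Raphson pH solver with an exact, automatically computed slope can be illustrated with a self-contained sketch. This is not the PyCO2SYS code: it replaces Autograd with a minimal hand-written forward-mode dual-number class, models only carbonate–borate–water alkalinity, and uses approximate, illustrative equilibrium constants. The names `Dual`, `alkalinity`, and `solve_ph`, and the default tolerance and maximum jump, are assumptions.

```python
import math


class Dual:
    """Minimal forward-mode dual number: a value and its derivative."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)

    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / o.val ** 2)

    def __rtruediv__(self, other):
        return Dual.lift(other) / self

    def __rpow__(self, base):
        # d/dx of base**x is ln(base) * base**x
        val = base ** self.val
        return Dual(val, math.log(base) * val * self.der)


# Approximate, illustrative constants (mol/kg); not taken from PyCO2SYS.
K1, K2, KB, KW = 10 ** -5.847, 10 ** -8.966, 10 ** -8.598, 10 ** -13.217
TB = 433e-6  # total borate


def alkalinity(ph, dic):
    """Carbonate-borate-water alkalinity as a function of pH and DIC."""
    h = 10.0 ** (-ph)
    carbonate = dic * K1 * (h + 2.0 * K2) / (h * h + K1 * h + K1 * K2)
    borate = TB * KB / (KB + h)
    water = KW / h - h
    return carbonate + borate + water


def solve_ph(ta, dic, ph=8.0, tol=1e-8, max_jump=1.0, max_iter=50):
    """Newton-Raphson on the alkalinity residual, slope from dual numbers."""
    for _ in range(max_iter):
        out = alkalinity(Dual(ph, 1.0), dic)  # value and d(TA)/d(pH)
        step = -(out.val - ta) / out.der
        step = max(-max_jump, min(max_jump, step))  # limit overshoot
        ph += step
        if abs(step) < tol:
            break
    return ph


ph_solved = solve_ph(ta=2300e-6, dic=2000e-6)
```

Because the slope is obtained by differentiating `alkalinity` itself, adding a further term to that function would change the solver's derivative with no edit to `solve_ph`, which is the property described above.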

PyCO2SYS changes how the alkalinity–pH solver decides when to stop iterating for vectorised arguments. In CO2SYS-MATLAB v2.0.5, the solvers continue to iterate and update all values until the change in every element of the array satisfies the

The maximum solver jump – which constrains the greatest change in pH possible between solver iterations, thus helping to prevent overshoot – is implemented slightly differently in PyCO2SYS than in other CO2SYS programs. In CO2SYS-MATLAB, any

PyCO2SYS fixes a simplification in earlier CO2SYS implementations regarding how pH scales are converted within the main chemical speciation function. This simplification is noted in the programmer's comments in the relevant CO2SYS-MATLAB functions, carried through from the original MS-DOS implementation

In short, pH and the equilibrium constants are provided to these functions on the same pH scale as each other – except for

This simplification makes a negligible difference to calculations at typical seawater pH (because [

Like most iterative solvers, the Newton–Raphson method (Sect.

Following

For clarity in the equations in this section, we abbreviate

First,

Following a scheme equivalent to

For clarity in the equations in this section, we abbreviate [

Carbonate–borate alkalinity as a function of

The initial

For clarity in the equations in this section, we abbreviate [

Carbonate–borate alkalinity as a function of

The initial

The contributions of ammonia and bisulfide to alkalinity

The total substance contents and stoichiometric dissociation constants for up to two additional acid–base systems that contribute to total alkalinity can be provided as arguments to PyCO2SYS and are part of its speciation model. The effects of these extra components are automatically incorporated into all PyCO2SYS calculations, including the iterative pH solvers (Sect.

Previous versions of CO2SYS used an old value for the universal gas constant (

Like CO2SYS-MATLAB v3.2.0, PyCO2SYS calculates the “substrate : inhibitor ratio” of

A buffer factor quantifies the sensitivity of a certain marine carbonate system parameter to a change in another parameter. Best known is the Revelle factor, which is the ratio of the fractional change in

Closely related to these buffer factors,

PyCO2SYS offers two independent ways to evaluate the various buffer factors of the marine carbonate system: with explicit equations and by automatic differentiation. The latter is used by default.

The “explicit” approach follows equations reported in the literature

The “automatic” approach uses automatic differentiation to find the derivative necessary to evaluate each buffer factor. The appropriate derivatives are taken from the functions that calculate a third carbonate system parameter from a known pair (Appendix

Of the buffer factors, only the Revelle factor was included in previous versions of CO2SYS. It was evaluated using finite-central-difference derivatives, which is replicated as the explicit option in PyCO2SYS (with the corrections described in Appendix

For conversions between

Atmospheric pressure can have a non-negligible effect on calculations in some regions: for example, over much of the Southern Ocean, atmospheric pressure is typically 3 % lower than the global mean, corresponding to a 10

This optional argument is only intended for modelling the effects of variations in atmospheric pressure on samples from the surface ocean or in the laboratory. It is not suitable for determining interior ocean

As well as solving from a pair of parameters, PyCO2SYS can be run with one or no marine carbonate system parameter arguments.

If no parameters are provided, then PyCO2SYS returns all the equilibrium constants and total salt contents that are calculated from temperature, pressure, and salinity (Sect.

If one parameter is provided, then the results that can be computed with that parameter alone are returned. This applies to pH,

The pH can be converted between the different scales without knowledge of a second carbonate system parameter. Therefore, if pH alone is provided to PyCO2SYS, it is converted to every pH scale under the input conditions (Appendix

Seawater

All arguments to PyCO2SYS, including settings, can be multidimensional. A combination of scalar and multidimensional arguments can be provided, with the latter formatted as NumPy

Schematic representation of broadcasting array shapes with NumPy in PyCO2SYS.

Propagating the uncertainty in an argument through to a result requires knowing the derivative of the result with respect to the argument. Uncertainty propagation is available for a subset of the arguments in the original MS-DOS CO2SYS

PyCO2SYS evaluates the derivatives using a finite-forward-difference approach. We use finite differences rather than automatic differentiation here because the latter, while possible, is computationally inefficient to apply over the entire PyCO2SYS program. We use forward- rather than central-difference derivatives because the former can be safely evaluated at zero for variables for which negative values are impossible (e.g. salinity). The derivative of a result

An example figure used to select a suitable

PyCO2SYS can conveniently obtain derivatives of all its results with respect to all of its arguments and also with respect to all parameters that are normally calculated internally from temperature, pressure, and/or salinity, such as equilibrium constants and total salt contents.

The derivatives are calculated by a function that wraps the entire PyCO2SYS program, rather than by adding extra internal variables that keep track of the effects of differences in the arguments, as has been implemented elsewhere

To determine the overall uncertainty in each result, the uncertainty components from different arguments are combined using
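The forward-difference approach to uncertainty propagation described in this section can be sketched as a wrapper around an arbitrary function. This is a minimal illustration, assuming independent uncertainties that combine in quadrature. The function names, the relative step size, and the toy `carbonate` function (with approximate constants) are all hypothetical and stand in for the full program.

```python
import math


def forward_derivative(func, args, name, rel_step=1e-6):
    """Forward-difference derivative of func(**args) w.r.t. args[name]."""
    value = args[name]
    # Use an absolute step when the argument is exactly zero, so that
    # non-negative variables are never evaluated at a negative value.
    step = abs(value) * rel_step if value != 0 else rel_step
    perturbed = {**args, name: value + step}
    return (func(**perturbed) - func(**args)) / step


def propagate(func, args, uncertainties):
    """Combine independent argument uncertainties in quadrature."""
    variance = sum(
        (forward_derivative(func, args, name) * sigma) ** 2
        for name, sigma in uncertainties.items()
    )
    return math.sqrt(variance)


def carbonate(dic, ph, k1=10 ** -5.847, k2=10 ** -8.966):
    """Toy 'result': carbonate ion content from DIC and pH."""
    h = 10.0 ** -ph
    return dic * k1 * k2 / (h * h + k1 * h + k1 * k2)


# Usage: uncertainty in carbonate ion from uncertainties in DIC and pH.
args = {"dic": 2000.0, "ph": 8.1}
sigma_co3 = propagate(carbonate, args, {"dic": 2.0, "ph": 0.01})
```

Wrapping the whole calculation in this way means the differentiation code never needs to know what happens inside `func`, at the cost of one extra function evaluation per argument.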

There are no “certified” results of marine carbonate system calculations against which software like PyCO2SYS can be validated. But we can test its internal consistency, and we can compare its results with the calculations of other programs and values reported in the literature.

PyCO2SYS is developed and hosted on GitHub (

The status badge for the validation tests, which are publicly visible at PyCO2SYS's GitHub repository (

For all versions of PyCO2SYS up to v1.8.0, the test suite runs on Python v3.7, 3.8, and 3.9. Other versions of Python may also work but are untested.

In a “round-robin” test, we first determine all of the core carbonate system parameters from one pair and then solve the system again using every possible pair of determined parameters. Under typical seawater conditions, we find the same results for every parameter pair to within better than the tolerance of the iterative pH solvers (i.e.
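The logic of a round-robin test can be shown with a much-reduced system. The sketch below is not the PyCO2SYS test suite: it considers only the hydrogen ion content and the three dissolved inorganic carbon species, for which every pair has a closed-form solution, and it uses approximate, illustrative values of the two carbonic acid dissociation constants. All function names are hypothetical.

```python
import itertools
import math

K1, K2 = 10 ** -5.847, 10 ** -8.966  # approximate, illustrative values


def speciate(dic, h):
    """Reference speciation from DIC and hydrogen ion content."""
    denom = h * h + K1 * h + K1 * K2
    return {
        "co2": dic * h * h / denom,
        "hco3": dic * K1 * h / denom,
        "co3": dic * K1 * K2 / denom,
        "h": h,
        "dic": dic,
    }


def solve_pair(known):
    """Solve the system from exactly two of 'co2', 'hco3', 'co3', 'h'."""
    k = dict(known)
    if "h" not in k:
        if "co2" in k and "hco3" in k:
            k["h"] = K1 * k["co2"] / k["hco3"]
        elif "hco3" in k and "co3" in k:
            k["h"] = K2 * k["hco3"] / k["co3"]
        else:  # co2 and co3 known
            k["h"] = math.sqrt(K1 * K2 * k["co2"] / k["co3"])
    h = k["h"]
    if "hco3" not in k:
        k["hco3"] = K1 * k["co2"] / h if "co2" in k else h * k["co3"] / K2
    k.setdefault("co2", h * k["hco3"] / K1)
    k.setdefault("co3", K2 * k["hco3"] / h)
    k["dic"] = k["co2"] + k["hco3"] + k["co3"]
    return k


def round_robin(dic, ph):
    """Largest relative mismatch over all pairs and all parameters."""
    reference = speciate(dic, 10.0 ** -ph)
    worst = 0.0
    for a, b in itertools.combinations(["co2", "hco3", "co3", "h"], 2):
        solved = solve_pair({a: reference[a], b: reference[b]})
        for name, value in reference.items():
            worst = max(worst, abs(solved[name] / value - 1.0))
    return worst


worst_mismatch = round_robin(dic=2000e-6, ph=8.1)
```

In this reduced system the mismatch is limited only by floating-point rounding; in the full system, pairs that require the iterative pH solver agree only to within the solver tolerance.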

Results of an example round-robin test with PyCO2SYS with default parameterisation options. Other conditions: salinity 33, temperature 22

If we include only the solution components that appear in the explicit equations for the buffer factors (i.e. zero nutrients and total salts, except for

Typically, one would not set the total salt contents to zero when computing buffer factors with the default automatic approach. As a consequence, differences between the explicit and automatic buffer factors may be larger than described above but still practically negligible: keeping nutrients at zero but using

The propagation of independent uncertainties using forward-difference derivatives (Sect.

We used CO2SYS-MATLAB v2.0.5

However, these CO2SYS-MATLAB versions do not permit solving with either carbonate or bicarbonate ion content as a known parameter, nor do they include ammonia or sulfide speciation. They also lack the parameterisations of

All equilibrium constants and total salt contents, calculated from salinity, temperature, and pressure, are virtually identical (absolute tolerance

If PyCO2SYS is adjusted to match CO2SYS-MATLAB v2.0.5, i.e. if the following points are true, then the differences between PyCO2SYS and CO2SYS-MATLAB calculations are virtually zero (no greater than

approximate slopes are used for the pH solvers, including only carbonate–borate–water alkalinity, instead of using automatic differentiation to determine these exactly (Sect.

pH solver tolerance is set to

the original approach to prevent overshoot from solver jumps in pH that are too great is used (Sect.

the iterative pH solver continues updating all elements until all pH changes fall beneath the tolerance threshold (Sect.

the pH scale conversion simplification is reinstated (Sect.

initial pH guesses are always set to 8 instead of using our extended

If the adjustments above, other than fixing the pH scale conversion simplification, are not made, then the differences between PyCO2SYS and CO2SYS-MATLAB v2.0.5 are up to the order of

Fixing the pH scale conversion simplification too (Sect.

Repeating the exercise above for CO2SYS-MATLAB v3.2.0 gives similar results, with differences negligible for all practical purposes. Only adjustments 1, 2, and 3 from the list above need to be made to PyCO2SYS in this case. With PyCO2SYS fully adjusted to match CO2SYS-MATLAB v3.2.0, differences in calculated values are still mostly less than

PyCO2SYS reproduces all the derivatives reported by

Across all combinations of optional parameters, mean uncertainties in

PyCO2SYS can be used to reproduce the closed-cell seawater titration datasets simulated by

The first titration dataset, without phosphate, is reproduced perfectly by PyCO2SYS to the number of decimal places reported by

0.45 g: pH either 6.5

0.60 g: pH either 6.366

1.25 g: pH either 5.54995

The other 48 data points in this titration agree perfectly. The noted discrepancies occur in non-consecutive data points and are therefore unlikely to all be associated with an error in a particular equilibrium. Moreover, each difference (underlined above) consists of one or two specific digits being switched or replaced rather than the entire number being different. We therefore conclude that these differences most likely represent minor typographical errors and that PyCO2SYS does accurately reproduce these simulations in full.

The aim of our revised scheme for initial pH estimates, following

Initial estimates (solid lines) and final solutions (dashed lines) of pH from known parameter pairs of total alkalinity (2.3 mmol kg

We find that the initial pH estimates determined according to the scheme described in Sect.

It is not strictly true that the marine carbonate system can always be solved from any pair of its parameters. Some combinations have multiple solutions. For example, both the

The iterative

Residuals between known

For the

Main components of

Which root the solver finds depends on the initial pH estimate and the residual alkalinity–pH slope at that point (Eq.

In typical open-ocean work this is largely academic: the true pH is usually around 8 and the higher root greater than 10, so a constant initial pH estimate of 8 would also return the correct root. But in more unusual environments, the new algorithm introduced here could help ensure that the solver identifies the correct root. It is possible for the user to specify a different initial pH estimate to control which root PyCO2SYS obtains (as we did to create Fig.

As noted previously

Solving from

Main components of

In PyCO2SYS,

Although a pressure correction for

However, recent developments in sensor technology are beginning to enable direct measurements of in situ

One does not choose to write code in Python for its computational speed. Therefore, while optimising performance was not ignored in developing PyCO2SYS, it was not a main focus. We compared the computational speed of PyCO2SYS against that of CO2SYS-MATLAB v3.2.0 across a few different tasks for reference purposes. We ran CO2SYS-MATLAB in both MATLAB itself (expensive, proprietary software) and GNU Octave, a free and open-source MATLAB clone.

Comparison of computational speed for various tasks with PyCO2SYS and CO2SYS-MATLAB running in both MATLAB and GNU Octave. Values shown are the mean

The different tasks are described in the subsequent sections, and the results are summarised in Table

Overall, the PyCO2SYS computation time has the same order of magnitude as CO2SYS-MATLAB, but it is generally somewhat slower. However, the difference is negligible in practice for relatively small datasets (up to about 10

The “all combinations” task was the validation test described in Sect.

CO2SYS-MATLAB completed this task in a very similar time in both MATLAB and GNU Octave, with the latter slightly faster, and PyCO2SYS took about 1.5 times longer (Table

In this task, (Py)CO2SYS was run across the entire GLODAPv2.2021 Merged Master File

This calculation is an example in which results would only be required under one set of temperature and pressure conditions rather than needing to evaluate both input and output conditions. This allows PyCO2SYS to be used more efficiently, as it only calculates output-condition results if they are explicitly requested (Sect.

The results in Table

The Autograd package that PyCO2SYS uses for automatic differentiation is still being maintained, and its most recent release (v1.3, July 2019) is stable, but it is no longer in active development. Its successor, JAX

As future developments are made to PyCO2SYS, we will aim to maintain consistency with other CO2SYS-family tools but cannot guarantee that all new features or updates will be added simultaneously across all implementations. In practice, the workload required to achieve this is not currently feasible, and we would not wish to hold back development because of the time required to replicate changes across multiple implementations. That said, the results should remain consistent enough that users can select which implementation to use based on their preferred software environment rather than the other way around.

This ambition could also extend beyond the CO2SYS family of software. Independently developed tools for solving the marine carbonate system exist in other languages, such as seacarb in R

Thanks largely to the efforts of

As development of PyCO2SYS continues, we do not anticipate changing its fundamental approach to solving the marine carbonate system, but we will try to incorporate the latest research, including keeping up to date with new parameterisations, for example of stoichiometric equilibrium constants

Through all these efforts, we aim to ensure that PyCO2SYS remains a reliable and comprehensive tool for analysing seawater chemistry from samples and experiments in the laboratory through to the changing marine carbonate system across the global ocean.

The pH scales in PyCO2SYS are free (pH

Each equation here is written assuming that

Total alkalinity (

Undissociated

Further deprotonation of

Further deprotonation of

Undissociated

The reactions and equations for the second additional component

Though the definition of alkalinity

Here, we lay out all the equations that are used to convert between different carbonate system parameters in PyCO2SYS. These follow long-established approaches from the literature

As the stoichiometric equilibrium constants are converted to the user-specified pH scale, i.e. consistent with the pH values, pH and

If one of

For known

For known

For known

The calculation steps given below for

An initial pH estimate is determined as described in Appendix

The components of

First, we determine

There is an upper limit on pH for each given

An initial pH estimate is determined as described in Appendix

An initial pH estimate is determined as described in Appendix

An initial pH estimate is determined as described in Appendix

First,

First, pH is calculated from

First, pH is calculated from

First, pH is calculated from

First,

First,

First,

First, pH is calculated from

First,

The pH is then calculated from

First,

The pH is then calculated from

Calcite and aragonite saturation states (

The “substrate : inhibitor ratio” of

To evaluate the buffer factors of

For the saturation-state buffers

The approach taken here avoids AD evaluations over the iterative solvers because, while possible, that is computationally slower than over non-iterative functions.

The Revelle factor

To evaluate the isocapnic quotient (

Finally, the “released

For clarity in the equations in this section, we abbreviate

Following

The initial

Older versions of CO2SYS-MATLAB, including v2.0.5

Rather than being corrected explicitly in PyCO2SYS, these errors are corrected automatically thanks to the approach of using automatic differentiation instead of finite-difference derivatives. The key errors in the original CO2SYS-MATLAB implementation of the finite-difference approach are the following.

An incorrect reference

Under output conditions, the Peng correction is not included in the evaluation of the Revelle factor (Sect.

The lower accuracy of the finite-difference method relative to automatic differentiation, particularly given the relatively large

Fixed

The computational speed tests described in Sect.

The Python tests were run using Python v3.9.7, Autograd v1.3, NumPy v1.21.2, and PyCO2SYS v1.8.0.

The MATLAB tests were run using MATLAB R2019b (Update 9) and CO2SYS-MATLAB v3.2.0.

The GNU Octave tests were run using GNU Octave v6.3.0 via its command-line interface and CO2SYS-MATLAB v3.2.0.

The current version of PyCO2SYS is freely available from its GitHub repository at

No data sets were used in this article.

MPH was responsible for conceptualisation, methodology, software, validation, writing the original draft, and visualisation. ERL was responsible for software and writing (review and editing). JDS was responsible for software, validation, and writing (review and editing). DP was responsible for software and writing (review and editing).

The contact author has declared that neither they nor their co-authors have any competing interests.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We thank Doug Wallace for providing useful comments on this paper, and we acknowledge his important role in the creation of the original CO2SYS software. We further acknowledge the developers of all subsequent versions of CO2SYS upon whose work PyCO2SYS was built. We thank Luke Gregor, Daniel Sandborn, and Abigail Schiller for code contributions including extending the range of data types with which PyCO2SYS can be used. We are grateful to Guy Munhoven and James Orr for their detailed and constructive reviews.

This paper was edited by Paul Halloran and reviewed by James Orr and Guy Munhoven.