Biogeochemical (BGC) models are widely used in ocean simulations for a range of applications but typically include parameters that are determined based on a combination of empiricism and convention. Here, we describe and demonstrate an optimization-based parameter estimation method for high-dimensional (in parameter space) BGC ocean models. Our computationally efficient method combines the respective benefits of global and local optimization techniques and enables simultaneous parameter estimation at multiple ocean locations using multiple state variables. We demonstrate the method for a 17-state-variable BGC model with 51 uncertain parameters, where a one-dimensional (in space) physical model is used to represent vertical mixing. We perform a twin-simulation experiment to test the accuracy of the method in recovering known parameters. We then use the method to simultaneously match multi-variable observational data collected at sites in the subtropical North Atlantic and Pacific. We examine the effects of different objective functions, sometimes referred to as cost functions, which quantify the disagreement between model and observational data. We further examine increasing levels of data sparsity and the choice of state variables used during the optimization. We end with a discussion of how the method can be applied to other BGC models, ocean locations, and mixing representations.

Biogeochemical (BGC) models used in global and regional ocean simulations often contain tens to hundreds of uncertain parameters (e.g.,

While not the focus of the present study, we face other challenges when attempting to calibrate BGC model parameters, including limitations on available data and parameter dependencies. Due to the size of the ocean, the vast range of relevant temporal and spatial scales, and the complexity of the marine ecosystem, in situ observations tend to be sparse and often do not include all quantities present in BGC models. In some situations, there is a drastic difference between the number of observed quantities and the number of model state variables. With limited data, it can be difficult to constrain parameters related to the unobserved state variables. Data sparsity also compounds the difficulty of handling parameter dependence: observations typically capture particular quantities of interest rather than constraining the individual processes that the BGC model parameterizes. For example, observing concentrations of phytoplankton chlorophyll alone could make it difficult to accurately estimate growth and mortality rates, since both can be changed (decreased and increased, respectively) to produce a similar effect on the plankton populations. Similarly, both increased growth rates and lower remineralization rates could have the same effect on nutrient concentrations.

In the present study, we address the challenge of calibrating a large set of parameters for a coupled biophysical model by describing and demonstrating a computationally efficient ocean BGC parameter estimation method that takes into account multiple sites and multiple variables. We perform an initial global search of the model parameter space to determine appropriate starting points for subsequent gradient-based local optimizations. The parameter values giving the best locally optimized solution are then taken as the final parameters. We demonstrate the approach by simultaneously optimizing 51 uncertain parameters in a 17-state-variable BGC model at sites in the subtropical North Atlantic and Pacific. To calibrate the BGC model, we couple it to a one-dimensional (1D) vertical ocean mixing model and match several observational fields at each site. We verify the accuracy of the method using a twin-simulation experiment (TSE), where we estimate known model parameters from synthetic data generated by a reference model simulation. Subsequent to verification using the TSE, we use the method to estimate parameters for the two sites individually and together using real-world observational data.

The present study extends prior efforts to use optimization methods in BGC model parameter estimation.

In general, these and other studies have found that local and gradient-based methods fare poorly in the optimization of BGC models.

Despite their successes, however, global and gradient-free methods can be prohibitively computationally expensive when estimating many uncertain parameters in BGC models coupled to physics-based representations of ocean mixing across multiple ocean locations. In some studies, the number of parameters estimated has been reduced to control the computational cost. For example,

Further attempts to overcome the computational cost of BGC parameter estimation include the use of physics-based surrogate models that represent realistic ocean mixing at a substantially reduced cost compared to 3D time-resolved simulations. For example,

Ultimately, while previous attempts have been made to calibrate large BGC models, to simultaneously represent multiple sites, and to use physics-based surrogate models, the present study is the first where these three challenges are addressed simultaneously. In particular, we outline a framework for estimating many uncertain parameters in a complex BGC model across a range of ocean conditions, using a physics-based model for vertical ocean mixing in a 1D water column configuration. We demonstrate the method for the 17-state-variable BGC flux model presented in

This paper provides a description of the proposed methodology for estimating model parameters in Sect.

In the present study, we treat BGC model parameter estimation as an optimization problem, where we seek to minimize the error between observational data and the coupled BGC–vertical mixing model. We thus define a generic objective function,

There are many possible ways to define the error function
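As one illustrative possibility (the study's exact formulation may differ), the error for each target field can be taken as an RMSD normalized by the standard deviation of the observations, so that fields with very different magnitudes contribute comparably, and then summed over fields and sites. The function names and the normalization choice in this sketch are assumptions:

```python
import numpy as np

def normalized_rmsd(model, obs):
    """RMSD between model and observed profiles, normalized by the
    standard deviation of the observations so that fields with very
    different magnitudes (e.g., chlorophyll vs. oxygen) are comparable."""
    model, obs = np.asarray(model, float), np.asarray(obs, float)
    return np.sqrt(np.mean((model - obs) ** 2)) / np.std(obs)

def objective(model_fields, obs_fields):
    """Total error: sum of normalized field errors over all target fields
    (and, for multi-site calibrations, over all sites)."""
    return sum(normalized_rmsd(m, o) for m, o in zip(model_fields, obs_fields))
```

A perfect model reproduction of every field yields an objective value of zero, and the normalization prevents any single high-magnitude field from dominating the total.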

Given the objective function in Eq. (

We use the open-source numerical analysis library DAKOTA

Schematic showing the coupling between DAKOTA, BFM17, and POM1D. The schematic shows observational data from BATS and the Bermuda test bed mooring, although data from HOTS are also used in the present study. The solid lines show the flow of information including parameter values and data. The dotted gray lines show how the different model components are run. DAKOTA calls the interface, which evaluates the model then calculates the current value of the objective function. The coupled biophysical model, BFM17

Figure

Leveraging the flexibility inherent in DAKOTA, we perform the parameter estimation using a hybrid optimization approach that incorporates both global (i.e., gradient-free) and local (i.e., gradient-based) methods. This hybrid approach is necessary to estimate a large number of uncertain parameters in complex BGC models while minimizing the required number of simulations, which can become expensive when the BGC model is coupled to a single- or multi-dimensional physical model and applied to various ocean locations. In total, there are three distinct steps in the present approach:

We randomly sample the parameter space

We sort the

We compare the final objective function values after gradient-based optimization for the

To implement the first step of the method in DAKOTA, we used the Latin hypercube sampling algorithm to perform an efficient global search. For the gradient-based optimization in step two, a range of possible methods is available in DAKOTA. After testing various such methods, including the conjugate gradient method, we chose the quasi-Newton (QN) optimization algorithm included in the Opt++ library within DAKOTA. This is a C++ class library that uses object-oriented programming for nonlinear optimization
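The three-step hybrid procedure can be sketched with standard scientific Python tools. This is a schematic stand-in for the DAKOTA workflow, not the actual implementation: the sample counts, the number of retained starting points, and the L-BFGS-B optimizer (substituting for the Opt++ quasi-Newton method) are chosen purely for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import qmc

def hybrid_optimize(objective, lower, upper, n_samples=1000, n_best=5, seed=0):
    """Step 1: Latin hypercube sample the bounded parameter space.
    Step 2: rank the samples by objective value.
    Step 3: run a local quasi-Newton optimization from each of the
    best samples and return the overall best result."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    sampler = qmc.LatinHypercube(d=len(lower), seed=seed)
    samples = qmc.scale(sampler.random(n_samples), lower, upper)
    scores = np.array([objective(x) for x in samples])
    starts = samples[np.argsort(scores)[:n_best]]
    results = [minimize(objective, x0, method="L-BFGS-B",
                        bounds=list(zip(lower, upper)))
               for x0 in starts]
    return min(results, key=lambda r: r.fun)
```

Because each sampled evaluation and each local optimization is independent, both stages parallelize naturally, which is the property exploited in the present method.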

Finally, we also explored the use of a genetic algorithm in the present parameter estimation method. However, for the 51-parameter BGC model that is the focus of the present demonstration, we estimated that a large and computationally expensive minimum population size of 366 members would be necessary. Because genetic algorithms cannot generally be parallelized to the same extent as Latin hypercube sampling, we were able to realize greater computational efficiency with the present approach. However, genetic algorithms do hold promise and could be incorporated with the present method in future work.

We demonstrate the parameter estimation method described in the previous section with a BGC flux model that has 17 state variables and 51 free parameters, referred to as BFM17

Figure

Both BFM17 and its larger precursor BFM56 use a chemical functional family (CFF) approach to model the marine ecosystem

Schematic of the 17 variables included in BFM17 as well as the interactions between the variables (indicated by red text). The schematic is taken from

As described in more detail in

Figure

The physical mixing model, POM1D

We configured the coupled BFM17

The observational state variables,

The mid-Atlantic implementation of BFM17

Phytoplankton chlorophyll (

Observations collected at BATS reveal substantial seasonal variability in chlorophyll, while dissolved oxygen, nitrate, phosphate, and particulate organic nitrogen exhibit relatively uniform concentrations all year round (Fig.

Figure

In summary, while the general bloom dynamics at BATS are captured by the baseline implementation of BFM17 from

A contrasting subtropical Pacific site was implemented in BFM17

Phytoplankton chlorophyll (

Observations collected at the HOTS location show fairly uniform chlorophyll, nitrate, phosphate, and PON concentrations across the seasonal cycle (Fig.

The baseline model predicts similar temporally uniform distributions of chlorophyll, nitrate, phosphate, and PON as the observations (Fig.

In summary, Fig.

To verify the effectiveness of the parameter estimation method in reproducing known parameter values, we perform a TSE using model-generated fields from BFM17

Figure

Results of the 51-parameter TSE and a single-perturbation sensitivity analysis. The TSE results show

To assess why certain parameters were not fully recovered, we performed a sensitivity analysis. For this analysis, we ran the coupled biophysical model with each parameter perturbed
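A single-perturbation (one-at-a-time) sensitivity analysis of this kind can be sketched as below. The 1 % relative perturbation is a placeholder for illustration, not necessarily the perturbation size used in the study, and the sketch assumes all parameters are nonzero so that a multiplicative perturbation is meaningful:

```python
import numpy as np

def one_at_a_time_sensitivity(objective, params, rel_perturbation=0.01):
    """Perturb each parameter individually by a fixed relative amount and
    record the resulting change in the objective function, as a simple
    local measure of how sensitive the model output is to that parameter."""
    params = np.asarray(params, dtype=float)
    base = objective(params)
    sensitivities = np.empty(len(params))
    for i in range(len(params)):
        perturbed = params.copy()
        perturbed[i] *= 1.0 + rel_perturbation  # assumes params[i] != 0
        sensitivities[i] = abs(objective(perturbed) - base)
    return sensitivities
```

Parameters with near-zero sensitivity values are exactly those the optimizer has little incentive (and little gradient information) to recover, consistent with the TSE results.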


These results indicate that the parameter estimation method was successful in recovering the most sensitive parameters, while the least-sensitive parameters were not fully recovered. The optimizer and model correctly interface, and the optimization method performs as expected. With confidence in the optimizer and the interface, we performed an additional TSE that more closely mimics the calibration studies that will be performed for the BATS and HOTS locations. In this TSE, the synthetic reference data are different in two ways. First, the synthetic data are monthly averaged profiles of concentrations from the last year of daily data from a 3-year simulation using the baseline parameter values from
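The monthly averaging of the final year of daily output used to build the synthetic reference data can be sketched as follows; the array shapes and the assumption of a 365-day model year without leap days are illustrative:

```python
import numpy as np

# Days in each month of an assumed 365-day model year (no leap days).
DAYS_PER_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def monthly_mean_profiles(daily_profiles):
    """Average daily depth profiles, shape (365, n_depth), into
    monthly mean profiles, shape (12, n_depth)."""
    daily_profiles = np.asarray(daily_profiles, dtype=float)
    bounds = np.cumsum([0] + DAYS_PER_MONTH)
    return np.stack([daily_profiles[a:b].mean(axis=0)
                     for a, b in zip(bounds[:-1], bounds[1:])])
```

Averaging the reference data in this way mimics the temporal resolution of the real observational data used in the BATS and HOTS calibrations.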

Figure

There are several other important differences between the results of this TSE and the prior TSE summarized in Fig.

To further disentangle the reasons for the differences between the TSEs shown in Figs.

Ultimately, since we are able to recover parameter values in both TSEs across the range of sensitivity values, we do not exclude any parameters in the subsequent model calibrations using data from BATS and HOTS. In general, the results from the TSEs provide confidence that the proposed optimization method will be able to drive the model parameter values in the direction of improved agreement with the observational reference data.

For the single-site parameter estimations at both the BATS and HOTS locations, we used the method described in Sect.

Finally, it should be noted that some optimization and parameter estimation studies include replicate experiments. However, we have chosen not to perform such experiments in the present study because of the random nature of the initial global search in the 51-dimensional parameter space considered here. That is, with only 25 000 samples in the Latin hypercube step, replicate experiments would almost certainly begin the second-phase gradient-based optimizations from entirely different sets of parameter values, resulting in different final parameter values. Based on the following parameter estimation results, however, it will be seen that the 25 000 samples in the initial global search are sufficient to ensure that the overall method gives better agreement with the observational data than the baseline values from

Normalized RMSD values (

Normalized absolute differences between the monthly averaged field data from model runs and the corresponding observational data from the BATS site. The top row corresponds to the baseline parameter set from

Figure

Improvements from the baseline to the calibrated model are reflected in Fig.

Taken together, Figs.

These results show that the parameter estimation method outlined in Sect.

Normalized absolute differences between the monthly averaged field data from model runs and the corresponding observational data from the HOTS site. The top row corresponds to the baseline parameter set from

For the parameter estimation at the HOTS location, Fig.

The difference fields in Fig.

The normalized absolute differences for oxygen in the second column of Fig.

The calibrated model gives substantially improved agreement with nitrate observations, with the normalized RMSD decreasing from 130.47 for the baseline model to 0.55 for the calibrated model. The calibrated model also gives more accurate phosphate concentrations, with an annual cycle that includes more seasonality. The phosphate has increased bottom concentrations from November through January, similar to the higher concentrations for November through March seen in the observational data. Similar to all other fields, the calibrated model significantly improves predictions of PON, whereas the baseline model overestimates the observations by a factor of approximately 3.

Overall, the parameter estimation method produced significantly better agreement with the observational data at the HOTS location. Moreover, we were able to produce generally similar errors at the BATS and HOTS sites. Table

We now use the parameter estimation method to calibrate parameters in BFM17 using observational data from the BATS and HOTS locations simultaneously. As with the single-site estimations, we performed an initial search of the parameter space using

The resulting model fields for the BATS location from the multi-site calibration are shown in Fig.

Overall, the agreement between the multi-site-calibrated model and the observational data at the BATS and HOTS locations is quite good, with errors comparable to results from the single-site estimations. The normalized combined model error was 184.14 for the baseline model, decreasing to 8.57 after model calibration. At the BATS site, predictions for all of the target fields are closer to the observations for the multi-site-calibrated model than the baseline model. For the HOTS site, the set of estimated parameters from the combined calibration improved agreement for all five target fields when compared to the baseline model and four of the fields when compared to the single-site optimization.

The multi-site-calibration results have slightly larger errors than the single-site optimization at the BATS location. Figure

Table

For the multi-site HOTS results, four of the five target fields (i.e., chlorophyll, oxygen, nitrate, and PON) are in better agreement with the observational data than the single-site results, with the only trade-off being increased error in phosphate. Chlorophyll, as observed in Fig.

Ultimately, although the present study serves primarily as a demonstration of the parameter estimation method, the parameter values from the multi-site calibration, summarized in Table

While generally outside the scope of the present study, the optimized parameter values could be analyzed to better understand the relationship between different ecological processes or the different sites. For example, the background attenuation coefficient,

We have formulated and demonstrated a method for simultaneously estimating a large number of uncertain parameters in complex BGC models, considering multiple state variables and ocean locations. The method is fundamentally based on numerical optimization, whereby the error is reduced between model and observational (or other reference) data. Both gradient-free and gradient-based optimization techniques are incorporated into the method to provide a broad exploration of the parameter space combined with the computational cost savings enabled by local gradient-based approaches. While the broad search and multiple local optimizations do not guarantee that the solution is a global minimum, they do reduce the possibility of becoming artificially trapped in regions of the parameter space based on inaccurate initial guesses by the user while still taking advantage of the computational efficiency of gradient-based methods.

As a demonstration of the method, we estimated the 51 parameters of BFM17

The present demonstration of the parameter estimation method is just one example of the many ways in which the method can be configured. For example, given additional computational resources, a user may choose to expand the number of initial random samples included in the gradient-free search of the parameter space or the number of subsequent gradient-based local optimizations. Even for the relatively modest number of samples and local optimizations used here, we were able to significantly improve model accuracy. In Appendix

Our proposed methodology also provides a general framework for sequentially probing parameter spaces in high-dimensional complex BGC models, followed by local optimizations. It can therefore be adapted in more substantial ways than simply changing particular optimizer configuration options. For example, while we found it computationally infeasible to run a genetic algorithm to convergence for this problem, truncated runs of that class of algorithms could be used instead of Latin hypercube sampling to identify multiple parameter sets that are then used to initiate local optimizations. This and other combinations of approaches are important directions for future study.

This study provides a method for determining the parameter values that provide the best possible fit to observational data, within the constraints of the dynamics represented by the BGC model itself. That is, the present method can be used to calibrate model parameters such that the dynamics represented in the model are the cause for any remaining data misfit. Previous studies have shown how model calibration can be used to determine the required set of dynamics

Finally, the present approach can be extended to replace POM1D with a higher-dimensional and more detailed physical model, such as a global circulation model (GCM). However, even with the cost savings enabled by a smaller BGC model such as BFM17, GCMs would still be extremely expensive to evaluate many tens of thousands of times, as is required even when using a gradient-based parameter estimation approach. It is common in optimization to use surrogate or lower-fidelity models to accelerate the optimization process, even when the intended application of the optimized parameters is a higher-fidelity simulation. In this sense, the current approach effectively employs POM1D as a physics-based, low-cost surrogate for a GCM.

There are a number of different ways that the parameter estimation method can be configured, with different choices of variables in the objective function, formulations of the objective function, and months included. In the following, we explore the effects of each of these choices, with the understanding that the method outlined in Sect.

Phytoplankton chlorophyll (

Due to the specific interest in phytoplankton as a primary producer affecting both the carbon cycle and the food web, we tested single- and multi-site calibrations based exclusively on phytoplankton chlorophyll. Figure

At both locations, Fig.

The multi-site-calibration results for chlorophyll show the way in which the parameter estimation method identifies parameters that balance the system behavior of the targeted communities. Comparing the single-site and multi-site-calibration results, the predictions for the BATS location correspond to greater chlorophyll concentrations at depth, with suppression of phytoplankton growth at the beginning of the year. By contrast, chlorophyll at the HOTS location is concentrated higher in the domain with more seasonality and slightly higher concentrations. Ultimately, model results for one site are skewed towards the behavior of the other site included in the calibration. Additional sites could be included in future work to obtain a more generic set of parameters.

We next examine the impact of changing the objective function in the parameter estimation method, specifically by varying the original formulation of

The first alternative formulation multiplies the squared difference values by the reference values before being cube-rooted, namely

Model results after parameter estimation using the alternative formulations of

In the case of both oxygen and nitrate, the fields produced by each of the formulations of

The PON fields for all formulations of

To compare the results quantitatively, Taylor diagrams with each of the alternative objective functions are shown in Fig.
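A Taylor diagram summarizes three related statistics of the model-observation comparison: the standard deviations of both fields, their correlation coefficient, and the centered (bias-removed) RMS difference. Independent of the plotting library actually used, these can be computed as in the sketch below, and they satisfy the law-of-cosines relation that underlies the diagram's geometry:

```python
import numpy as np

def taylor_statistics(model, ref):
    """Statistics plotted on a Taylor diagram: standard deviations of model
    and reference, their correlation coefficient, and the centered
    (bias-removed) RMS difference between them."""
    model, ref = np.asarray(model, float), np.asarray(ref, float)
    std_m, std_r = model.std(), ref.std()
    corr = np.corrcoef(model, ref)[0, 1]
    crmsd = np.sqrt(np.mean(((model - model.mean()) - (ref - ref.mean())) ** 2))
    # Geometric identity of the diagram:
    # crmsd**2 == std_m**2 + std_r**2 - 2 * std_m * std_r * corr
    return std_m, std_r, corr, crmsd
```

Because the centered RMSD removes the mean bias, a model run can appear close to the reference point on a Taylor diagram while still carrying a systematic offset, which is why the RMSD tables are reported alongside the diagrams.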

Phytoplankton chlorophyll (

Taylor diagram comparing model results at BATS for phytoplankton chlorophyll (

Phytoplankton chlorophyll (

To examine the effects of data frequency on the parameter estimation method, we performed three additional calibrations at the BATS location, omitting data from 2 or more months during the parameter estimation (all three calibrations used five target fields in the objective function). In the first case, we examined the importance of capturing the initiation of the spring bloom by excluding all data for the months of February and March. This can be viewed as a data corruption experiment, representing the case where data from certain observational periods are unreliable and must be excluded. In the next two cases, we test realistic, if non-ideal, observation strategies where data are (i) collected quarterly in February, May, August, and November or (ii) collected only during the initiation of the spring bloom in February and March.

Taylor diagram comparing model results at BATS for phytoplankton chlorophyll (

Figure

The Taylor diagram in Fig.

These results demonstrate that the annual cycle in the five target fields does not necessarily need to be observed on a monthly basis for the optimization to improve the model fit to the physical trends. Including the full data set did, however, produce the most representative parameter set. The results also highlight the danger of using data that are too sparse: the calibrations that either excluded or isolated the spring bloom did not produce error measures consistent with those obtained using the full data set. Calibrating using data only from the spring bloom led to good agreement between the included observational data and the model results, but this came at the expense of not being generally representative of the annual cycle. These conclusions highlight the importance of matching the included data to the desired purpose of the optimized model and of frequent – or at least even – coverage of the desired dynamics.

In this section, we briefly discuss the computational cost of the parameter estimations presented in this study. All calibration studies are performed using computational resources on the Cheyenne supercomputer sponsored by the National Center for Atmospheric Research. The system features dual-socket nodes with 36 Intel Xeon processor cores each. The BFM17

The run time for a single model evaluation is approximately 5 min on a single core. DAKOTA provides the capability to perform multiple model evaluations in parallel. The total CPU time remains significant, but using supercomputing resources allows for drastic reductions in wall time. Table
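In the actual study, DAKOTA manages the parallel evaluations. As a minimal stand-in, the dispatch pattern can be sketched with Python's standard library; since each evaluation launches an independent external executable (here represented by a hypothetical `run_model` callable), a thread pool handing off to subprocesses is sufficient:

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_batch(run_model, parameter_sets, max_workers=4):
    """Dispatch independent model evaluations concurrently. Each sampled
    parameter set corresponds to an independent simulation, so the global
    search stage parallelizes trivially; results preserve input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_model, parameter_sets))
```

With evaluations of roughly 5 min each, the wall time for the sampling stage scales inversely with the number of concurrent workers, which is what makes the 25 000-sample global search tractable on a supercomputer.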

The computational resources used to perform each parameter calibration study presented in Sect.

Results of the initial sampling and the optimization for each parameter estimation study: BATS

To understand the relative improvement achieved by each stage of the multi-step calibration methodology, the results of the initial samplings and the optimization runs are included in Fig.

For the BATS calibration case, the total normalized error of the best parameter set from the 25 000 sampled cases is 31 % lower than that of the baseline. The subsequent local optimization runs yield a 77 % reduction in error overall. A gradient-based optimization initialized directly from the baseline parameter values achieves a 60 % reduction in error, which is substantial but less than what the full method obtains.

For the HOTS site, the random sampling alone improves the results by 91 %. This is in part due to how poorly the baseline model performs at the HOTS site, since all previous work went into making the model representative of the BATS site specifically. This is also reflected in the number of sampled parameter sets that outperform the baseline: for HOTS, 2624 of the randomly sampled parameter sets are better than the baseline simulation, while for BATS only 188 are. The optimization runs further improve agreement by 71 % relative to the best sampled case. A gradient-based optimization initialized from the baseline parameter set would reduce the accumulated normalized error to 8.94, which is higher than the 4.66 we ultimately achieve by applying both stages of the calibration methodology.

Similar to the HOTS case, the multi-site calibration achieves significant improvement through the random sampling alone. The sampling produces 2335 random parameter sets that outperform the baseline parameter set, and the normalized error of the best-performing randomly sampled case is 85 % lower than that of the baseline case. This improvement is still driven mainly by the excessive error in the HOTS fields. The subsequent optimization runs further decrease the error for an overall improvement of 95 %, with the improvement relative to the best sampled case being 70 %. The overall improvement of 95 % exceeds the 91 % that would be achieved by optimizing directly from the baseline model parameters.

In all cases, using the 20 best cases from the randomly sampled parameter sets to initialize gradient-based optimizations resulted in better agreement than simply optimizing from the baseline case. The proposed methodology was developed in response to the constraints of using local, but efficient, optimization methods, and in particular the possibility of becoming trapped in local minima in arbitrary regions of the parameter space. The fact that the optimization initialized from the baseline parameters achieves comparable objective function values, even outperforming some optimizations initialized from the sampled parameter sets, emphasizes that we cannot completely rule out the possibility that a parameter set with a higher initial objective value, located in a different region of the parameter space, could produce better agreement with the observational data once optimized than the 20 best sampled cases did. In this work, we assumed that selecting the sampled cases with the lowest error is the best way to identify regions of relatively low error. Considering these results, it is worth determining in future work how this methodology could be modified to use a different criterion for selecting the parameter sets that initialize the optimization runs. For example, instead of relying only on the error-based objective function, one could incorporate some measure of distance in parameter space into the selection criterion to ensure reasonable coverage. The major challenge will be ensuring sufficient coverage of such a high-dimensional parameter space; this challenge is what motivated the random sampling in the current methodology.

The zooplankton LFG is treated as carnivorous, and, consequently, the sole source of growth for the LFG is its predation on phytoplankton. As carbon, nitrogen, and phosphorus are lost by phytoplankton to predation, all three constituent pools increase for zooplankton. Zooplankton are living organisms, so there are carbon losses resulting from respiration as part of the organisms' metabolic activity. The zooplankton losses resulting from egestion, excretion, and mortality are parameterized as releases to the dissolved and particulate organic matter pools for all three constituent components. Nitrogen is also released to ammonia, while phosphorus is released to phosphate.

As noted, the non-living dissolved organic matter increases from phytoplankton losses due to lysis and releases from zooplankton. The dissolved organic carbon also increases from phytoplankton exudation. Dissolved organic nitrogen can be lost as a result of phytoplankton uptake of nitrate and ammonium. Similarly, dissolved organic phosphorus can be lost as a result of phytoplankton uptake of phosphate. Non-living particulate organic matter has a more uniform behavior across the three constituent components: in all cases, the particulate matter results from the lysis of phytoplankton and the release of organic matter from zooplankton.

Instead of non-living organic matter being recycled back to the inorganic nutrient pools through a bacterial loop, BFM17 uses a constant remineralization rate closure. Matter is cycled directly back to the inorganic nutrient pools based on the product of a constant rate and the non-living organic matter concentration. Carbon is also remineralized back to carbon dioxide, but since this inorganic dissolved gas acts only as a sink, it is not tracked in this model implementation.
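The constant-rate closure amounts to a linear exchange between pools. In the sketch below, the rate constant and the forward-Euler update are illustrative placeholders, not BFM17's actual values or time integration scheme:

```python
def remineralize(organic_conc, nutrient_conc, rate, dt):
    """Constant-rate remineralization closure: a fixed fraction of the
    non-living organic matter pool is returned to the inorganic nutrient
    pool each time step, with flux = rate * concentration."""
    flux = rate * organic_conc      # remineralization flux (per unit time)
    organic_conc -= flux * dt       # organic pool loses matter
    nutrient_conc += flux * dt      # inorganic nutrient pool gains it
    return organic_conc, nutrient_conc
```

The update conserves matter between the two pools exactly, which mirrors the closed nitrogen and phosphorus cycling in the model (carbon being the exception, since the remineralized carbon dioxide is not tracked).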

Oxygen is the only dissolved gas that BFM17 explicitly tracks. Oxygen is introduced into the system via aeration of the surface water resulting from wind forcing calculated with observational data. The production of oxygen by phytoplankton during photosynthesis is the only biological source of oxygen. Oxygen is consumed during phytoplankton and zooplankton respiration as well as the recycling of non-living dissolved and particulate organic carbon to carbon dioxide. Oxygen is also lost to nitrification, a process that converts ammonium to nitrate.

Phosphate, nitrate, and ammonia are consumed by phytoplankton. Phosphate and ammonia are replenished through the release of phosphorus and nitrogen, respectively, by zooplankton. Phosphate and ammonia also receive matter from the remineralization of dissolved and particulate organic matter. Remineralization returns nitrogen only to the ammonium pool, from which the nitrate pool is replenished via nitrification. During nitrification, nitrogen from ammonia is combined with oxygen.

List of BFM17 parameters controlling the marine ecosystem dynamics in the model.

List of estimated BFM17 parameters controlling the marine ecosystem dynamics in the model.

All codes and data necessary to reproduce the results in this paper have been archived on Zenodo. The parameter estimation is performed by coupling the BFM17

SK, KMS, NP, and PEH developed the optimization-based parameter estimation method; MEM developed the HOTS test case; KEN provided input on the numerical optimization approach; NLS and NP provided guidance on data sources and the physical interpretation of results; SK performed all parameter estimations and produced all results presented in the paper; SK and PEH prepared the initial draft of the paper; and all authors edited the paper to produce the final version.

The contact author has declared that none of the authors has any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

This material is based upon work supported by the NSF under grants OCE-1924636 and OCE-1924658. The authors would also like to acknowledge high-performance computing support from Cheyenne (

Skyler Kern was supported by the ANSEP Alaska Grown Fellowship and by the National Science Foundation (NSF Graduate Research Fellowship Program award). This research has been supported by the National Science Foundation (grant nos. OCE-1924636, OCE-1924658, and NSF 18-573).

This paper was edited by Riccardo Farneti and reviewed by two anonymous referees.