When working with Earth system models, a considerable challenge is to establish the set of parameter values that yields the best model performance with respect to real-world observations. Given that each additional parameter under investigation increases the dimensionality of the problem by one, simple brute-force sensitivity tests quickly become computationally prohibitive. In addition, the complexity of the model and the interactions between parameters mean that testing them individually risks missing key information. In this work, we address these challenges by developing a biased random key genetic algorithm (BRKGA) able to estimate model parameters. This method is tested using the one-dimensional configuration of PISCES-v2_RC, the biogeochemical component of NEMO4 v4.0.1 (Nucleus for European Modelling of the Ocean version 4), a global ocean model. A test case of particulate organic carbon (POC) in the North Atlantic down to 1000 m depth is examined, using observed data obtained from autonomous biogeochemical Argo floats. Two sets of tests are run: one in which each model output is compared to the model output with default settings and another in which each is compared with three sets of observed data from their respective regions, followed by a cross-reference of the results. These analyses provide evidence that the approach is robust and consistent and that it indicates the sensitivity of the variables of interest to the parameters. Given the deviation of the optimal set of parameters from the default, further analyses using observed data at other locations are recommended to establish the validity of the results obtained.
The field of Earth science has garnered much interest in recent years due to anthropogenic-driven climate change and the increasing urgency to implement policies and technologies to mitigate its effects. As a result, Earth system models (ESMs) have become a fundamental tool for studying the impact of shifting climate dynamics and global biogeochemical cycles
The tool presented here can be applied to any ESM component, although this work focuses on ocean biogeochemistry because of the many unconstrained parameters that are usually needed to numerically represent this realm of the Earth system. In particular, we focus on key biogeochemical processes that contribute to the oceans' capacity to absorb carbon dioxide from the atmosphere and potentially store it. These processes, usually referred to as the biological carbon pump, are dominated by the vertical transport of organic matter from the surface of the ocean to deeper layers
Ocean biogeochemistry models (OBGCMs) simplify the complexity of the real world by representing biological processes with empirical functions
In the effort to achieve simple yet universally applicable models, parameter optimisation (PO) techniques are a key tool, as they provide an objective means to find a model parameter set that produces outputs that match well with observed datasets. However, PO (often referred to as tuning) has traditionally been a rather subjective process, in that the model developers choose the best parameter sets from a somewhat comprehensive array of alternative model runs. Such subjective optimisation often relied on sensitivity analyses, whereby the variations in model output variables, and their skill, were quantified by perturbing one parameter at a time. Given the high computational cost of 3D OBGCM simulations, subjective criteria are still widely used to optimise OBGCMs. A promising alternative is to perform PO using one-dimensional (1D) model configurations, which deal only with local sources and sinks and vertical fluxes along the water column
Attempting to constrain parameters using optimisation techniques can be difficult in situations of inadequate data or computing power
This paper documents the application of a genetic algorithm to determine an ideal set of parameters that accurately simulate the behaviour of the biogeochemical component (PISCES-v2_RC) of an ocean model. The overall aim of this investigation is to demonstrate that using computational intelligence techniques, a biased random key genetic algorithm (BRKGA) in our case, for parameter estimation in Earth system models is an effective approach and to explore, via a BRKGA, how this can be implemented. We also describe how to implement a BRKGA and how to embed it in a state-of-the-art ocean model using a workflow manager
This section outlines the main methods used in this investigation. A test case of particulate organic carbon (POC) in the North Atlantic down to 1000 m is used. The observed data, explained in detail in Sect.
The type of GA used is BRKGA
This paper outlines two test case experiments in which the reference data are an output of a simulation with default parameters, another three in which the reference data are observed data from three locations in the North Atlantic, and, last, a set of cross-experiments. Section
Our investigation focuses on the vertical profiles of POC in the Labrador Sea region of the North Atlantic subpolar gyre. The observed data were acquired by Argo floats deployed in the context of the international Argo programme
To enable comparison between biogeochemical (BGC)–Argo data and model simulations, we developed a framework that is described in detail in the companion paper by
LAB1 – float 6901527, year 2016;
LAB2 – float 6901527, year 2014; and
LAB3 – float 6901486, year 2015.
Finally, we matched the trajectory of the float on a given year to the NEMO model ORCA1 grid (ca. 1
PISCES-v2
In PISCES, detrital POC is represented by two tracers, i.e. POC for detritus smaller than 100
Our study focuses on nine PISCES parameters (Table
Definitions of the PISCES parameters included in the optimisation experiments, along with their default values, optimisation ranges, and units.
This investigation uses PISCES configured for one spatial dimension (1D) and to run offline
Being one-dimensional, the model requires only one computational core and runs at roughly 1 simulated year per minute on a supercomputer, which allows multiple simulations to be run in parallel. The numerical parameters to be constrained are stored in text files called namelists and can easily be modified prior to each simulation without requiring recompilation. In the experiments (Sect.
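The pre-simulation namelist edit can be sketched as follows. This is a minimal Python illustration, not the paper's actual tooling: the helper function, the namelist group name, and the file contents are simplified assumptions (the parameter names echo those discussed later in the text).

```python
import re

def set_namelist_param(text: str, name: str, value: float) -> str:
    """Replace `name = <value>` in a Fortran-style namelist string."""
    pattern = re.compile(
        rf"^(\s*{re.escape(name)}\s*=\s*)\S+",
        re.IGNORECASE | re.MULTILINE,
    )
    new_text, n = pattern.subn(rf"\g<1>{value:g}", text)
    if n == 0:
        raise KeyError(f"parameter {name!r} not found in namelist")
    return new_text

# Hypothetical namelist fragment (group name and values are illustrative)
namelist = """&nampisprod
   wsbio   = 2.0      ! POC sinking speed (m/d)
   grazflux = 3000.0  ! flux-feeding scaling
/"""
print(set_namelist_param(namelist, "wsbio", 0.7))
```

Because only a text file changes between runs, each candidate parameter vector can be written into a fresh namelist copy without touching the compiled model.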
A GA is a type of evolutionary algorithm used for optimisation that, in general, is analogous to natural selection in the sense that a population of
Another feature inspired by genetics is the concept of mutations. The purpose of mutations is to make the algorithm more exploratory by randomly changing or perturbing parts of individual members or by adding randomly generated individuals to the population. This is usually done with a very small probability, emulating the transcription errors that occur in natural gene transmission.
Once the crossovers are completed and the new generation is created, its fitness is again measured, and the process is repeated until a stopping condition is met. This condition can be the cost function of the fittest member reaching a certain value, no change being observed over a certain number of generations, or simply a predetermined number of generations being completed.
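The generational loop just described can be sketched as a minimal, generic GA. This is an illustrative Python sketch only: the population size, mutation probability, selection scheme, and toy cost function are assumptions, not the settings used in this work.

```python
import random

def genetic_algorithm(cost, n_params, pop_size=20, generations=50,
                      mutation_prob=0.05, seed=0):
    """Minimal GA: rank by cost, keep the best (elitism), crossover, mutate."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):            # fixed-generation stopping rule
        pop.sort(key=cost)                  # lower cost = fitter
        parents = pop[: pop_size // 2]      # selection: keep the fitter half
        children = [pop[0]]                 # elitism: best member survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # uniform crossover: each gene from either parent
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            if rng.random() < mutation_prob:          # rare random mutation
                child[rng.randrange(n_params)] = rng.random()
            children.append(child)
        pop = children
    return min(pop, key=cost)

# Toy cost function: distance of every gene from 0.3
best = genetic_algorithm(lambda v: sum((x - 0.3) ** 2 for x in v), n_params=3)
```

With elitism, the best cost is non-increasing across generations, which is why the fixed-generation stopping rule is safe to use.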
A BRKGA is a particular type of GA in which each individual is encoded as a vector of random keys (floating-point numbers in [0, 1]) rather than as a bitstring, which is typical of traditional GAs
A visualisation of the BRKGA's process from one generation to another
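Two BRKGA-specific ingredients, the biased crossover (each random key is inherited from the elite parent with a fixed probability greater than 0.5) and the decoding of random keys into physical parameter values, can be sketched as follows. The bias probability and the parameter bounds below are illustrative assumptions, not the values used in the experiments.

```python
import random

rng = random.Random(42)
RHO = 0.7   # assumed probability of inheriting each key from the elite parent

def biased_crossover(elite, non_elite, rho=RHO):
    """Biased crossover: each random key comes from the elite parent
    with probability rho, biasing offspring toward good solutions."""
    return [e if rng.random() < rho else n for e, n in zip(elite, non_elite)]

def decode(keys, bounds):
    """Decoder: map random keys in [0, 1] onto physical parameter ranges."""
    return [lo + k * (hi - lo) for k, (lo, hi) in zip(keys, bounds)]

# Hypothetical optimisation ranges (low, high), one per parameter
bounds = [(0.0, 50.0), (0.0, 1.0), (100.0, 5000.0)]
child = biased_crossover([0.2, 0.5, 0.9], [0.8, 0.1, 0.3])
params = decode(child, bounds)
```

Keeping the search in the unit hypercube and decoding only at evaluation time is what lets the crossover remain problem-independent: any child of valid keys is itself a valid set of keys.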
Deciding on an ideal cost function to measure the misfit between the results of each simulation and the observed data requires a number of considerations. In this case, the limitations of the model itself and the particular properties of the data need to be taken into account. An important model limitation is that there exist inherent physical biases and, in some cases, uncertainties in the conversion factor between the model variable and its observed counterpart. In addition, we wish to compare trends, in particular the seasonality of the data. For this, simply calculating the difference between observed data and simulated outputs, or bias, is not sufficient.
To ensure sensible fitting, the correlation and the normalised standard deviation need to be considered in addition to the bias. The root mean square error (RMSE) is a widely used metric in this type of investigation; however, in certain cases, it has been found to reward reductions in model variability, for example, over the seasonal cycle
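The component statistics discussed above (bias, normalised standard deviation, correlation, and RMSE) can be computed as in the sketch below. The exact way these statistics are combined into the ST score used in this work is not reproduced here; this is only a minimal illustration of the individual metrics.

```python
import math

def skill_stats(model, obs):
    """Bias, normalised std dev, Pearson correlation, and RMSE of model vs obs."""
    n = len(model)
    mm, mo = sum(model) / n, sum(obs) / n
    bias = mm - mo
    var_m = sum((x - mm) ** 2 for x in model) / n
    var_o = sum((y - mo) ** 2 for y in obs) / n
    cov = sum((x - mm) * (y - mo) for x, y in zip(model, obs)) / n
    corr = cov / math.sqrt(var_m * var_o)       # pattern agreement
    nsd = math.sqrt(var_m / var_o)              # normalised standard deviation
    rmse = math.sqrt(sum((x - y) ** 2 for x, y in zip(model, obs)) / n)
    return bias, nsd, corr, rmse

b, nsd, r, rmse = skill_stats([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

A normalised standard deviation near 1 indicates that the model reproduces the observed variability, which guards against the tendency of RMSE alone to favour overly smooth solutions.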
Running a BRKGA requires performing a number of iterations until a termination condition is achieved. This does not represent a technical challenge if the fitness function can be calculated directly from the generation members. However, in some cases, such as the one presented in this work, an external model is responsible for calculating the result that will be the input to the cost function. As a consequence, the need for the parallel execution and management of many different and interdependent tasks requires using tools called workflow managers or metaschedulers, which are commonly used to run ensemble experiments with climate models. Here we use a state-of-the-art workflow manager called Autosubmit
Autosubmit experiments are hierarchically composed of start dates, members, and chunks. A single experiment can run different start dates, each of which can be divided into members, with each member containing an individual simulation. This feature was added to facilitate ensemble forecasts. In addition, each member is usually divided into sequential chunks in order to save checkpoints of the model state at regular intervals. With these features, Autosubmit can run multiple members in parallel and is therefore suitable for running a GA with different individuals in the same generation. This allows the size of the experiment to be adjusted easily, so that different population sizes and numbers of generations can be tested. Using Autosubmit to run multiple instances of a computational model within a BRKGA is a novel approach. One shortcoming of this method, however, is that the workflow size is static, and there is no feature to terminate the experiment once a given condition is met. Consequently, the only viable stopping condition for the BRKGA is a predetermined number of generations; otherwise, the experiment could have been stopped when no evolution is observed over a certain number of generations.
Our particular workflow consists of three types of job. The first is the initialisation of the experiment, which is run only once, at the very beginning. The second is the simulation, which is run once per individual, in parallel, in each generation. The third is the postprocessing, which includes the crossover and is run once per generation. An example of a workflow for a toy experiment with a population of four and four generations is shown in Fig.
An example of the Autosubmit workflow.
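The dependency structure of such a workflow can be sketched as a simple job graph. This is a conceptual Python illustration of the three job types and their ordering, not Autosubmit's actual configuration format.

```python
def build_workflow(n_generations, pop_size):
    """Build a {job: [dependencies]} graph mirroring the three job types:
    one init job, pop_size parallel simulations per generation, and one
    postprocessing job that gates the next generation."""
    deps = {"init": []}
    prev = "init"
    for g in range(1, n_generations + 1):
        sims = [f"sim_g{g}_m{i}" for i in range(1, pop_size + 1)]
        for s in sims:
            deps[s] = [prev]            # every sim waits on the previous stage
        deps[f"post_g{g}"] = sims       # crossover waits for all sims
        prev = f"post_g{g}"
    return deps

wf = build_workflow(n_generations=4, pop_size=4)
```

Because the graph is built up front, its size is fixed before submission, which is exactly why the stopping condition must be a predetermined number of generations.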
The initialisation script starts by setting up the directory in which the simulations are run by copying the executable of the model and the necessary input files into it. Included within the initialisation is a simulation run with a vector of the default parameters, and certain statistical measurements between its output and the observed data are taken that are necessary for postprocessing and calculation of the cost function. Finally, the script generates the initial set of vectors at random.
The second script, which runs
The final script runs once per generation after all simulations of the respective generation are completed. First, it reads the cost function statistics calculated after each simulation and uses them to calculate the ST score of SPOC and LPOC. It then ranks each of the simulations according to the sum of the two ST scores. Then it performs the crossover, as described in Sect.
To investigate the potential of the BRKGA, different sets of experiments are run. Each set contains five experiments (to test consistency and robustness) with distinct, randomly generated initial populations, with 100 individual simulations per generation over 100 generations. Their details are summarised in Table
A summary of the experiments run using the workflow.
Initially, we determine the capabilities of the BRKGA by testing how well it can find a known set of parameters. To do this, experiment sets D9 and D5 are run using the output of a simulation with the default parameters being the reference data at location LAB1. In set D9, nine parameters are tested to check which ones can be constrained from SPOC and LPOC data. This leads us to select five parameters, which are tested in set D5 and, additionally, give us an indication of how the method behaves when different sizes of vectors of parameters are used.
Experiment set O5_LAB1 uses the BRKGA as intended, where the reference data are observed data from LAB1, and the outputs are analysed. This is further compared with experiment sets O5_LAB2 and O5_LAB3, which are run in LAB2 and LAB3, respectively. This is to investigate how the results obtained reflect the wider region.
Finally, cross-simulations are run, whereby a representative vector of parameters from each of the experiment sets O5_LAB1, O5_LAB2, and O5_LAB3 is selected to run a single simulation at each of the other two locations. This further checks how robust the BRKGA is and whether the vectors produced are representative of the region. In fact, a certain homogeneity is expected across the three locations because of their similar physical and biogeochemical properties. The BRKGA failing to capture this homogeneity would suggest that the tool is compensating for other errors in its attempt to minimise the cost function, resulting in an overfitting of the optimal vector of parameters.
The evolution of the optimal sets of parameters in experiment set D9 is presented in Fig.
The results of experiment D9, plus additional analyses that we report in Appendix A, provided the criteria to select the five parameters that were used in subsequent PO experiments. Quite obviously, the
Figures
Evolution of each generation's optimal set of parameters in experiment set D9.
Evolution of each generation's lowest ST score for experiment set D9.
Evolution of each optimal generation's bias, normalised standard deviation, correlation, and RMSE of experiment set D9.
The following plots analyse the results of experiment set D5. The evolution of the optimal vector of parameters from each generation is presented in Fig.
Experiment set D5 was additionally compared with a similarly structured experiment set that used a random search algorithm to verify the better efficacy of the BRKGA. The results of this comparison are in Appendix C.
Evolution of each generation's optimal set of parameters for experiment set D5.
Evolution of each generation's lowest ST score in experiment set D5.
Evolution of each generation's optimal bias, normalised standard deviation, correlation, and RMSE in experiment set D5.
Figure
Evolution of each generation's optimal set of parameters for experiment set O5_LAB1.
Evolution of each generation's lowest ST score of experiment set O5_LAB1. The label "default" refers to the cost function of the default simulation.
Evolution of each generation's optimal bias, normalised standard deviation, correlation, and RMSE of experiment set O5_LAB1.
The top row shows data plots of SPOC in log scale, for (left–right) observed data, the default parameter set's model output, and the optimised parameter set's model output. The bottom row shows the biases between the model outputs and observed data for the default parameter set (left) and optimised parameter set (right). Mean biases of the default and the optimised parameter sets are shown in Fig.
The top row shows data plots of LPOC in log scale, for (left–right) observed data, the default parameter set's model output, and the optimised parameter set's model output. The bottom row shows the biases between the model outputs and observed data for the default parameter set (left) and optimised parameter set (right). Mean biases of the default and the optimised parameter sets are shown in Fig.
The experiments producing the median cost function for each set of O5_LAB1, O5_LAB2, and O5_LAB3 are presented in Table
The final parameter sets of three genetic algorithm experiments run in three locations, along with the default.
Comparison of SPOC absolute bias and correlation of 12 single simulations run by crossing the four parameter sets (the default and three optimised sets produced by the BRKGA at three locations) with three locations. Values in italics mark the diagonals with equal locations and parameter sets.
Comparison of LPOC absolute bias and correlation of 12 single simulations run by crossing the four parameter sets (the default and three optimised sets produced by the BRKGA at three locations) with three locations. Values in italics mark the diagonals with equal locations and parameter sets.
A set of experiments was designed to test the potential of a newly developed BRKGA. As a validation, the BRKGA was first tested against the output of a simulation produced with known default parameter settings. For the first set of experiments, we chose nine parameters, expressed as a vector, ensuring a broad selection. This guided our selection of the parameters that could be constrained with confidence from the evaluated variables (in this case, SPOC and LPOC). In addition, in this set (and all others in the paper), five identical experiments were run at a time, and all results were similar to each other; this indicates that the method behaves consistently and reliably. The next set of experiments was identical to the previous set, except that only five parameters, selected from the initial nine-parameter vector, were used. This set of experiments produced results that were closer to the result of the default parameter vector with less computation, which leads us to believe that the size of the experiment required depends on the size of the parameter vector. One of the main contributions of this work is the use of a state-of-the-art ocean model as a prior step to the calculation of the fitness function, with all the complexity that this option entails. This is only possible because of the aforementioned availability of computing power, and it is also greatly facilitated by the use of advanced scientific workflow solutions that allow the integration of the model executions in the evolutionary workflow
After the experiments against default data, the BRKGA was then tested by using observed data from ocean floats in the North Atlantic as the reference data. A set of five experiments was run for each BGC–Argo float annual time series, using the same settings as in the previous set. From Fig.
Another concern that arises from the results is the need to carefully evaluate the behaviour of the cost function. This is well illustrated by Figs.
Further work quantifying the effectiveness of the cost functions across different situations would probably improve the efficacy of the BRKGA. Yet, it must be highlighted that the test case chosen to evaluate the BRKGA is an exigent one because model skill was already very good with the default parameters, even though PISCES was not originally tuned to fit these particular observations. Ongoing work with a different optimisation case indicates that the BRKGA can produce larger and simultaneous improvements in all skill metrics when starting from a state of very poor model performance, which is, in this case, the seasonal cycle of sea surface chlorophyll
As a further test of our approach, two more sets of experiments were carried out in different locations in the Labrador Sea, resulting in a reasonable consistency of the optimal parameter set across the region. To confirm this, the optimal parameter sets for the three locations were cross-referenced by using each parameter set in each of the other two locations in single simulations. Results from this cross-testing suggest that the parameters produced have the potential to be representative of the region or even exchangeable among multiple locations (Table
The optimisation of PISCES parameters against BGC–Argo presented in our study illustrates how PO can help us understand a dynamical system better. Here we will briefly discuss the lessons learnt from the O5 experiments, while keeping in mind that a detailed review of PISCES parameter values and their biogeochemical implications is beyond the scope of this paper. It is also noteworthy that the interpretation provided here draws only from the analysis of the best-performing parameter set in each BRKGA experiment. Full exploitation of the results, with thousands of alternative model realisations, could yield further insights on how parameters interact in a space constrained by optimal model performance.
In the three O5 experiments, wsbio converged to values between 0 and 1 m d
Interpretation of the evolution of the grazflux parameter is more complex. The flux-feeding rate depends on the product of mesozooplankton biomass, grazflux, and particle sinking speed. In PISCES, a fraction of the intercepted
The selection of a subset of model parameters is a common limitation of PO experiments, and although we based it on objective criteria, we acknowledge that it remains somewhat arbitrary. The stepwise reduction in the number of parameters from nine to five reflects the need to assess the GA performance with a varying number of parameters and also to reduce the degrees of freedom, given that only two variables were used as reference observations. Among the excluded parameters, wsbio2 certainly deserves examination in future experiments, given its primary control on the fate of
Mesopelagic POC dynamics provide a relevant optimisation case because of their role in oceanic carbon sequestration
The GA developed shows potential in effectively constraining the parameters of the NEMO–PISCES ocean biogeochemistry model in a way that can be extended to similar models. Our GA is embedded in the workflow manager Autosubmit, which seamlessly handles thousands of individual simulations alongside the GA calculations. This key feature makes the process of objective parameter optimisation automatic, reproducible, and portable across different high-performance computing platforms.
We proposed an experimental protocol that consists of two main phases. First, the optimisation runs against the output of the default model, whose parameter values are known beforehand, to identify the parameters that can be effectively constrained when the evaluation data can be perfectly matched by the model. Second, the subset of selected parameters is optimised against the actual observations. This protocol increases the efficiency and robustness of the optimisation by reducing the parameter space.
Based on the experience acquired through the development of this tool, we make the following three main recommendations that can maximise the efficacy of the GA for a given research problem:
It may be necessary to adjust the GA metaparameters to optimise the balance between convergence speed and parameter space exploration. The cost function has to be selected, keeping in mind the tradeoffs between bias, dispersion, and pattern (correlation) statistics, and a single formula is unlikely to serve all purposes equally well. Realistic parameter bounds are key to ensure that the results produced are sensible from a scientific point of view, and the optimisation results have to be critically evaluated a posteriori.
The use of POC estimates from BGC–Argo floats for the optimisation of biogeochemical parameters is a novel approach, as previous studies generally used target variables such as chlorophyll
This Appendix shows the rates of detrital POC production (sources) and consumption (sinks), as represented in PISCES equations, with default parameters, over the annual cycle in the Labrador Sea between the surface and 1000 m depth. The magnitude of the
Sources (red) and sinks (blue) of the PISCES tracer
Sources (red) and sinks (blue) of the PISCES tracer
This Appendix shows the sensitivity of the
Combined sensitivity of the PISCES
Combined sensitivity of the PISCES
Combined sensitivity of the PISCES
To evaluate the effectiveness of the BRKGA, we compared the results of experiment set D5 to those of a random search algorithm, D5_rand, which is identical to D5 except that every parameter set in every generation is generated at random, with the exception of the most elite one from the previous generation. To compare the two sets of experiments, the experiment with the median cost function is considered in both cases, and the absolute differences between the final parameters and the default ones are calculated (Table
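The random search baseline described here can be sketched as follows. This is an illustrative Python sketch: the population size, number of generations, and toy cost function are assumptions, not the settings of D5_rand.

```python
import random

def random_search(cost, n_params, pop_size=20, generations=50, seed=0):
    """Baseline: every individual is re-drawn at random each generation,
    except the single best (elite) one, which is carried over unchanged."""
    rng = random.Random(seed)
    best = [rng.random() for _ in range(n_params)]
    for _ in range(generations):
        candidates = [[rng.random() for _ in range(n_params)]
                      for _ in range(pop_size - 1)] + [best]
        best = min(candidates, key=cost)    # elite survives if still best
    return best

best = random_search(lambda v: sum((x - 0.5) ** 2 for x in v), n_params=3)
```

Carrying over the elite makes the baseline's best cost non-increasing, so any remaining advantage of the BRKGA can be attributed to crossover rather than to elitism alone.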
Comparison of the meta-analyses of experiment sets D5 and D5_rand. The top row, the median, compares the statistics of the experiments with the median cost function in each case. The bottom row, the SD, is the standard deviation of all experiments of each statistic in each case.
Absolute differences between the final parameter set of the median experiments of sets D5 and D5_rand and the default parameter set.
The code of NEMO v4.0.1 and PISCES-v2_RC are publicly available at
These data were collected and made freely available by the international Argo programme and the national programmes that contribute to it (
MF wrote the paper, with contributions from all co-authors. MC and MA contributed to the topics of computing and GAs. MG, RB, and JL contributed to the topics of ocean biogeochemistry. MF developed the code of the workflow of the GA (including the implementation of the GA itself and the configuration of the workflow manager) and developed the code to produce the figures and the data in the tables. MC developed the configuration of NEMO-PISCES for the MareNostrum 4 HPC platform and provided guidance to maximise the workflow efficiency. MG configured the PISCES 1D offline simulations and processed the observed data from the BGC–Argo floats. MG and RB conceived the study.
The contact author has declared that none of the authors has any competing interests.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The simulations analysed in the paper were performed using the internal computing resources available at the Barcelona Supercomputing Center. The authors acknowledge the support of Pierre-Antoine Bretonnière and Margarida Samsó, for downloading and storing the Argo data, Daniel Beltran and Wilmer Uruchi, for their technical support with Autosubmit, Hervé Claustre, for guidance on BGC–Argo data processing, Olivier Aumont, for guidance on PISCES-v2_RC structure and parameters, Xavier Yepes, for guidance on the BRKGA, and Thomas Arsouze, for technical support with the PISCES 1D configuration. Martí Galí has received financial support through the Postdoctoral Junior Leader Fellowship Programme from the La Caixa Banking Foundation (ORCAS project; grant no. LCF/BQ/PI18/11630009) and through the OPERA project funded by the Ministerio de Ciencia, Innovación y Universidades (grant no. PID2019-107952GA-I00). Raffaele Bernardello acknowledges support from the Ministerio de Ciencia, Innovación y Universidades, as part of the DeCUSO project (grant no. CGL2017-84493-R). The authors thank Urmas Raudsepp and one anonymous referee, for their constructive comments that improved the original work.
This research has been supported by the Fundación Bancaria Caixa d'Estalvis i Pensions de Barcelona (grant no. LCF/BQ/PI18/11630009) and the Ministerio de Ciencia e Innovación (grant nos. PID2019-107952GA-I00 and CGL2017-84493-R).
This paper was edited by Olivier Marti and reviewed by Urmas Raudsepp and one anonymous referee.