Optimality-Based Non-Redfield Plankton-Ecosystem Model (OPEM v1.0) in the UVic-ESCM 2.9. Part II: Sensitivity Analysis and Model Calibration

We analyse 400 perturbed-parameter simulations for two configurations of an optimality-based plankton-ecosystem model (OPEM), implemented in the University of Victoria Earth-System Climate Model (UVic-ESCM), using a Latin-Hypercube sampling method for setting up the parameter ensemble. A likelihood-based metric is introduced for model assessment and selection of the model solutions closest to observed distributions of NO3 – , PO4 – , O2, and surface chlorophyll a concentrations. 5 According to our metric the optimal model solutions comprise low rates of global N2 fixation and denitrification. These two rate estimates turned out to be poorly constrained by the data. For identifying the “best” model solutions we therefore also consider the model’s ability to represent current estimates of water-column denitrification. We employ our ensemble of model solutions in a sensitivity analysis to gain insights into the importance and role of individual model parameters as well as correlations between various biogeochemical processes and tracers, such as POC export and the NO3 – inventory. Global O2 varies by 10 a factor of two and NO3 – by more than a factor of six among all simulations. Remineralisation rate is the most important parameter for O2, which is also affected by the subsistence N quota of ordinary phytoplankton (Q0, phy) and zooplankton maximum specific ingestion rate. Q0, phy is revealed as a major determinant of the oceanic NO3 – pool. This indicates that unraveling the driving forces of variations in phytoplankton physiology and elemental stoichiometry, which are tightly linked via Q0, phy, is a prerequisite for understanding the marine nitrogen inventory. 15

Abstract. We analyse 400 perturbed-parameter simulations for two configurations of an optimality-based planktonecosystem model (OPEM), implemented in the University of Victoria Earth System Climate Model (UVic-ESCM), using a Latin hypercube sampling method for setting up the parameter ensemble. A likelihood-based metric is introduced for model assessment and selection of the model solutions closest to observed distributions of NO − 3 , PO 3− 4 , O 2 , and surface chlorophyll a concentrations. The simulations closest to the data with respect to our metric exhibit very low rates of global N 2 fixation and denitrification, indicating that in order to achieve rates consistent with independent estimates, additional constraints have to be applied in the calibration process. For identifying the reference parameter sets, we therefore also consider the model's ability to represent current estimates of water-column denitrification. We employ our ensemble of model solutions in a sensitivity analysis to gain insights into the importance and role of individual model parameters as well as correlations between various biogeochemical processes and tracers, such as POC export and the NO − 3 inventory. Global O 2 varies by a factor of 2 and NO − 3 by more than a factor of 6 among all simulations. Remineralisation rate is the most important parameter for O 2 , which is also affected by the subsistence N quota of ordinary phytoplankton (Q N 0,phy ) and zooplankton maximum specific ingestion rate. Q N 0,phy is revealed as a major determinant of the oceanic NO − 1 Introduction Earth system climate models (ESCMs) are powerful tools for analysing variations in climate, while resolving interdependencies between changes in the atmosphere, on land, and in the ocean (Flato, 2011;Prinn, 2013). In this regard, the dynamics of marine ecosystems is a critical link. On long timescales, it regulates atmospheric CO 2 on the basis of biotic uptake of carbon dioxide (CO 2 ) over vast oceanic regions and due to the export of photosynthetically fixed carbon into the deep ocean, which affects the Earth's climate (Reid et al., 2009;Sigman and Boyle, 2000). Planktonecosystem models are widely applied to understand marine biogeochemical cycles, by estimating fluxes of major elements, e.g. nitrogen, phosphorus, and carbon, as well as the sources and sinks of marine oxygen (Maier-Reimer et al., 1995;Six and Maier-Reimer, 1996;Schmittner et al., 2005;Bopp et al., 2013;Vallina et al., 2017;Everett et al., 2017;Ward et al., 2018).
The basic structure of most marine ecosystem models has been designed for resolving mass fluxes between nutrients, phytoplankton, zooplankton, and detritus, typically referred to as NPZD models. Mathematical formulations that describe growth and fate of marine phytoplankton and zooplankton biomass have been successfully applied over a range of scales, from local 0-D ecosystem models (e.g. Fasham et al., 1990;Edwards, 2001) to global 3-D models (Sarmiento et al., 1993;Keller et al., 2012;Nickelsen et al., 2015). However, most of these NPZD models lack a sound mechanistic foundation, preventing them from explicitly accounting for the organisms' regulation of their internal physiological state. For example, N 2 fixation by algae is often diagnosed from 4692 C.-T. Chien et al.: Optimality-based Earth system model -Part 2 the availability of dissolved nutrients, so that it only occurs when the ratio of nitrate to phosphate concentrations falls below the Redfield ratio of 16 : 1 (Deutsch et al., 2007;Ilyina et al., 2013). As these assumptions neglect a number of environmental and ecological controls (e.g. grazing, often also temperature), they do not adequately describe the behaviour of plankton organisms and their sensitivity to changes in their environment. With the introduction of refined mechanistic (physiological) descriptions, we aim here at alleviating this deficiency. In this study, we introduce a new marine ecosystem model coupled to the University of Victoria Earth System Climate Model (UVic-ESCM, based on the configurations of Keller et al., 2012;Getzlaff and Dietze, 2013;Nickelsen et al., 2015). Doing so, we anticipate the model not only to provide improved mass flux estimates but also to exhibit more realistic sensitivities of these fluxes to varying climate conditions, e.g. in simulations of the Last Glacial Maximum or in future projections.
In order to better represent plankton physiology, the new ecosystem model relies on optimality-based considerations for phytoplankton growth, including N 2 fixation , as well as zooplankton behaviour (Pahlow and Prowe, 2010). These two optimality-based models have been shown to be superior to traditional model approaches in reproducing phytoplankton and zooplankton growth and grazing under various environmental conditions (e.g. Fernández-Castro et al., 2016). Our new ecosystem model, the optimality-based planktonecosystem model (OPEM v1.1) coupled to the UVic-ESCM, offers new features and it improves the representation of some biogeochemical properties on the global scale (e.g. net community production (NCP) and particulate C : N : P in the surface water; see Part 1; Pahlow et al., 2020). One of the novel features is the representation of variable quotas of carbon (C), nitrogen (N), and phosphorus (P) in ordinary phytoplankton, diazotrophs, and particulate organic matter (detritus) exported to the deep ocean. This model approach yields mass flux estimates with spatial and temporal variations in the elemental C : N : P stoichiometry of both inorganic nutrients and organic matter as observed in situ (Loh and Bauer, 2000;Martiny et al., 2013b). PELA-GOS (Vichi et al., 2007), the only ocean model with variable C : N : P in phytoplankton in CMIP5 (Bopp et al., 2013) and CMIP6 (Arora et al., 2020), has no diazotrophs; others either have only variable N : P (TOPAZ2, Dunne et al., 2013) or variable C : P (MARBL; Danabasoglu et al., 2020). While some of the existing models have a variable C : N : P based on the optimality-based model for phytoplankton growth (Kwiatkowski et al., 2018(Kwiatkowski et al., , 2019, optimality-based N 2 fixation or zooplankton behaviour are not included. Here, we analyse the new model's performance and evaluate model-ensemble results against observations. Since the model is based on plankton-organism physiology, it includes new parameters whose values have not been estimated for global model applications. Also, we set up two configura-tions, OPEM and OPEM-H, with different temperature dependencies for diazotrophs, to investigate the effects of different empirical temperature functions on distributions of diazotrophs and N 2 fixation. Our analysis relies on ensembles of solutions of the two different model configurations, where every single simulation within each ensemble is subject to a different combination of parameter values. The ensembles allow for assessing the sensitivity of biogeochemical tracer distributions and budgets to variations of the model's parameters. We introduce a likelihood-based metric that quantifies the global misfit between model results and observations. Amongst the ensemble simulations, we regard those model solutions as the best that yield low misfits according to the metric and are also close to current estimates of watercolumn denitrification. The specific objectives of the present paper are (1) to identify and compare those model solutions that correspond to the best representation of observed tracer concentrations and (2) to specify the sensitivity of simulations to variations of the model's parameter values. We make inferences about the model's overall behaviour, especially focusing on data constraints, limitations, and advantages of resolving variable C : N : P stoichiometry for estimations of global net primary production (NPP), NCP, biogenic C export, and the global O 2 , N, and C inventories. OPEM has been implemented into the UVic-ESCM (Weaver et al., 2001;Eby et al., 2013), version 2.9, in the configuration of Nickelsen et al. (2015) with the isopycnal diffusivity modifications by Getzlaff and Dietze (2013), vertically increasing sinking velocity of detritus (Kriest, 2017), and several bug fixes (some of which were already introduced by Kvale et al., 2017). The UVic-ESCM comprises three components including a simple one-layer atmospheric energy-moisture balance model (Weaver et al., 2001), a terrestrial model and a three-dimensional general ocean circulation model. The horizontal resolution of the land and ocean model components is 1.8 • latitude × 3.6 • longitude, and the ocean has 19 vertical levels with a thickness ranging from 50 m in the surface layer to 590 m in the deep ocean. The OPEM and its implementation into the UVic-ESCM are described in Part 1 (Pahlow et al., 2020). Briefly, the major new features of the new model include (1) an optimalitybased model of phytoplankton growth and diazotrophy with variable C : N : P stoichiometry , (2) the optimal current-feeding model for zooplankton (Pahlow and Prowe, 2010), and (3) variable stoichiometry in detritus. The focus on physiology in the construction of the OPEM enables us to study how biogeochemical tracer distributions and fluxes respond to different assumptions about plankton physiology.

Simulation setup
Our setup comprises ensembles of 400 simulations for each of two model configurations that differ in how temperature affects diazotrophy. The original temperature dependence of diazotrophs (f dia (T )) in the UVic-ESCM (and other models, e.g. Aumont et al., 2015), which we also employ for the OPEM configuration, limits both growth and N 2 fixation of diazotrophs to above 15 • C: where T is seawater temperature. In the OPEM-H configuration, the temperature dependence of nitrogenase activity in terrestrial systems by Houlton et al. (2008) is implemented as affecting only N 2 fixation: while growth and nutrient uptake of diazotrophs follow the same temperature dependence as ordinary phytoplankton (see Part 1; Pahlow et al., 2020). Both of these equations are empirical functions directly simulating expected or observed temperature dependencies of N 2 fixation. We consider Eq.
(2) more realistic and hence analyse its effect on model behaviour. However, since the parameters in these two equations have no clearly identifiable physiological meaning, we consider a sensitivity analysis of the parameters in Eqs. (1) and (2) beyond the scope of the present study. Note that some models do not enforce any temperature limitation on nitrogen fixation (e.g. Dunne et al., 2013;Ilyina et al., 2013;Jickells et al., 2017). In the present ocean, waters colder than about 15 • C are generally replete with fixed inorganic nitrogen. For existing parameterisations of N 2 fixation, which are functions of the nitrate deficit with respect to phosphate, there has been little indication of substantial impacts of the formulation of temperature control at low temperatures on the distribution of nitrogen fixation (Somes and Oschlies, 2015;Landolfi et al., 2017). Such differences in formulation may, however, gain importance in environmental conditions different from today's. For all simulations, we impose pre-industrial (AD 1850) boundary conditions with a CO 2 concentration of 284 ppm. The models have been integrated over a period of at least 10 000 years, until they reached steady state.
The 400-parameter combinations are obtained via Latin hypercube sampling (LHS) (McKay et al., 1979). We vary 15 parameters in total, within the variational ranges shown in Table 1, which are based on reference ranges according to literature values. In order to reduce the number of possible parameter combinations, we vary nutrient affinities for macronutrient uptake and half-saturation concentration for iron uptake for ordinary phytoplankton and diazotrophs in constant proportions (A 0 : A 0,D = 4 : 3, K Fe : K Fe,D = 1 : 2), so that diazotrophs have a lower nutrient affinity  and higher Fe half-saturation concentration (Dutkiewicz et al., 2012;McGillicuddy, 2014;Ward et al., 2013) than ordinary phytoplankton. Since our parameter sets are independent of each other, the simulations can be carried out in parallel. Apart from the computational time, the parallel setup with different parameter combinations has some advantages compared to iterative model calibration approaches, e.g. parameter optimisation: (i) individual model simulations do not depend on any other (i.e. previous) combinations of parameter values, (ii) the ensemble results can always be reevaluated with different metrics, perhaps with substantial differences between selected "best" solutions, depending on the error model applied, and (iii) the ensembles provide insight on the sensitivities and thus on uncertainties of particular model results with respect to parameter variations.

Sensitivity analysis
The sensitivity (Sensitivity T ) of a tracer T to a parameter P is defined here as where the indicates the change and the overbar the ensemble mean of P or T . If Sensitivity T < 0, the tracer and the parameter vary in opposite directions. We evaluate the sensitivities of globally and annually averaged NPP, NCP, nitrogen fixation by diazotrophs (N 2 fixation), and the concentrations of oxygen (O 2 ), nitrate (NO − 3 ), dissolved inorganic carbon (DIC), dissolved and particulate iron (DFe and PFe), Chl, ordinary phytoplankton, diazotrophs, particles (combinations of ordinary phytoplankton, diazotrophs, zooplankton, and detritus), and their elemental stoichiometry to the parameters listed in Table 1. We also evaluate the sensitivities of surface particulate elemental ratios (C : N, C : P, and N : P), as well as nitrate to phosphate ratios for different latitude bands (40 • S to 40 • N; 60 • S to 70 • S; and globally). This is because dissolved and particulate elemental ratios in general show very different behaviour between lower and higher latitudes (Martiny et al., 2013a). We keep all 400 simulations because we want to obtain the sensitivity information for the full parameter ranges.

Likelihood-based metric assessing global biogeochemical model results
We consider four different types of observations for quantitatively assessing the model simulations.   Pahlow and Prowe (2010). c Keller et al. (2012). d Somes and Oschlies (2015). e Somes et al. (2017). We define our metric in terms of spatial averages of 17 distinct biogeochemical biomes, as derived and described by Fay and McKinley (2014). The individual biomes are regarded as regions of common biogeochemistry and thus account for spatial differences between ocean regions on the largest possible (global) scale. Using 56 biogeochemical provinces, as defined by Longhurst (2007), might have hampered our data-model comparison, because a higher resolution of individual regions can accentuate spatial pattern errors in tracer concentrations, resulting from model errors in advection and mixing. In our view, the biomes of Fay and McKinley (2014) are coarse enough for avoiding this problem but still sufficiently informative for identifying representative parameter values.
The underlying error model of the likelihood-based metric assumes a Gaussian (normal) distribution, which is well represented by using the first two moments of logtransformed tracer concentrations, in particular for the upper ocean layers (Schartau et al., 2017). For every depth level of the UVic model (k ∈ {1, 2, 3, . . ., 19}), average log 10transformed tracer concentrations (log 10 x) of type x are determined as spatial arithmetic means for our 17 biomes (in-dexed as j in Eq. 4) for the observations and model results: where N j k is the number of available data points within biome j in depth level k. Prior to the log 10 transformation, all tracer concentrations have been normalised to lower detection (uncertainty) thresholds (x (0) ), respectively. Measured or derived concentrations below these thresholds are treated as noise and therefore remain unresolved. Thus, the log 10transformed normalised concentrations are non-negative.
The threshold values are Our metric is derived from a likelihood, assuming a Gaussian error distribution for the residuals, which describes the discrepancy between mean values derived from observations (log 10 x (obs) ) and model simulations (log 10 x (mod) ). Hereafter, we refer to this metric as our cost function (J ). Our cost function is split up into two major parts: where d is the residual vector (see Eq. 8 below), R the covariance matrix (Eq. 9), v (obs) and v (mod) are the spatial variance estimates of the log 10 -transformed observed and modelled tracers, and V −1 are diagonal matrices with the variances (uncertainties) of v (obs) . The first part (J (u) k ) of the cost function resolves seasonal changes between the surface and 550 m depth, corresponding to the upper five depth levels of the model. The second part (J (l) k ) represents the lower depth range below 550 m and does not account for seasonal changes, as only annual mean data are available.
The residual vector (d) (whose components represent the tracer types x) used for J describes the differences between the log 10 -transformed observations and their model counterparts: where i and j are the month and biome indices, respectively. We recall that d has four components only for the UVic model's top layer (k = 1) where chlorophyll data are regarded as well. For k > 1, the residual vector contains three components: O 2 , NO − 3 , and PO 3− 4 . Both parts of the cost function (J (u) k and J (l) k ) in turn contain two terms, one with respect to the residuals, as defined in Eq. (8), and another that accounts for the differences between the spatial variances (vectors v (obs) ij k and v (mod) ij k ) within each biome (and month for J (u) k ) at each depth level. The covariance matrices R ij k account for temporal correlations (C j k ) between different variables (x (obs) ), that are specified for every biome and depth level separately: where the elements of the diagonal matrices S ij k are the standard errors of the mean log 10 -transformed tracer concentrations (log 10 x (obs) ij k ) calculated in Eq. (4) for every month i, biome j , and depth level k. For J (l) k , the R j k contain only the squared standard errors of the annual data as diagonal elements (R j k = S 2 j k ). With the consideration of standard errors instead of standard deviations, we implicitly impose weights to differences in the spatial expansion (i.e. number of data points of the gridded product used) of individual biomes. Overall, the final cost function J resolves spatial differences between regions (biomes) as well as temporal differences for those depth levels where monthly data are available. It is thus a combination of time-varying and spatial information for the assessment of our biogeochemical model results on a global scale. In order to estimate uncertainty ranges for selected model results (globally averaged N 2 fixation, NO − 3 , O 2 , DIC concentrations, NPP, NCP), we apply a bootstrap method to obtain an uncertainty quantification for our simulated values based on the 400 available ensemble model simulations. We collect the best solutions (lowest cost-function value) of 1000 randomly selected subsets of 100 out of our 400 ensemble members. The mean and 95 % confidence interval of these subsets provide an uncertainty range in the vicinity of the value of the full ensemble. Table 2 lists the ranges of selected simulated tracers and processes for the full ensemble of parameter values generated by the Latin hypercube sampling for the OPEM and OPEM-H configurations. Our results exhibit wide ranges of tracer concentrations and fluxes in these two configurations. In particular, globally averaged NO − 3 concentrations range from 10.2 to 66.2 mmol m −3 and integrated N 2 fixation from 0 to 515 Tg N yr −1 . Tracers in OPEM and OPEM-H show similar ranges, except for globally averaged NO − 3 , which ranges from 10.2 to 66.2 mmol m −3 in OPEM and 13.0 to 55.0 mmol m −3 in OPEM-H.

Biogeochemical tracer inventories and governing processes
The sensitivities of globally averaged biogeochemical properties to the variations of each of the 13 parameters in Table 2 4696 C.-T. Chien et al.: Optimality-based Earth system model -Part 2 are comparable for OPEM and OPEM-H (Fig. 1). Global mean oxygen concentration is most sensitive to ν det (remineralisation rate). Higher ν det increases oxygen consumption in shallow water, where oxygen resupply from the atmosphere is stronger. Less oxygen is consumed below the surface ocean; hence, the total oxygen inventory increases. Maximum ingestion rate (g max ) and grazing rate on ordinary phytoplankton (φ phy ) also correlate positively with oxygen. Higher g max or φ phy means more ordinary phytoplankton are grazed and less particles are formed, which then decreases oxygen consumption through remineralisation. Oxygen is less sensitive to φ dia , because the biomass of diazotrophs is much smaller than that of ordinary phytoplankton. A surprising finding is that oxygen is sensitive to, and positively correlated with, the subsistence nitrogen quota of ordinary phytoplankton (Q N 0,phy ). From a classic point of view, oxygen levels in the ocean are dominated by physical supply processes as well as biogeochemical consumption processes such as remineralisation (Feely et al., 2004). Nevertheless, in our simulations, the sensitivity to Q N 0,phy is more than half (58 %) of that to ν det in OPEM and 48 % in OPEM-H (Fig. 1). In our model, Q N 0,phy has no effect on the spatial distribution of cellular C : N ratios in phytoplankton, which is determined by ambient light and nutrient conditions. However, Q N 0,phy affects the average phytoplankton C : N ratio. The average phytoplankton C : N ratio decreases when Q N 0,phy increases, with less carbon being fixed for the same NO − 3 supply. Oxygen consumption (due to remineralisation) per mole of nitrogen thus decreases in consequence. Q N 0,phy in turn affects NO − 3 : a higher Q N 0,phy yields a higher oxygen level and hence less denitrification in oxygen-deficient zones (ODZs) and therefore leads to more NO − 3 . In fact, we identify this as a major process that controls the NO − 3 inventory in our simulations ( Fig. 1). While NO − 3 is also sensitive to other parameters, its sensitivity to Q N 0,phy is more than twice the sensitivity to any other parameter (Fig. 1).
The sensitivity of DIC is generally low, because of the relatively large DIC pool compared to the variations in fluxes among the different parameter sets. Similar to oxygen, DIC is most sensitive to ν det , Q N 0,phy , g max , and φ phy . Faster carbon recycling in the surface layer due to higher ν det generates a higher surface DIC concentration and hence more outgassing, which decreases the DIC inventory. A somewhat lower DIC inventory is also induced by a larger Q N 0,phy , as less carbon is fixed and exported per unit nitrogen in phytoplankton, and by enhanced zooplankton grazing with larger g max .
Dissolved iron (DFe) is most sensitive to the remineralisation rate (ν det ). Unlike NO − 3 , which has dynamic source (N 2 fixation) and sink (denitrification) processes, iron has a fixed source from atmospheric deposition and a sink in the sediment, and the size of the DFe pool is mainly determined by its internal cycle. A higher remineralisation rate prolongs the residence time and thus increases the DFe pool.
The parameter ν det also indirectly affects the internal DFe cycle via its effect on O 2 . While the detritus remineralisation rate drops when O 2 falls below 5 mmol m −3 (Nickelsen et al., 2015), scavenging of DFe stops below the same oxygen threshold. Detritus remineralisation rate dominates variations in DFe when globally averaged O 2 is above 135 mmol m −3 , in which case DFe is positively correlated with ν det and O 2 . When globally averaged O 2 is below 135 mmol m −3 , the widespread ODZs (below 5 mmol m −3 ) inhibit the scavenging of DFe and this effect dominates. As a result, DFe becomes anti-correlated with O 2 . Particulate iron (PFe) is also positively correlated with ν det when globally averaged O 2 is above 135 mmol m −3 , but below that PFe shows no correlation with ν det . When globally averaged O 2 is below 135 mmol m −3 , inhibition of scavenging of DFe in ODZs decreases PFe there but a higher DFe increases PFe elsewhere, because PFe is coupled to DFe through scavenging and remineralisation. As mentioned above, Q N 0,phy controls the average nitrogen quota in phytoplankton and thus in particles. Since PFe is proportional to the amount of nitrogen in particles, Q N 0,phy also affects PFe. This (positive) sensitivity is much stronger than the indirect (negative) effect via DFe leading to opposite sensitivities of DFe and PFe to Q N 0,phy . Other than ν det and Q N 0,phy , PFe is also sensitive to φ dia because dead diazotrophs enter the particulate pool (detritus) and diazotrophs are very sensitive to φ dia (Fig. 2).
The simulated global N 2 -fixation rate is sensitive to many parameters, apart from A 0,phy and Q P 0,dia . Similar relative changes in most parameters introduce changes to the global N 2 -fixation rate that are of similar magnitude. Interestingly, N 2 fixation is sensitive also to zooplankton parameters, indicating that zooplankton grazing on diazotrophs is an important factor controlling not just diazotroph biomass but also N 2 fixation.
Of particular interest are the sensitivities of global NPP and NCP. Particle fluxes in marine biogeochemical models tend to agree most closely with sediment trap data for depths of about 1000 m or below (Kriest et al., 2012). Therefore, different from Table 2, showing NCP for the upper 100 m for comparison with observations and other (reference) model simulations, here we integrate NCP from 0 to 980 m (7th layer of the ocean in the UVic-ESCM), which in steady state is equivalent to POC export flux at 980 m. NPP is sensitive to ν det and Q N 0,phy . A higher ν det causes faster nutrient recycling in surface waters, which increases NPP and reduces particle export and hence NCP. Increasing Q N 0,phy lowers both NPP and NCP, and hence also the fixed-carbon inventory. A higher ingestion rate of zooplankton (g max ) removes more particles and thus is negatively correlated with NCP. Chl is the principal agent of C fixation in the OPEM, and hence Chl has a similar sensitivity pattern to NPP except for g max and φ phy . 3.1.2 Ordinary phytoplankton, diazotrophs, particles, export, and their elemental stoichiometry First, we discuss the proportions of carbon, nitrogen, and phosphorus in ordinary phytoplankton and diazotrophs, since variations in elemental stoichiometry in autotrophs originate in differential uptake of nutrients under different environmental conditions. Globally averaged C, N, and P concentrations and ratios of globally averaged N and P of ordinary phytoplankton and diazotrophs are sensitive to ν det , Q N 0,phy , φ phy , and φ dia (Fig. 2). As expected, C, N, and P of ordinary phytoplank-ton and diazotrophs increase for higher ν det , which generates higher nutrient concentrations in the surface ocean. They are also sensitive to zooplankton grazing, especially to φ phy and φ dia . Q N 0,phy and Q P 0,phy are negatively correlated with ordinary phytoplankton C, indicating that the negative effect of higher subsistence quotas on competitive ability dominates their effect on biomass. A similar behaviour is found in diazotrophs except that Q N 0,dia is also negatively correlated with diazotroph N and hence also nitrogen fixation (Fig. 1). Although an increase in Q N 0,phy makes ordinary phytoplankton less competitive, it also raises the oceanic NO − 3 inventory, which eventually leads to more phytoplankton N (Fig. 2) and less nitrogen fixation (Fig. 1). Diazotroph C, N, and P are generally more sensitive to parameter variations than phytoplankton due to the much smaller total biomass of diazotrophs, which is also the reason why diazotrophs are less sensitive in OPEM-H, the model configuration in which their biomass is generally larger because of the growth of diazotrophs at high latitudes (see Fig. 15 in Part 1; Pahlow et al., 2020). Since ordinary phytoplankton dominate autotrophic biomass, it tends to control nutrient distributions. This explains why ordinary phytoplankton parameters such as Q N 0,phy and φ phy have strong effects on diazotrophs but not vice versa. The zooplankton grazing preferences φ phy and φ dia drive the competition between ordinary phytoplankton and diazotrophs and hence have strong and opposing effects on their biomass. Due to the relatively small total biomass, diazotroph C is more sensitive to changes in φ phy and φ dia than ordinary phytoplankton C.
Particulate C : N and N : P ratios are most sensitive to Q N 0,phy (Fig. 3). This sensitivity is related to biomass, as we see from the OPEM-H configuration, where (non-N 2 -fixing) diazotrophs are abundant at high latitudes (see Fig. 15 in Part 1; Pahlow et al., 2020) and consequently the sensitivity of high-latitude C : N to Q N 0,dia is high, even higher than to Q N 0,phy (Fig. 3). We do not find this behaviour for highlatitude regions in the OPEM configuration, as well as lowlatitude regions, where diazotrophs are not as abundant. The parameter Q P 0,phy was expected to be the most important parameter for particulate C : P ratios, just like Q N 0,phy is for the C : N ratio. However, this is only true for the OPEM at high latitudes.
At low latitudes, particulate C : P ratios are most sensitive to Q N 0,phy (Fig. 3). The supply of nitrate and phosphate at different latitudes is the major reason for this pattern. At low latitudes, the effects of Q P 0,phy are suppressed by variations in phytoplankton C, which is affected by Q N 0,phy and the consequent change in nitrate concentration. Nitrate and phosphate are not limiting in the high-latitude Southern Ocean, where, under N-and P-replete conditions, cellular C : P is mainly determined by Q P 0,phy and a higher Q P 0,phy would result in a higher cellular P : C (lower C : P). Therefore, the global C : P of total particulate matter, which is dominated by ordinary phytoplankton, is negatively correlated with Q P 0,phy . The sensitivities of dissolved N : P ratio to parameters in the three geographical settings (low latitudes, high latitudes, and global) follow similar patterns. However, we find sensitivities to be generally higher in the low latitudes, especially to variations of the phytoplankton parameters. Again, this is because NO − 3 is often limiting in lower latitudes, particularly in the oligotrophic gyres, where the dissolved nitrogen pool is more sensitive to changes in phytoplankton as well as N 2 fixation. This is also why grazing pressure on diazotrophs (φ dia ) has a much stronger effect at low than at high latitudes.

Constraining global rate estimates and inventories
The cost function (introduced in Sect. 2.2.2) was devised for identifying the best solutions among the ensemble runs. For the model's upper layers (0-550 m), observational monthly mean concentrations of nitrate and phosphate enter the cost function, thereby reflecting regional and seasonal variations in the N : P uptake ratio of ordinary phytoplankton and diazotrophs. Variations in nitrate and phosphate availability affect the growth of diazotrophs and thus determine global N 2 fixation in both OPEM and OPEM-H. In our UVic configurations, water-column denitrification is the only fixed-N loss term. Therefore, the simulated N 2 fixation is expected to match water-column denitrification under a steadystate nitrogen cycle. Nevertheless, the simulation with the lowest cost yields a global N 2 -fixation rate estimate of 40.3 Tg N yr −1 (Fig. 4a), much lower than recent estimates of water-column denitrification (55.8-72.9 Tg N yr −1 ; Somes et al., 2017;Wang et al., 2019). The cost function penalises solutions that yield N 2 -fixation rates greater than 90 Tg N yr −1 but shows no clear relation to N 2 fixation at lower rates (Fig. 4A). For example, among the simulations with the five lowest cost-function values in the OPEM configuration, the global ocean N 2 -fixation rate varies between 8 and 40 Tg N yr −1 . These model solutions also differ with respect to their O 2 inventories. The tendency of the cost function to favour very low global N 2 fixation is caused by a compensatory effect, whereby improving NO − 3 deteriorates O 2 , and vice versa (see also Part 1; Pahlow et al., 2020, and the discussion section below). Thus, instead of selecting the reference parameter sets based only on the cost function, we also take the ability to yield reasonable N 2fixation rates into account, whereby we ignore simulations with rates below 60 Tg N yr −1 , since this is the lower boundary of current data-based estimates of water-column denitrification (DeVries et al., 2012). As these solutions represent a somewhat subjective trade-off between low cost and reasonable N 2 fixation, we refer to them as trade-off solutions and details of their behaviour are shown and discussed in Part 1 (reference simulations in Pahlow et al., 2020). For OPEM, the trade-off solution corresponds to the seventh-lowest costfunction value and the fourth lowest for OPEM-H.
In the following, we will describe the lowest-cost solutions together with the trade-off solutions, as well as respective uncertainty ranges obtained from the bootstrap method described in the materials and methods section. The width of the uncertainty ranges (95 % confidence intervals) in Fig. 4 indicates the metric's ability to constrain the inventory or rate under consideration. Globally averaged N 2 -fixation rates of our trade-off solutions of OPEM and OPEM-H are just outside and within this uncertainty range, respectively (Fig. 4a). The global NO − 3 inventory turns out to be remark- ably well constrained (Fig. 4b). The mean global estimates are 30.7 and 31.3 mmol N m −3 for OPEM and OPEM-H, respectively. Ensemble solutions that deviate from these estimates have high costs, and therefore the uncertainty ranges remain narrow. The trade-off and minimum-cost solutions are hardly distinguishable. The uncertainty of the simulated global O 2 is comparable to that of the NO − 3 inventory. Global mean O 2 concentrations of OPEM and OPEM-H are 186 and 188 mmol O 2 m −3 . Our metric effectively constrains global DIC estimates, 2.290 mol C m −3 for OPEM and 2.286 mol C m −3 for OPEM-H (Fig. 4d), although DIC data have not been explicitly considered in the cost function.
While the trade-off solutions exhibit NO − 3 , O 2 , and DIC inventories well within their respective uncertainty ranges, we find somewhat larger deviations for the predicted global mean NPP (Fig. 4e). For OPEM and OPEM-H, the tradeoff solutions produce, respectively, 30 % and 14 % higher NPP than the minimum-cost solutions. The NCP (here integrated over the depth range of 0-980 m) estimates in Fig. 4F are better constrained than NPP for both configurations. The trade-off solution of OPEM corresponds to a global NCP of 1.05 Tg C yr −1 , which is close to the trade-off estimate of OPEM-H, where NCP is 1.074 Tg C yr −1 . Figure 5 shows globally averaged concentrations of O 2 versus NO − 3 of all ensemble members. The spread of the ensembles differs between the two tracers (by a factor of 2 for O 2 and by a factor of 6 for NO − 3 ). Most solutions overestimate the global average NO − 3 concentration obtained from WOA 2013 (Garcia et al., 2013a, b) and underestimate O 2 . Solutions where both tracers strongly underestimate the WOA 2013 data are penalised by the cost function (Fig. 5). The minimum-cost and trade-off solutions of OPEM and OPEM-H are close to the WOA 2013 estimates. The respective optimal solutions have slightly higher global mean O 2 concentrations than WOA 2013 and are in good agreement with respect to NO − 3 . In spite of larger costs, the trade-off solutions of both OPEM and OPEM-H are closer to the WOA 2013 estimate than the minimum-cost solutions (Fig. 5). The ensemble solutions are unevenly spread around the WOA 2013 data-based estimates. This highlights that our trade-off solutions could not have been identified had we only considered the ensemble means. simulation has similar NO − 3 and O 2 patterns to the trade-off simulation, despite the very different mean NO − 3 and O 2 concentrations. The patterns are different in the low-NO − 3 simulation because of stronger deoxygenation and denitrification, which occur mostly in North Pacific deep water. The greater similarity of global mean O 2 than NO − 3 reflects the influence of atmospheric O 2 but also indicates that NO − 3 is more sensitive to changes in the physiology of the diazotrophs.

How well can model parameters be constrained?
Cost is conspicuously correlated only with ν det , Q N 0,phy , and φ dia (Fig. 8). O 2 and NO − 3 are sensitive to ν det and Q N 0,phy but not to φ dia (Fig. 1), which indicates that φ dia becomes more important at lower-cost simulations. The minimum-cost and trade-off simulations in OPEM and OPEM-H are usually closer to each other when parameters show strong correlations with costs (Fig. 8). Figure 9 shows how different biomes contribute to the misfit and variance parts of the total cost. For simulations with high cost-function values (J > 10 10 ), we find  (Fig. 5). Low NO − 3 concentrations are coupled to low O 2 because of intense denitrification in the ODZs. Accordingly, simulations with very low NO − 3 inventories suffer from widespread ODZs, occupying much of the deep water in the northern and equatorial Pacific as well as the Indian Ocean (Fig. S1 in the Supplement). This is the main reason for the high variance in the deep water of these biomes (Fig. 9). Remineralisation rate (ν det ) and phytoplankton subsistence nitrogen quota (Q N 0,phy ) are the two parameters with the strongest correlations for most tracers as well as particulate elemental stoichiometry. The importance of ν det was expected, because it is an important driver of nutrient recycling in the surface ocean (Thomas, 2002;Anderson and Sarmiento, 1994;Eppley and Peterson, 1979), which strongly affects NPP, NCP, Chl, DIC, DFe, and N 2 fixation (Kriest et al., 2012). ν det also determines the rate of O 2 consumption, hence also the NO − 3 level, due to denitrification in ODZs (Cavan et al., 2017). The strong influence of Q N 0,phy , however, was unexpected. The subsistence quota was first introduced by Droop (1968) in phytoplankton growth models. While it has been applied in Earth system models (Kwiatkowski et al., 2018;Wang et al., 2019), a sensitivity analysis similar to the present study has not been done before. A higher Q N 0,phy implies that more nitrogen is required for phytoplankton growth, but it also can be interpreted as a lessening of carbon fixation for a given nitrogen supply. Our results demonstrate a strong effect of Q N 0,phy on NPP, Chl, and POC export (NCP, here integrated over the depth range 0 to 980 m), and consequently oxygen consumption and denitrification.
These results also put forward a new point of view on the relation between NO − 3 inventory and carbon export. In classic biogeochemistry, a larger NO − 3 inventory in the ocean stimulates primary production and POC export. This feedback is intuitive and easy to understand, as for a given C : N in phytoplankton, carbon is proportional to the nitrogen pool. This feedback is well recognised and has been widely applied in marine sciences, especially since it forms the foundation of one of the hypotheses explaining the lower atmospheric pCO 2 during the Last Glacial Maximum (LGM) (McElroy, 1983;Falkowski, 1997). However, our analysis of the model ensemble with different parameter combinations suggests another, very different, point of view. NO − 3 concentration is positively correlated with Q N 0,phy but negatively with NPP and POC export (NCP; Fig. 1), which means that an increased NO − 3 inventory can be related to a lower POC export if caused by a change in Q N 0,phy . The dynamic C : N ratio in our model explains part of this negative correlation. When the NO − 3 inventory increases due to an increase in Q N 0,phy , the nitrogen demand in phytoplankton also increases, which yields a lower C : N ratio in phytoplankton, and hence changes in carbon fixation due to increases in NO − 3 inventory remain relatively small. The increase in Q N 0,phy increases nitrogen in phytoplankton structure and decreases the C:N ratio in phytoplankton as well as detritus. The two effects together both lower POC production and raise the NO − 3 inventory. Changes in ν det also contribute to the negative correlation between NO − 3 and POC export (NCP) in our simulations: a more intense remineralisation in the surface ocean reduces POC export and thus decreases oxygen consumption and denitrification, resulting in a larger nitrate inventory.
The strong impact of Q N 0,phy on the NO − 3 inventory and globally averaged phytoplankton C : N causes a higher sensitivity of globally averaged C : N than C : P (Fig. 3). A higher Q N 0,phy results in a higher NO − 3 inventory and a lower phytoplankton C : N, both tending to lower particulate C : N, and vice versa. On the other hand, C : P is not as sensitive because we have a constant PO 3− 4 inventory in the UVic model. Surface particulate matter C : N is less variable compared to C : P and N : P in field observations along regional gradients (Galbraith and Martiny, 2015;Geider and Roche, 2002;Martiny et al., 2013a;Sterner and Elser, 2002), which is an apparent contrast to our results, where the sensitivity of C : N to Q N 0,phy is the highest among the particulate elemental ratios. However, our sensitivities are with respect to parameter variations among many simulations, rather than spatial or temporal gradients in the one real ocean.

Zooplankton parameters
While in many global biogeochemical models zooplankton are described by non-mechanistic formulations, such as Holling-type functions (Holling and Buckingham, 1976), in this study we apply a more realistic zooplankton model (Pahlow and Prowe, 2010). Among the five zooplankton parameters, the maximum specific ingestion rate (g max ) and the capture coefficients of phytoplankton (φ phy ) and diazotrophs (φ dia ) are the most important, whereas the preference for detritus (φ det ) is generally less important. Grazing on zooplankton itself (φ zoo ) counters the effect of g max because it lowers zooplankton biomass and thus total ingestion. These parameters together dominate controls on N 2 fixation and Chl (Fig. 1), and C, N, and P of ordinary phytoplankton and diazotrophs (Fig. 2). It is interesting that zooplankton parameters also exert some control on particulate N : P as well as the dissolved nutrient pools (Fig. 3). This can be understood via their controls on N 2 fixation and the ensuing changes in N : P in the dissolved and particulate pools.

Other parameters and the OPEM-H configuration
Other parameters in the sensitivity analysis appear less important for the tracer distributions, but this does not necessarily mean that they are negligible. Specific mortality rate (λ 0,phy ) and the phytoplankton half-saturation constant for Fe (k Fe,phy ) do contribute to some variations of most of the tracers (Fig. 1), and particulate C : P is somewhat sensitive to potential nutrient affinity (A 0 ). Phytoplankton subsistence P quota (Q P 0,phy ) affects major tracers much less than phytoplankton subsistence N quota (Q N 0,phy ), but it is still important for particulate C : P and particulate N : P ratios, particularly at high latitudes and globally (Fig. 3). Diazotroph subsistence N and P quotas (Q N 0,dia and Q P 0,dia ) in general have much less influence on particulate stoichiometry than Q N 0,phy and Q P 0,phy because diazotrophs are much less abundant than ordinary phytoplankton. However, diazotroph biomass (carbon) itself is more sensitive to Q N 0,dia than Q N 0,phy , which shows that the diazotroph subsistence quotas are still important for both their elemental stoichiometry and ability to compete with ordinary phytoplankton. While elemental stoichiometry has been suggested to be an important factor for determining the outcome of the competition between diazotrophs and non-diazotrophs, and consequently N 2 fixation Weber and Deutsch, 2012), we find that N 2 fixation is no more sensitive to Q N 0,dia than to the remineralisation rate (ν det ), Q N 0,phy , or zooplankton grazing parameters (g max , φ phy , and φ dia ). Nevertheless, our analysis agrees with the argument that global N 2 fixation is mainly determined by rates of fixed-N loss (Weber and Deutsch, 2014), which in our model is largely affected by ν det and Q N 0,phy . In general, tracer sensitivities to parameters in OPEM-H configuration are similar to those in OPEM. O 2 and NO − 3 levels are slightly less sensitive to the remineralisation rate, Q N 0,phy , and g max in OPEM-H because this configuration allows (facultative) diazotrophs to grow in high-latitude cold waters; hence, the overall biomass of diazotrophs is greater (Part 1; Pahlow et al., 2020). This is also the reason why Q N 0,dia and Q P 0,dia exert a stronger effect on surfaceparticle elemental stoichiometry at high latitudes in OPEM-H (Fig. 3).
Several studies have revealed that N 2 fixation occurs in high-latitude regions (Sipler et al., 2017;Harding et al., 2018;Shiozaki et al., 2018;Mulholland et al., 2019), which supports a wider temperature range of N 2 fixation, similar to what we have in OPEM-H. In the trade-off simulation for OPEM-H, we do find some N 2 fixation in the eastern North Pacific and the Arctic Ocean (Part 1; Pahlow et al., 2020). The different temperature function for diazotrophy is also the reason for the differences in the sensitivities of particulate C : N : P to diazotroph subsistence quotas in high-latitude regions (Fig. 3).

Model limitations
The strong correlation between O 2 and NO − 3 (Fig. 5) indicates that O 2 and denitrification are tightly coupled. Lack of benthic denitrification leaves water-column denitrification as the only loss of NO − 3 and O 2 becomes the primary factor controlling the NO − 3 inventory. This implies that sensitivities of NO − 3 to the model parameters could be different when benthic denitrification is incorporated in our model. Also, this means that global N 2 fixation (same as global denitrification in our spun-up steady-state simulations) is underestimated, and since it occurs mostly at 40 • S to 40 • N (see Fig. 13 in Part 1; Pahlow et al., 2020), particulate carbon to nitrogen (C : N) ratios could be overestimated due to a missing input of nitrogen to the surface ocean. This could explain the overestimated surface particulate C : N at low latitudes (see Table 3 and Fig. 16 in Part 1; Pahlow et al., 2020).
To evaluate how water-column denitrification affects our cost function, we arrange our simulations in the order of their cost values and plot the volume of ODZs against cost for both the OPEM and OPEM-H configurations in Fig. 10a-c. Several of our simulations, mostly among those with the 200 lowest cost values (Fig. 10a), have a relatively small misfit in O 2 and NO − 3 compared to the WOA 2013 estimate, and high N 2 -fixation rates, comparable to those estimated in previous model studies (e.g. Somes et al., 2017;Wang et al., Figure 8. Lower parts (cost < 10 8.2 ) of cost-value distributions for the parameter ranges in Table 1. Solid red triangles and blue circles represent the minimum-cost simulations in OPEM and OPEM-H, respectively, and open red triangles and blue circles are the trade-off simulations. Note that the trade-off simulations share the same parameter combination but have slightly different cost-value distributions. 2019). For these simulations, low O 2 is connected with high rates of water-column denitrification in the eastern equatorial Pacific Ocean (Pac.EQU.E), causing a depression of NO − 3 concentration and a rather high variance in NO − 3 concentration, both of which conflict with the observations. Hence, cost in this biome is very high, especially in the upper 550 m (Fig. 9), where denitrification is strongest. On the other hand, although the volume of ODZs in the minimum-cost simulations in OPEM and OPEM-H is greater than in WOA 2013 (Fig. 10c), they yield rather low N 2 -fixation rates (40.3 and 35.0 Tg N yr −1 for OPEM and OPEM-H, respectively). ODZ volumes in the trade-off simulations are more than twice that in WOA 2013 (Fig. 10) and yield global N 2 -fixation rates close to current estimates of water-column denitrification (about 70 Tg N yr −1 ; Somes et al., 2017;Wang et al., 2019). The mismatch between ODZ volume and N 2 -fixation rate indicates that a refined description of water-column denitrification may be needed (Sauerland et al., 2019). While the physical component (ocean circulation) of the UVic model is also very important for the global distribution of oxygen and nitrate, our results suggest that, clearly, only by considering all major nitrogen sources and sinks, such as atmospheric deposition and benthic denitrification, a better representation of N 2 fixation and the global marine nitrogen cycle can be achieved. The cost function introduced above is a metric that quantifies the discrepancy between objectively analysed observational data and simulation results. Our cost function proves useful for exploring the 400 ensemble model solutions and identifies model solutions that reproduce deep ocean gradients in the NO − 3 : PO 3− 4 ratio better than a classic fixed-stoichiometry model (Part 1; Pahlow et al., 2020). In addition, the optimal model solutions yield improved NCP rate estimates integrated over the top 100 m (Part 1; Pahlow et al., 2020). In particular, the trade-off solutions of OPEM and OPEM-H can resolve observed latitudinal patterns in dissolved and particulate C : N : P within the upper productive ocean layers (0-130 m; see Part 1; Pahlow et al., 2020). The consideration of monthly mean O 2 , NO − 3 , PO 3− 4 data for the upper 550 m and surface Chl remote sensing data introduces important constraints on the representation of the relation be- tween light and nutrient limitation, thereby also specifying the degrees of N and P limitation.
Even within the 5 % of the simulations with the lowest costs, the estimates of global N 2 -fixation rate vary considerably. The mean global estimates ± standard deviation in OPEM and OPEM-H are 32 ± 20 and 39 ± 18 Tg N yr −1 , respectively. We initially expected that the NO − 3 and PO 3− 4 data in the cost function would effectually constrain N 2 fixation. This is clearly not the case and additional information has to be considered. One explanation may be that considerable N 2 fixation can occur during short periods and may also be confined to regions smaller than the biomes. Regional differences with respect to N 2 fixation remain unresolved if only biome-specific monthly mean NO − 3 and PO 3− 4 data are considered for the upper layers in the cost function.
Also, the minimum-cost solution yields very low global N 2 -fixation rates. Thus, for the identification of the trade-off solutions, we had to consider prior information about global water-column denitrification, whose rate is balanced by N 2 fixation according to our models. Incorporating N 2 fixation as a single global rate estimate into our Likelihood-based cost function as a single additional term would, without some difficult-to-define regularisation, become overwhelmed by the many tracer and variance terms defined in Eqs. (6) and (7). Rather, the additional information is treated as a second objective, namely that global N 2 fixation should be greater than 60 Tg N yr −1 (see above), which is similar to applying a multi-objective approach for model calibration (e.g. Sauerland et al., 2019), where a trade-off between two or more objectives (cost functions) is resolved. A refined cost function may incorporate monthly mean N : P ratios or N* values based on WOA 2013 data (e.g. for the upper 130 m) for clustered subregions of some biomes. Such addition to the cost function would require some careful preprocessing, e.g. cluster analysis of the spatial N : P or N* patterns but may suffice to constrain simulated N 2 -fixation rates.
A peculiarity of our cost function is that it complements the data-model misfit, i.e. the residuals of spatial mean log 10transformed values, with an additional term that resolves differences in spatial variances. How the neglect of this term affects the global mean tracer concentrations and flux estimates is depicted in Fig. S2-S7 in the Supplement. The cost function's variance term introduces a strong penalty to approximately 30 % of all ensemble model solutions. The highest cost-function values (J > 10 9 ) are associated with discrepancies in spatial variances that exceed the misfits in the log 10 -transformed tracer concentrations. For large parts of the ensemble solutions, the variance term contributes between 15 % and 20 % to the total costs. Interestingly, for those model solutions that yield low cost-function values (J < 4 × 10 7 ), the relative contribution rises again when the misfit in the log 10 -transformed tracer concentrations gradually decreases (Fig. 10b).

Contributions of biomes
The 17 biomes derived by Fay and McKinley (2014) represent a scale similar to that addressed in global efforts to establish surface-ocean air-sea carbon-flux estimates (Wanninkhof et al., 2013;Rödenbeck et al., 2015). Accordingly, our cost function can be easily extended by incorporating airsea CO 2 flux estimates in the future. Further improvements may be possible by introducing subregions in some biomes, e.g. for constraining N 2 -fixation rate estimates, as discussed above. For low cost-function values, the contribution of the variance term is generally small in most biomes for the deep layers (Fig. 9), where variances of the log 10 -transformed tracer concentrations compare very well between the simulations and the WOA 2013 estimate. For high costs, this term can become dominant, e.g. for some biomes in the North Pacific as well as the Indian Ocean. A remarkable exception is the North Pacific Arctic biome (NP-ICE), where the deep layer's variance term remains dominant for most of the ensemble solutions. This is somewhat different in the Arctic biome of the North Atlantic (NA-ICE) and the Southern Ocean (SO-ICE), where the variance term remains low throughout almost the entire ensemble. For SO-ICE, the cost function is mainly affected by the misfit in log 10 -transformed tracer concentrations. The misfit is associated mainly with discrepancies between observed and simulated NO − 3 within the SO-ICE biome. Interestingly, these misfits in both upper and deeper layers drop again after around the 280th simulation. Simulations with high NO − 3 do not result in total cost values as high as in simulations with very low NO − 3 (Fig. 5), but they have larger misfits for NO − 3 in SO-ICE. A similar behaviour can be seen in the other Southern Ocean biome (SO-SPSS) as well as in NA-ICE.
The upper layer's variance term contributes strongly to low costs in North Atlantic biomes. This is particularly striking for the equatorial Atlantic biome (Atl.EQU). The main reason is water-column denitrification that results in a high variance in NO − 3 . Likewise the eastern equatorial Pacific biome (Pac.EQU.E) reveals major model limitations in the upper layers. Overall, the unfolding of biome-specific contributions to the cost function clearly points to those regions where improving model performance appears most worthwhile. Our present cost function may then be reapplied to quantify and highlight specific model improvements.

Conclusions
We demonstrate sensitivities of various tracers and processes to parameters in two configurations of a new optimalitybased plankton-ecosystem model (OPEM) in the UVic-ESCM. While OPEM-H predicts a wider geographical range for N 2 fixation (Part 1; Pahlow et al., 2020) and shows some differences in the sensitivities of diazotroph C, N, and P to parameters when compared to OPEM, the tracer sensitivity to model parameters is very similar in both configurations. The trade-off simulations in the OPEM and OPEM-H happen to have the same parameter set. Among our model simulations, varying model parameters within reasonable ranges results in variations in O 2 by a factor of 2 and in NO − 3 concentration by a factor of 6. The sensitivity analysis provides important information regarding the new models' behaviour. The O 2 inventory is mainly influenced by the remineralisation rate (ν det ) as well as phytoplankton subsistence nitrogen quota (Q N 0,phy ) and zooplankton maximum specific ingestion rate (g max ). Changes in Q N 0,phy strongly impact the NO − 3 inventory, as well as the elemental stoichiometry of ordinary phytoplankton, diazotrophs, and detritus. Q N 0,phy also affects N 2 fixation, Chl, DIC, and iron levels. Furthermore, our sensitivity analysis resolves correlations between various biogeochemical tracers. For example, POC export is negatively correlated with the NO − 3 inventory. We would like to point out that these changes in model behaviour are solely caused by variations in parameters. Thus, the correlations between tracers and rates might not stand when tracer variations are caused by other factors. For example, an increase in the NO − 3 inventory due to anthropogenic emissions may be accompanied by an increase in POC export (Fernández-Castro et al., 2016). Also, although we did evaluate sensitivities of particulate elemental stoichiometry at different latitudes, most tracer sensitivities and correlations should be considered valid only for global but not regional scales.
We introduce a new likelihood-based metric for model calibration. The metric appears capable of constraining globally averaged O 2 , NO − 3 , and DIC concentrations as well as NCP. In particular, the minimum-cost and trade-off model solutions resolve observed latitudinal patterns in particulate C : N : P within the surface layers (0-130 m). However, the metric does not effectually constrain the models' global N 2fixation rate estimates. Individual contributions of the biomes to the cost function provide details of how tracer distributions in each biome respond differently under different ecosystem settings. The consideration of spatiotemporal variations in the stoichiometry of NO − 3 , PO 3− 4 , and O 2 in our metric favours model solutions with low N 2 -fixation rates that are solely balanced by low rates of water-column denitrification. From our findings, we conclude that an explicit consideration of benthic denitrification and atmospheric deposition seems critical for improving the representation of the complete global nitrogen cycle in our model.  (Pahlow, 2020). The instructions needed to reproduce the model results described in this article are in the Supplement.
Author contributions. CTC and MP performed the ensemble solutions and selected the reference simulations. MS set up the likelihood-based metric. All authors contributed to the manuscript text.