A coupled pelagic–benthic–sympagic biogeochemical model for the Bering Sea: documentation and validation of the BESTNPZ model (v2019.08.23) within a high-resolution regional ocean model

The Bering Sea is a highly productive ecosystem, supporting a variety of fish, seabird, and marine mammal populations, as well as large commercial fisheries. Due to its unique shelf geometry and the presence of seasonal sea ice, the processes controlling productivity in the Bering Sea ecosystem span the pelagic water column, the benthic sea floor, and the sympagic sea ice environments. The Bering Ecosystem Study NutrientPhytoplankton-Zooplankton (BESTNPZ) model has been developed to simulate the lower-trophic-level processes throughout this region. Here, we present a version of this lower-trophic-level model coupled to a three-dimensional regional ocean model for the Bering Sea. We quantify the model’s ability to reproduce key physical features of biological importance as well as its skill in capturing the seasonal and interannual variations in primary and secondary productivity over the past several decades. We find that the ocean model demonstrates considerable skill in replicating observed horizontal and vertical patterns of water movement, mixing, and stratification, as well as the temperature and salinity signatures of various water masses throughout the Bering Sea. Along the data-rich central portions of the southeastern Bering Sea shelf, it is also able to capture the mean seasonal cycle of primary production. However, its ability to replicate domain-wide patterns in nutrient cycling, primary production, and zooplankton community composition, particularly with respect to the interannual variations that are important when linking variation in productivity to changes in longer-lived upper-trophic-level species, remains limited. We therefore suggest that near-term application of this model should focus on the physical model outputs, while model development continues to elucidate potential mechanisms controlling nutrient cycling, bloom processes, and trophic dynamics.


Introduction
The Bering Sea is a highly productive ecosystem. Its broad, shallow eastern shelf reaches widths of over 500 km, with an average depth of only 70 m, leading to a long growing season and high annual primary production (Rho and Whitledge, 2007). Tidal mixing along the continental shelf break also leads to a highly productive off-shelf region, often referred to as the "Green Belt", where the confluence of nitrate from the deep basin and iron from the shelf are mixed into the euphotic zone (Springer et al., 1996). This high primary productivity across the shelf and slope in turn supports a wide variety of pelagic and benthic predators, which support fisheries that land nearly half of the US annual catch (National Marine Fisheries Service, 2017;Fissel et al., 2017).
Because of the Bering Sea's economic and cultural importance, changes in its ecosystem have prompted a series of contemporary research efforts, including the Bering Ecosystem Study (BEST) and Bering Sea Integrated Ecosystem Research Program (BSIERP), that aimed to advance understanding of ecosystem processes and their relationship to the physical environment in the Bering Sea (Sigler et al., 2010). As part of these efforts, the Bering Ecosystem Study Nutrient-Phytoplankton-Zooplankton (BESTNPZ) biogeochemical model (Gibson and Spitz, 2011) was developed to simulate the key processes and features of the Bering Sea lower-trophic-level ecosystem, including primary and secondary production in the pelagic environment as well as benthic-pelagic and ice-pelagic interactions.
A regional ocean model that includes the BESTNPZ biological model has been used to investigate a variety of topics, including historical and future biophysical variability (Hermann et al., 2013, ecosystem status and variability , fish advection and recruitment (Wilderbuer et al., 2016), and community connectivity within crab populations (Richar et al., 2015); at least a dozen ongoing projects continue to rely on this model for retrospective and future analyses. Previous studies have examined the model's skill and sensitivity broadly. Hermann et al. (2013) demonstrated that the physical model shows skill in replicating observed patterns in temperature, salinity, and circulation, while Gibson and Spitz (2011) performed a thorough sensitivity analysis of the biogeochemical model in a onedimensional environment representing the mid-shelf portion of the southeast Bering Sea shelf. However, a comprehensive evaluation of the biogeochemical skill of the BESTNPZ model in the three-dimensional ocean modeling context has been lacking.
Here, we present a thorough documentation of the BEST-NPZ biogeochemical model in its current state. We also provide context and history for the various versions of the code that were used in the aforementioned publications and the changes that were made between publications and since. Finally, we evaluate several aspects of lower-trophic-level output of the BESTNPZ model within a high-resolution regional ocean model, focusing on whether the model properly captures key biophysical and biogeochemical processes necessary to realistically simulate primary production in the Bering Sea. This skill assessment reveals the model's strengths and weaknesses in reproducing historical patterns across the entire Bering Sea domain and also serves as a baseline to which further model improvements can be compared.
2 An overview of the BESTNPZ model

Model structure
The biogeochemical and ecosystem model underlying this study uses a traditional nutrient-phytoplankton-zooplankton structure. It tracks a total of 19 state variables: 14 pelagic variables, 2 benthic variables, and 3 sympagic (ice) variables.
The 14 pelagic state variables are resolved as tracer variables within the physical model and are therefore subject to advection and diffusion. The nutrients include nitrate, ammonium, and iron. Two size classes of phytoplankton (small and large) and five zooplankton groups (microzooplankton, small copepods, large copepods, euphausiids, and jellyfish) comprise the living state variables in the model. Both the large copepods and euphausiids groups are further subdivided into two state variables each, with parameterizations tailored to on-shelf and off-shelf populations; at present, the parameterizations for these two groups are very similar. Two detrital state variables, representing fast-and slow-sinking detritus, are also included.
The benthic submodel includes a living benthos group, encompassing all live infauna, and a single benthic detritus group. These state variables do not include any horizontal or vertical movement.
The three sympagic state variables are associated with seasonal sea ice and include nitrate, ammonium, and ice algae. These variables are assumed to occupy a thin skeletal layer on the underside of sea ice (when present); their horizontal movement is determined by the movement of the ice in which they reside, and they exchange material with the top layer of the ocean model. The exact thickness of the skeletal ice layer is specified via a user input parameter (see Sect. A3 for further details); for all simulations to date, a value of 0.02 m was used.
Nitrogen is used as the primary currency throughout the model, with all living and detrital state variables assumed to have a constant stoichiometry of 106 moles carbon per 16 moles nitrogen. While iron is also included as a state variable for primary production limitation purposes, its flux through the ecosystem is not tracked beyond its uptake during primary production, and water column iron is restored to a simple empirical distribution based on water depth on an annual timescale. Due to a quirk inherited from its predecessor model (Hinckley et al., 2009), many of the model's output variables, including phytoplankton, zooplankton, and detrital biomass variables as well as all flux rate diagnostic variables, are reported in carbon-based units; this conversion uses a constant N : C ratio across all state variables.
For a full description of all state variables, process equations, and input parameters in the BESTNPZ model, please see Appendix A.
In its three-dimensional setup, the BESTNPZ ecosystem model is coupled to an implementation of the Regional Ocean Modeling System (ROMS), a free-surface, primitive equation hydrographic model (Shchepetkin and McWilliams, 2005;Haidvogel et al., 2008). The Bering10K ROMS domain spans the Bering Sea and northern Gulf of Alaska, with 10 km horizontal resolution (Hermann et al., 2013). To date, it has been run with either 10 (previous studies) or 30 (this study) terrain-following depth levels (Fig. 1); we discuss this increase in vertical resolution in the next section. This horizontal domain is a subsection of a larger-domain ROMS model of the northeast Pacific (NEP5) (Danielson et al., 2011); both domains have been shown to capture the primary physical features important to biological processes, including circulation patterns, temperature, and salinity, as well as the seasonal advance and retreat of sea ice.

History and modifications
The BESTNPZ model has undergone several phases of development over the past several years. The code for the pelagic system originated from a Gulf of Alaska NPZ model known as GOANPZ (Hinckley et al., 2009). Model parameters and equations were tailored to the Bering Sea ecosystem during the Bering Ecosystem Study and Bering Sea Integrated Ecosystem Research Project, and the benthic and sympagic modules were added to the code during this phase. In the earliest publication of the model, Gibson and Spitz (2011) analyzed its sensitivity to input parameter uncertainty when coupled to a one-dimensional ocean model representing the Bering Sea M2 mooring location (56.87 • N, 164.06 • W). While this study confirmed that large-scale properties, such as modeled annual net primary production, were in line with observations from the eastern shelf region of the Bering Sea, it did not present any detailed skill analysis of the biological state variables against observations. Following the Gibson and Spitz (2011) study, the BESTNPZ model was embedded within a realistic threedimensional ocean model for the Bering Sea, referred to herein as the Bering10K ROMS domain. In a study focusing on physical and biological modes of variability with the Bering10K-BESTNPZ model output, Hermann et al. (2013) demonstrated that the model physics skillfully replicated ob-served patterns in temperature and salinity on the Eastern Bering Sea shelf, and that nutrients, phytoplankton, and zooplankton in this shelf region covaried with each other and with physical properties in a manner that supported the existing hypotheses of energy flow in the Eastern Bering Sea. However, Hermann et al.'s (2013) analysis focused on interannual variability and did not specifically assess the skill of the model to replicate seasonal patterns in primary and secondary productivity, nor did it address the behavior or skill of the model away from the eastern shelf. A few minor input parameter adjustments were made for the Hermann et al. (2013) study versus the earlier Gibson and Spitz (2011) study; apart from this and their differing one-dimensional versus three-dimensional physical environments, the versions of the BESTNPZ model used in these two publications are the same.
A third modification of the BESTNPZ model appears in Hermann et al. (2016), where an updated version of the Bering10K-BESTNPZ model complex seen in Hermann et al. (2013) was used to investigate long-term change in the Bering Sea under various climate change scenarios. In this version of the model, the light attenuation scheme was modified, and the light-related input parameters were adjusted. Several subsequent studies have used and continue to make use of the hindcast and/or climate forecast scenario output that was produced with this version of the Bering10K-BESTNPZ model Hermann et al., 2019). Pilcher et al. (2019) also used this version of the model, with the addition of carbon variables, to support their analysis of ocean acidification in the Bering Sea.
Following the Hermann et al. (2016) publication, the BESTNPZ source code underwent a thorough revision. The revision process revealed a handful of small but consequential issues in the implementation of the BESTNPZ process equations that have led to a new version of the BESTNPZ model. These issues included the following: 1. Lack of conservation of biomass in large copepod biomass. An error in the code governing vertical migration of large copepods led to a slow but steady post-diapause accumulation of offshore large copepod (NCaO) biomass below their diapausing depth of 400 m. Over the course of a multi-year simulation, this deep biomass manifested itself as a "fall bloom" at depth and reached levels surpassing that of surface large copepod biomass; this artifact is visible in the depth-integrated biomass results presented in . This error has now been corrected.
2. High light limitation. Preliminary validation of simulated phytoplankton production revealed that light was strongly limiting in certain regions of the model domain even under noon summer conditions. In deep water, the 10-layer physical model setup was too coarse to resolve variations in light within the mixed layer, leading to light-limited conditions year round. In the shallower regions of the inner shelf, an overly aggressive sediment attenuation term -introduced between the Hermann et al. (2013,2016) versions of the modelalso led to year-round light limitation. This high light limitation masked problems with micronutrient limitation in the deep basin and macronutrient limitation in the inner coastal domain. To remove this excessive light limitation, new runs of the Bering10K domain now use 30 vertical layers rather than 10, and a new equation for light attenuation was implemented. See Appendix B for further discussion of these changes to light attenuation.
3. Non-conservative behavior of macronutrients related to nudging. Gibson et al.'s (2011) version of the BEST-NPZ model implemented empirical relaxation of iron, ammonium, and nitrate throughout its one-dimensional domain; both nitrate and iron were relaxed towards seasonal climatological profiles, while ammonium was relaxed toward zero, all on annual timescales. The relaxation was appropriate in the one-dimensional context, given the use of periodic lateral boundary conditions that did not allow for any advective transport of nutrients into or out of the domain. However, when moved to the three-dimensional Bering10K domain (as was done in all previous publications, including Hermann et al., 2013Hermann et al., , 2016, this nudging becomes inappropriate, as all processes controlling nutrient distribution should conserve biomass (total nitrogen in the system is not constant, due to the open lateral boundary conditions and out-of-system fluxes from burial and benthic denitrification, but these changes to the nitrogen budget can be attributed to known processes and quantified). Particularly in the case of ammonium relaxation, the nudging applied to the three-dimensional domain introduced a phantom process that "scavenged" ammonium from the water column across most of the shelf regions. In the most recent versions of the model, nudging has been removed from the NO 3 and NH 4 state variables; it remains in place for Fe, since the simplified iron dynamics in the current model do not supply any explicit sources of bioavailable iron.
4. Preferential uptake of nitrate in high-nitrate, highammonium conditions. Under the previous governing equations for macronutrient limitation during the gross primary production calculations, the total nitrogen limitation factor (i.e., the factor applied to the maximum photosynthetic uptake rate to account for nutrient limitation, after Lomas and Glibert, 1999) could exceed 1.0 under high-nitrate, high-ammonium conditions; the code implemented a cap to counter this by reducing the ammonium limitation factor so the sum of the nitrate and ammonium limitation factors was less than 1. This approach led to reduced uptake of ammonium in favor of nitrate when concentrations of both macronutrients were high, and is counter to the assumption that ammonium uptake is usually energetically favored over nitrate uptake, and that high levels of ammonium inhibit nitrate uptake (Glibert et al., 2016, and references therein); it is also counter to the ammonium inhibition encoded in the nitrate limitation factor itself in this model. While this quirk in nutrient uptake in unlikely to have made a large difference in uptake dynamics given that it was only relevant during nutrientreplete conditions, it may have exacerbated the accumulation of ammonium on the shelf seen when nudging was removed. In the most recent version of the code, nitrate and ammonium limitation factors were modified to use an equation (after Frost and Franzen, 1992) that constrains the total nitrogen limitation factor to a range of 0-1 without the need for additional caps.
5. Euphausiid prey preferences. In previous versions of the BESTNPZ model, the euphausiid groups preyed upon large phytoplankton, ice algae, microzooplankton, and small copepods. However, the modeled euphausiid populations tended to drop precipitously in winter months when these prey were scarce, in contrast to fish diet data that indicate the continued presence of euphausiids in fish diets on the southeastern shelf during the winter (Livingston et al., 2017). This observation is important to replicate when using the BESTNPZ model to look at the broader flow of energy to higher-trophiclevel species (e.g., . In an attempt to increase overwintering success of the on-shelf euphausiid group, feeding links to the two detrital groups and to small phytoplankton were added. The addition of these feeding links for the onshore euphausiid group distinguishes the tendency of on-shelf Bering Sea euphausiids, dominated by Thysanoessa raschii, to rely on detrital feeding for overwinter survival; in contrast, the shelf-edge population, dominated by Thysanoessa inermis, accumulates higher lipid stores to support winter survival .
In addition the changes to the biogeochemical model listed above, significant biases in the ice fields produced by earlier versions of Bering10K and related models have been identified and addressed by ROMS colleagues (Durski andKurapov, 2019, Kate Hedstrom andAlexander Kurapov, personal communications, May 2016). Late melting of ice was noted by Danielson et al. (2011) andCheng et al. (2014) for a northeast Pacific model that utilizes nearly the same ice code as Bering10K, and by  and Hermann et al. (2013Hermann et al. ( , 2016 for the Bering10K model itself. Corrections to the longwave radiation terms of ice thermodynamics have been implemented in the latest version of the Bering10K code to address the late ice melting bias. Additional improvements to the mechanical ice dynamics by Hedstrom (Kate Hedstrom,personal communication, 2016) corrected for a previous bias towards excessive ice thickness in some areas. Specifically, this was corrected through the addition of a quadratic ice strength term that resists ice ridging, based on the work of Overland and Pease (1988).

Model configuration and forcing
We ran two simulations of the Bering10K-BESTNPZ model for this study. For both simulations, the model is driven by surface atmospheric forcing from either the Common Ocean Reference Experiment (CORE) (Large and Yeager, 2009) (1970-1994, the Climate Forecast System Reanalysis (CFSR) (Saha et al., 2010-March 2011, or the Climate Forecast System Operational Analysis (CFSv2-OA) (April 2011-September 2018); bulk formulae were used to relate winds, air temperature, specific humidity, surface pressure, and shortwave and longwave radiation from these datasets to surface stress, freshwater, and heat fluxes (Fairall et al., 1996). Comparison between overlapping years from the CORE and CFSR datasets revealed small differences in values in the radiation variables; the CORE shortwave and longwave radiation values were therefore divided by factors of 0.9 and 0.97, respectively, to align with the CFSR data (note that this is the opposite of the adjustment performed in previous studies, e.g., Hermann et al. (2013Hermann et al. ( , 2016, where the CFSR values were adjusted downward by 10 % and 3 %, respectively). Lateral boundary conditions for the open southern and eastern boundaries of the model domain use a hybrid nudging-radiation scheme (Marchesiello et al., 2001). During the CORE period, these boundary conditions were derived from a simulation of the larger northeast Pacific grid (Danielson et al., 2011), which relied on the Simple Ocean Data Assimilation (SODA) dataset (Carton and Giese, 2008) for its own lateral boundary conditions; the CFS periods use CFS ocean variable values. The northern boundary transport through the Bering Strait is relaxed to the observed value of 0.8 Sv (Woodgate and Aagaard, 2005); earlier sensitivity studies tested whether a seasonally varying open boundary condition could better replicate the flow patterns in the northern portion of the domain, but the simple relaxation condition was found to perform equally well. Freshwater runoff due to river input was reconstructed from observed river discharge from Alaskan and Russian rivers (Kearney, 2019); river freshwater input is distributed across model grid points near the coast as a surface freshwater flux based on river mouth location, with an e-folding scale of 20 km.
The first model simulation, referred to hereafter as the spinup simulation, looped interannually invariant surface and lateral boundary conditions over a 30-year period. We chose to use boundary condition values from 2001, a year with close to average sea ice cover. Physical variables for this simulation were initialized from the 1 January 2001 values of a previous hindcast of the Bering10K domain. Nitrate was initialized to a constant value of 40 mmol N m −3 below 300 m, transitioning to 0 mmol N m −3 at 100 m. Iron was initialized to the same empirical profile used for annual nudging within the model, which sets surface and deep iron values based on bottom depth (see Appendix A). All living biological state variables (i.e., phytoplankton, zooplankton, and benthic infauna) were initialized using a tiny seed value to allow future growth, while all other state variables were initialized at zero. The purpose of this simulation was to allow the model to reach an internally regulated state, and to verify that any accumulation of nitrogen outside the deep basin was a result of internal dynamics rather than overestimated initial conditions.
The second simulation, referred to as the hindcast simulation, was initialized using values from the final time step of the spinup simulation. The model was then run from 1970 to 2018, forced with the full time series of surface and lateral boundary conditions from the combined CORE/CFS-derived dataset described above.

Key features of biological importance
To systematically validate the Bering10K-BESTNPZ model complex, we focus on a few key features of the Bering Sea. We begin by looking at physical processes that are known to influence the primary production and then compare our modeled patterns of primary production, phytoplankton biomass, and phytoplankton and zooplankton community composition to a variety of measurements. We primarily focus this evaluation on the Eastern Bering Sea shelf due to the availability of data in this region but also qualitatively evaluate the patterns seen in the central basin and the northern shelf regions. While our physical model domain extends into the northern Gulf of Alaska, the biological model was never intended to simulate this region, and for this validation we assume that all regions south of the Aleutian Islands or east of the Alaska Peninsula are outside the biological domain of the BESTNPZ model.

Sea ice
Sea ice plays a key role in shaping the ecosystems of the Bering Sea. Ice advances southwestward through the Bering Strait into the Bering Sea, driven both by winds from the northeast and local ice formation, with much of the eastern shelf at least partially covered by ice beginning in early winter (October to November) through early spring (March to April). Variability in the timing of ice onset and retreat and extent of sea ice can be significant year to year, influenced by winds and air temperature related to the position and strength of the Aleutian Low and Siberian High pressure systems, as well as ocean conditions (Stabeno et al., 2001).
To analyze the extent and timing of the updated seasonal sea ice, we collected estimates of sea ice concentration derived from Nimbus-7 SMMR and DMSP SSM/I-SSMIS passive microwave data (Cavalieri et al., 1996) for the period of 1980-2018. For comparison with model output, satellite-derived fraction ice cover was interpolated from its native 6.25-20 km resolution polar stereographic grid to the Bering10K model grid via a nearest-neighbor method. For both model and observations, we calculate the location of the ice edge along 170 • W (the approximate longitudinal center of the seasonal sea ice in the Bering Sea), defining the edge as the southern extent of a continuous block of grid cells where all cells have at least 15 % ice cover.

The cold pool
As ice advances southward along the Bering Sea shelf, the freezing process and resulting brine rejection leads to the formation of cold, salty, dense bottom water underneath the ice (Stabeno et al., 2001); continuous freezing in the vicinity of the St. Lawrence polynya further intensifies the formation of this bottom water mass (Danielson et al., 2006). The resulting cold bottom water is referred to as the "cold pool" and is typically defined as waters colder than either 0 or 2 • C. In the spring, warming of the surface waters coupled with melting of sea ice sharply stratifies the water column over the middle shelf region, isolating the cold pool waters from surface heating and mixing. As a result, this signature water mass can persist well into the summer months (Stabeno et al., 2001). Due to the isolation of the cold pool, this water can serve as a nutrient reservoir to the mid-shelf region when mixed with nutrient-depleted surface waters during storm events following the initial spring bloom (Sambrotto et al., 1986;Aguilar-Islas et al., 2007). In addition, the location of the cold pool influences the spatial distribution (Mueter and Litzow, 2008;Stabeno et al., 2012) and recruitment of higher-trophic-level predators (Mueter et al., 2011;Duffy-Anderson et al., 2016) through various mechanisms.
Measurements of bottom temperature are collected each summer as part of the Bering Sea Groundfish Bottom Trawl Survey. Net trawls are conducted at fixed survey stations located across the Eastern Bering Sea shelf at 20 nautical mile resolution. From 1982 to 1989, temperature data were collected via expendable bathythermographs (XBTs). More recent surveys use digital bathythermograph recorders attached to the headrope of the bottom trawl net (BRANCKER RBR XL-200 Micro BTs recorded at 6 s intervals for the 1993-2001 surveys, and a Sea-Bird SBE-39 bathythermograph continuous data recorder at 3 s intervals for 2002-present). Bottom temperature is then averaged over the on-bottom portion of the trawl to produce a single value per station per year (see Buckley et al., 2009;Lauth et al., 2011, for details of data collection and postprocessing). Here, we use bottom temperature measurements to verify that our model properly captures the characteristics of the Bering Sea cold pool. The cold pool is quantified by indices that represent the fraction of the survey area with bottom water less than 0, 1, or 2 • C water. In the model, we define the Eastern Bering Sea shelf as all grid cells falling within the Eastern Bering Sea stratified sampling regions 10-62 (Fig. 2). We calculate the model cold pool index using two methods. First, we calculate the index value on 1 July of each year; choosing a fixed date allows us to compare a consistent summer snapshot of the cold pool from year to year. The second index replicates the sampling scheme used in the groundfish survey, choosing bottom temperatures from the closest grid cell and time slice to each observation point; this index allows a better comparison to the observations, given the temporal spread in the observations between the first (typically southeast) and last (northwest) sampled station. For comparison, we also analyze bottom temperature extracted from the Climate Forecast System ocean model; this lower-resolution climate model is coupled to the same atmospheric forcings we use in our hindcast simulation for this time period and therefore allows comparison between the original climate model and our dynamically downscaled representation.

Inner, middle, and outer shelf domains
The wide, shallow Eastern Bering Sea shelf is divided into three domains (Fig. 1), each characterized by distinctive patterns of stratification and mixing (Coachman, 1986;Kachel et al., 2002;Stabeno et al., 2012). The inner domain stretches from the coast to approximately the 50 m isobath and is well mixed year round by both tidal and wind energy. The middle domain reaches from the 50-100 m isobath; this domain is well mixed during winter months but thermally stratified during the spring and summer, with a tidally mixed bottom layer isolated from the surface waters. The outer shelf domain, reaching from the 100 m isobath to the shelf break (approximately the 200 m isobath), more closely reflects the seasonal stratification of an oceanic system, with a tidally mixed bottom layer that is less sharply separated from the surface layer than in the middle domain (Coachman and Charnell, 1978). Tidal mixing dominates the energy across the entire shelf, with a very small net transport northward (Coachman, 1986).
We estimate stratification in the model by calculating the potential energy required to mix the water column (SI, in units of J m −2 ), after Simpson et al. (1977): where ρ is density, h is the depth of the water column, ζ is the height of the free surface, g is gravitational acceleration, and z is depth relative to mean surface height (i.e., ζ = 0).
We also calculated the location of the structural front separating the well-mixed inner domain from the thermally stratified middle domain, defining the front as the 0.5 • C m −1 contour in maximum vertical temperature gradient (after Schumacher et al., 1979;Kachel et al., 2002). We apply this calculation to the years 2000-2010 in the hindcast simulation; this period spans multi-year cold and warm periods, and therefore encompasses a good deal of the variability one might expect to see in this property.

Spatial and temporal patterns in primary production
The physical geometry of the Bering Sea, along with the seasonal presence of sea ice, leads to a diverse set of controls on primary production, varying in both space and time. For validation purposes, we focused on a few features that best reflect this complex balance of macronutrient, micronutrient, light, and temperature control of bottom-up processes in this ecosystem.
The highest sustained productivity in the Bering Sea is seen near the edge of the shelf break. This region, referred to as the "Green Belt", coincides spatially with both the shelf break and the Bering Slope Current that carries water northward along the shelf slope (Springer et al., 1996). The shelf-break front and eddies drive high primary productivity both by supplying macronutrients (i.e., nitrate) from the deep basin and micronutrients (i.e., iron) from the shelf and by physically entraining phytoplankton (Okkonen et al., 2004). The Pribilof Islands provide an additional source of dissolved iron to the Green Belt region (Aguilar-Islas et al., 2007). Variability in mesoscale eddies in the Bering Slope Current is a primary driver of productivity variability in the Green Belt (Okkonen et al., 2004;Mizobata and Saitoh, 2004). The initial spring bloom here is dominated by diatoms but transitions to smaller species in the late summer (Springer et al., 1996).
Another hotspot of production in this region is the Pribilof Islands. Their location disrupts flow along the 100 m isobath, which leads to enhanced tidal mixing and introduces nutrients from the deep basin to this area, leading to high summer productivity. Productivity can be lower during years when mixing is decreased due to salinity-related frontal structures propagating from the inner shelf (Stabeno et al., 2008).
While the Bering Sea is generally characterized as being very productive, this production is almost entirely driven by the on-shelf regions and the Green Belt. The deep basin, in contrast, is a high-nutrient, low-chlorophyll (HNLC) region, with low iron levels limiting primary productivity year round (Aguilar-Islas et al., 2007;Suzuki et al., 2002;Leblanc et al., 2005).
On the wide eastern shelf, primary productivity is mainly controlled by the balance of stratification-induced changes in light and nitrogen limitation. An initial spring bloom, dominated by diatoms, rapidly depletes the surface macronutri-604 K. Kearney et al.: A coupled pelagic-benthic-sympagic biogeochemical model for the Bering Sea ents along much of the shelf (Sambrotto et al., 1986). In the marginal ice zone, ice-edge blooms can occur, accounting for a large fraction of the total spring bloom (Niebauer et al., 1995). This ice-edge bloom occurs during years where ice lingers later over the shelf, protecting the underlying water from wind mixing and setting up stronger stratification; earlier-melting ice leads to more wind mixing and a later spring bloom. Variations in spring bloom timing, its correlation or lack thereof with ice melt date, and the impact of this timing on community composition and energy transfer to higher trophic levels form the backbone of most prevailing theories of ecosystem function in the southeastern Bering Sea (Hunt et al., 2010;Sigler et al., 2016). Later summer and fall productivity can be driven by wind mixing events that mix nutrient-rich bottom water into the surface layer. Measurements along the middle and outer shelf regions indicate that nitrate drawdown accounts for 30 %-50 % of observed carbon uptake, with the remaining 50 %-70 % driven by ammonium (Sambrotto et al., 1986;Whitledge et al., 1986).
To check for these patterns in primary production, we performed a visual comparison of modeled phytoplankton biomass patterns with satellite chlorophyll estimates. The satellite chlorophyll values used were climatological monthly averages from MODIS Aqua Ocean Color Index (OCI) algorithm 4 km resolution product, spanning the period of July 2002-July 2015 (NASA Ocean Biology Processing Group, 2017). We compared these chlorophyll patterns to optically weighted, depth-integrated phytoplankton chlorophyll from the model over the same time period, assuming an attenuation length scale of 45 m. While chlorophyll is not a direct measurement of biomass, particularly when derived from satellite color in high-latitude locations, and our modelderived estimate of satellite-visible chlorophyll is a rough one, it is in this case sufficient to allow large-scale comparison of general spatial patterns in biomass between the various biophysical regions of the Bering Sea (e.g., basin versus shelf, north versus south).
We also looked at chlorophyll data measured via fluorometer at the long-term biophysical mooring at station M2 (56.87 • N, 164.05 • W) (Stabeno et al., 2012). This mooring provides a more detailed look at both surface and subsurface chlorophyll over several years, including during times of ice cover.

Plankton community composition
To evaluate plankton community composition, we focused on a few patterns of relative biomass seen in the Bering Sea. The spring bloom is typically dominated by diatoms, with small phytoplankton numbers increasing in summer and fall (Springer et al., 1996). Microzooplankton, consisting primarily of protists such as ciliates and dinoflagellates, are the primary grazers on both large and small phytoplankton size classes; measurements of the biomass of microzooplankton vary on order of 10-100 mg C m −3 (Olson and Strom, 2002;Howell-Kübler et al., 1996). The mesozooplankton community is dominated by larger zooplankton. Though numerically abundant, the small copepod species typically compose less than 10 % of the zooplankton biomass (Vidal and Smith, 1986). Amongst the larger zooplankton groups, the dominant species vary on and off the shelf. The offshore community is dominated by oceanic copepod and euphausiid species, such as Neocalanus spp. and Thysanoessa inermis (NCaO and EupO in the model, respectively), while on the shelf region Calanus marshallae and Thysanoessa raschii (NCaS and EupS, respectively) compose the majority of the mesozooplankton population. Biomass measurements for these larger mesozooplankton groups vary on the order of 1-10 mg C m −3 Hunt et al., 2016). The offshore and onshore euphausiid groups are distinguished by differing survival strategies. The shelf-edge T. inermis build up high lipid stores used for overwinter survival and spawn early in the spring, without the need to feed on the spring bloom for spawning, while the shelf-dwelling T. raschii rely more heavily on detrital feeding during the winter and spawn later, after feeding on the spring bloom (Hunt et al., 2016, and references therein).

Sea ice
The improved sea ice model in this version of the Bering10K model demonstrates high skill in reproducing the advance of sea ice across the domain and in capturing the interannual variability of the location of maximum ice extent (Fig. 3). Over the entire 1980-2018 time series covered by satellite observations, the ice-edge location along 170 • W shows a small southerly bias of 0.3 • latitude (33.4 km) compared to the location measured via satellite; this is improved from the previous 0.56 • (62.3 km) southerly bias seen in the Hermann et al. (2016) version of the ice model. However, despite improvements compared to previous versions of the model, ice retreat still lags observations by approximately 2 weeks in the early spring.

The cold pool
The Bering10K bottom temperature values clearly reproduce both spatial and temporal variability in the location and extent of the cold pool (Fig. 4). During low-ice years, the cold pool is primarily located in the northwest portion of the eastern shelf, while in colder, high-ice years it extends throughout much of the middle domain and into Bristol Bay.
For a more quantitative assessment of skill, we looked at the correlation between annual time series of mean bottom temperature and cold pool index values in the groundfish survey dataset versus the models (Table 1, Fig. 5). The Bering10K model values correlate very strongly with the observed values; correlation is highest when calculating the cold pool extent as defined by the 0 • C threshold (R 2 = 0.940), and R 2 values remain above 0.87 for all other metrics. To summarize model skill, we use a model efficiency metric after Stow et al. (2009), where a model efficiency value greater than 0 indicates more skill than a simple average of the observed time series, and a value of one indicates a perfect match to the observed time series. The model efficiency for the Bering10K model is high across all metrics, ranging from 0.714 to 0.8. This is in sharp contrast to the mean bottom temperature and cold pool index metrics estimated by the coarser-resolution Climate Forecast System model. The cold pool produced by the CFS model lacks the detailed structure of colder waters seen in the observations and fails to simulate bottom waters below the 1 • C threshold. The model efficiency metric (MEF) suggests that the CFS model has much less skill in predicting mean interannual bottom temperature (MEF of 0.406), with only a marginal ability to capture the 2 • C cold pool and no skill at all with respect to waters less than 1 • C. This indicates that the dynamic downscaling offered by the higher-resolution Bering10K model is a necessary component in reproducing this feature.

Inner, middle, and outer shelf domains
Before analyzing whether the model properly reproduces the variations in vertical structure and mixing in the three shelf domains, it is important to note that the location of isobaths in our model are offset slightly compared to the real shelf bathymetry. Sigma-coordinate models such as the ROMS model used in this study are subject to computational errors in the horizontal pressure gradient along regions where topography is steep or the vertical gradient in a property is large (Shchepetkin, 2003); this issue often necessitates applying a smoothing filter to the bathymetry (Danielson et al., 2011). As a result, our modeled outer domain, as defined purely by location of isobaths, is generally narrower than that seen in any observations (Fig. 6), particularly in the vicinity of the Pribilof Islands.
The simulated patterns in vertical stratification follow those expected across the three domains (Fig. 7). During the winter, the majority of the shelf is well mixed vertically. Stratification first appears in early spring along the outer domain and by summer is also seen throughout the middle domain. Given the relatively coarse vertical resolution of our model, the distinction in the vertical structure between the middle and outer domains is not well resolved. However, the model does reproduce the structural front, also known as a tidal front, expected between the unstratified inner domain and thermally stratified middle domain during the summer months. The exact location of this front varies both seasonally and from one year to the next but is generally located just inside the 50 m isobath (Fig. 8). The front location is relatively consistent across the southern shelf region, though possibly further inshore than would be predicted by the 50 m isobath near Cape Newenham at the north end of Bristol Bay; it moves further offshore and its location becomes more tem-  porally variable further north. The clear structural front begins to disappear north of Nunivak Island. Stratification in the northern Bering Sea and Norton Sound area is much more strongly influenced by salinity, especially near the large outflows of the Yukon River, and in this region the clear demarcation between inner, middle, and outer domains disappears.
Horizontal movement in the model, as expected, is dominated by tidal frequencies across the shelf domain, with low annually averaged net velocities. There is a small net counterclockwise flow along the southern edge of the eastern shelf and then northward within the inner domain, with a small net transport from off-shelf to inner shelf waters ( Fig. 9). Water entering the Bering Sea from the Gulf of Alaska through Unimak Pass moves alongside and onto the eastern shelf and travels northward; it takes approximately 7-8 months to reach the northern shelf region (i.e., 60 • N) along the 100 m isobath, in line with drifter-derived measurement of this flow . Further north, modeled velocities are slightly slower than those seen in the observations, with water taking approximately 13-15 months to reach the Bering Strait from the Unimak Pass in the model compared to 9-13 months in the observations; this may reflect a weak Anadyr Current in this region or alternatively be the result of missing flow from off-shelf through submarine canyons that are not well resolved by the modeled bathymetry. Overall, flow within the modeled Bering Sea reproduces the important circulation patterns within this region. However, cyclical circulation patterns seen near the

Spatial and temporal patterns in primary production
Satellite ocean color measurements suggest that phytoplankton blooms in the Bering Sea first reach observable levels of chlorophyll in late February to early March, primarily on the eastern shelf in regions recently vacated of ice ( Fig. 10). As light levels and temperature increase throughout the domain in summer, chlorophyll levels increase both on the shelf and along the shelf slope but remain low over the western side of the basin, where iron is limiting. The bloom peaks in late May to early June, then steadily decreases through September. A late fall bloom, smaller in magnitude than the earlier spring bloom, can be seen in October along the eastern shelf. While the BESTNPZ model produces annual cycles of primary productivity of approximately the correct magnitude compared to these observations, it does not capture many of the nuances in spatiotemporal variability (Fig. 10). In early spring, the model does not appear to capture the early iceassociated growth along the eastern shelf. Within the pelagic ecosystem, low growth rates governed by low temperaturemediated maximum production rates coupled with strong light limitation prevent any significant accumulation of phy-toplankton. Measurements at the M2 mooring location suggest that peak spring bloom date varies widely, from mid-April to early June (Sigler et al., 2014); in the model, peak bloom timing is constricted to a much narrower window from early to late May. While concentrations within the ice algae state variable can reach approximately 70 mg m −3 (monthly climatological average) within the thin skeletal ice layer, this biomass does not contribute significantly to the pelagic large phytoplankton concentration once ice melts due to dilution coupled with unfavorable growth conditions in the underlying water. We do not include these ice algal numbers in the optically weighted chlorophyll numbers used in our satellite comparison because these satellite measurements do not typically capture the spectral signals of ice algal pigments (Wang et al., 2018).
The spring bloom in the model begins once light and temperature levels increase in April. The first stages of the bloom resemble observations, with concentrations highest along the shelf slope and along the western shelf. However, rather than producing a short-lived bloom that drops once macronutrients are exhausted, the model allows for a sustained summer bloom. This bloom is driven by regenerated production; ammonium is produced from phytoplankton and zooplankton respiration, as well as quick remineralization of egested detritus, especially of the slow-sinking detritus group fed by small phytoplankton non-predatory mortality and microzoo- Note that fronts over water deeper than 200 m may reflect artifacts of the coarsening vertical resolution rather than true changes in vertical gradients. plankton egestion and non-predatory mortality. This pattern is seen both on the eastern shelf and throughout much of the deep basin, where iron does not appear to play its expected role in limiting primary production. In the basin, production levels fall off as macronutrients are exhausted in early July; on the eastern shelf, however, high fluxes of ammonium from the benthos drive sustained production throughout the summer and into early fall. In late fall, modeled chlorophyll levels appear more similar to the satellite patterns, with production primarily limited to the middle domain of the eastern shelf.
The late bias in the spring phytoplankton bloom is also evident when comparing model output to mooring measurements at the M2 mooring (Fig. 11). The model captures the predominant bloom characteristics: the bloom begins with a large, diatom-dominated bloom starting in the near-surface waters and then migrating deeper as surface nutrients are depleted, then decreases during the summer months, with some subsurface chlorophyll remaining at the bottom of the mixed layer. However, the modeled bloom begins in mid-April, on average, later than the mid-March to early April start seen in the mooring data. Fall blooms in September and October are spurred by increased mixing and are short and localized in the observations; the modeled fall bloom matches the timing in the observations well.

Plankton community composition
The phytoplankton community composition in the model reflects the expected balance between the small and large functional groups (Fig. 12). Throughout the majority of the domain, the spring bloom is heavily dominated by the large phytoplankton group. The one exception to this is along the shallow, well-mixed inner domain, where low macronutrient levels favor the small phytoplankton group for most of the year, with a small contribution of large phytoplankton in early summer when the bloom begins there. In the inner part of the middle shelf, the large phytoplankton levels decrease following the initial bloom but small phytoplankton biomass continues to rise through the fall. Moving further outward along the shelf, late summer and early fall biomass is low across both functional groups.
Within the zooplankton community, we see little variation between the relative dominance of the functional groups across the shelf transect (indicated by gold circles and labels in Fig. 1). In all locations, microzooplankton are the dominant group. However, their biomass is often only slightly higher than that of the summed mesozooplankton groups. Within the mesozooplankton groups, very little variation in their relative contribution to the biomass pool is seen either spatially or temporally. The only big change to zooplankton community coincides with the hardcoded diapause movement of the two large copepod groups; because these groups cease grazing when they enter diapause, their populations quickly drop during the diapause period. Offshore large copepods (NCaO) die if they encounter the ocean floor prior to reaching their prescribed overwintering depth of 400 m, and this effectively keeps this functional group constrained to deep-water locations. Lacking any similar depth-based restrictions on their process rates, the remaining large zooplankton groups can be found throughout the domain, regardless of whether they are designated as onshore or offshore in name.
The model does capture a gradient in timing of the zooplankton population, with offshore populations being established early in the spring, while on-shelf populations do not appear until early summer. However, limited observations suggest that early spring offshore zooplankton increases precede the spring phytoplankton bloom offshore Harvey et al., 2012). This timing difference is not captured by the model; instead, the overwintering population in the model is reduced to negligible amounts and only begins to increase again once the primary productivity in the region reaches sufficient levels.

Discussion
The Bering10K model correctly reproduces a variety of physical processes known to influence primary production in the Bering Sea. Overall patterns of sea ice cover, including interannual variations in the maximum extent of sea ice and the date of ice retreat, are well captured by the model. The exact date of sea ice retreat tends to lag observations by approximately 2 weeks. This lag could, theoretically, lead to subsequent problems in the timing of phytoplankton blooms; in the model's current state, ice algae and ice-associated blooms are poorly resolved regardless of ice melt accuracy or lack thereof, but improving the ice retreat timing should remain a focus of future improvements to the physical model.
The location and extent of the cold pool in the Bering Sea is often used as an index of biophysical variability across the Bering Sea shelf (e.g., Siddon and Zador, 2018), and therefore capturing both spatial and temporal variation in cold pool extent is key for a model to be useful in this region. The Bering10K model performs well on this point, with very high correlation to observations and very small biases. It is encouraging to note that the location of the simulated cold pool in 2016 is offset to the northwest compared to other warm years, replicating the position seen in the observations. During this year, anomalously warm water from the northeast Pacific (Bond et al., 2015) prevented the cold pool from extending as far southeast as it typically does during the summer months. The ability of the model to capture these anomalous conditions lends promise to its capability to simulate novel conditions that may arise when simulating future conditions.
The Bering10K model shows good replication of both cross-shelf and along-shelf differences in stratification. In the south, we see a distinct well-mixed inner domain, with a sharp transition to a stratified middle/outer domain occurring near the 50 m isobath during summer. The model's distinction between the middle and outer domains is less defined than in observations, likely due to the limited 30 vertical layers used in the model combined with bathymetric smoothing. In the northern portions of the eastern shelf, these thermal stratification domains disappear. Salinity is more variable in the north than in the south, driven by both sea ice formation and melt, as well as the large freshwater contribution of the Yukon River. As a result, stratification in the north is driven more strongly by salinity than in the south.
While the physical dynamics of the model perform well within the Bering Sea itself, the modeling domain south of the Aleutian Islands should be treated with caution, as it is close to the model's open boundary and artifacts associated with boundary conditions are expected.
Despite its strong skill in replicating the underlying physical features of the Bering Sea that are thought to influence  primary production, the Bering10K-BESTNPZ model has limited skills in reproducing observed spatial and temporal patterns of primary production. The degraded performance in the biology realm is due to several interacting deficiencies in model process equations and parameterizations.
Throughout the deep basin, according to observations, iron levels should be low and limit primary production. The model includes only a simple representation of iron, using continuous relaxation to an empirical depth profile. While the low surface concentration prescribed in our model's basin is consistent with observations, this mechanism of replenishment through relaxation is not one that reflects the true complexity of iron cycling in the ocean. In their observations of phytoplankton growth rates in the Green Belt along the slope, Aguilar-Islas et al. (2007) noted that even in this highly productive location, the diatoms showed signs of iron stress, and dissolved iron levels remained low here compared to the shelf. They hypothesized that production in this region was maintained at its observed level due to a small but per-sistent source of iron being mixed from deep water along the shelf break, rather than a large iron source that fully alleviated iron limitation. The climatological nudging used in our model provides exactly this -a small but persistent source of continuous dissolved iron -throughout the domain, rather than only along the shelf slope. In order to properly capture the HNLC characteristics of the deep basin, a more mechanistic model for iron, with an explicit source near the sediments only rather than throughout the water column, is likely necessary.
Across the eastern shelf, in terms of mean seasonal cycles, modeled phytoplankton biomass and primary production levels are more in line with observations and reflect the dominant seasonal pattern observed on the southeast middle shelf of a strong spring bloom, followed by low summer biomass, then a smaller late fall bloom. However, many of the prevailing hypotheses of energy flow in the Bering Sea focus not on the mean state of the phytoplankton bloom but rather on interannual variability, particularly related to the interplay between temperature, stratification, and nutrient availability during the initial stages of the spring bloom (Stabeno et al., 2001;Coyle et al., 2008;Hunt et al., 2010). In particular, the timing of the spring bloom, and its correlation or lack thereof with ice retreat timing, forms the basis for many theories of energy transfer within the eastern Bering Sea shelf ecosystem. Given the key role that phenological variability plays in the predominant theories of energy transfer, shortcomings in the model's ability to capture the processes leading to such variability raise concerns about its potential ability to predict either current or future changes in primary and secondary production.
The first issue is that the spring bloom begins nearly a month later than it should. This is particularly apparent in the north (Fig. 10), where observations indicate that phytoplankton blooms should occur both on and under the sea ice. In our model, ice algae biomass is insignificant compared to pelagic phytoplankton biomass, and pelagic production is too strongly limited by both temperature and light for any significant growth to begin during March or April as it should. Cur-rently, the lack of early spring growth also leads to a failure of the model to react to interannual differences in ice extent and retreat timing.
The limit on ice algae biomass in our model is primarily a limitation of the conceptual framework imposed on this state variable. By assuming that the ice algae are confined within the very thin skeletal ice layer, while at the same time allowing a continuous convective exchange (see Sect. A3.8) between this layer and the surface layer of the water column, modeled ice algae can never grow to much higher concentrations than would be found in an equivalent thin layer of water. This framework does not account for other aggregation types often seen in ice algal communities, such as nets or strands on the underside of the ice surface (Ambrose et al., 2005). Once the ice melts and the ice algae are released into the water, the model framework immediately transfers this pool of biomass to the large phytoplankton group, where it is then subject to the same controls on growth rate as the pelagic-originating phytoplankton.
For pelagic phytoplankton groups, the inability to capture early ice-associated blooms is primarily an inadequacy of the equations and parameters chosen to represent photosynthetic processes. The parameters that define each group's photosynthesis-irradiance curve, as well as those setting the maximum potential light-and nutrient-replete growth rates (a function of temperature), originated from a comparable model for the Gulf of Alaska (Hinckley et al., 2009). However, the phytoplankton community of the Bering Sea includes many Arctic species that are physiologically adapted to grow in both lower temperatures and under a wider range of light levels experienced near and under the sea ice. For example, ice-associated blooms occur in very thin upper layers of the water column left behind by ice melt; these layers are typically only 1-2 m thick and characterized by temperatures of around −1.7 • C (Hunt et al., 2010). At this temperature, the model equations require approximately 2 W m −2 surface irradiance to balance respiration and non-predatory mortality costs, even in the absence of any nutrient limitation or grazing losses. But modeled under-ice surface irradiance typically remains below this level when ice of more than approximately 0.5 m thickness is present. Therefore, the chosen set of equations do not appear appropriate to reproduce the dynamics of under-ice and ice-edge blooms. We also note that while the parameters and equations controlling the maximum potential growth rate of phytoplankton (Eq. A16) produce reasonable rates within the temperature ranges seen in the hindcast period in this geographical domain, they increase exponentially above this range, well outpacing respiration rate increases; a temperature increase of 5-10 • C would push these rates well beyond the physiological limits of phytoplankton division rates.
Another key problem seen for phytoplankton across the shelf is the absence of any strong macronutrient limitation following the initial spring bloom. Observations indicate that ammonium can reach high quantities (up to 15 mg m −3 ) following the initial bloom due to both phytoplankton decomposition and benthic processes (Whitledge et al., 1986;Aguilar-Islas et al., 2007). However, this ammonium is typically concentrated in the deeper shelf water in the observations, while the model tends to accumulate material in a subsurface layer, with continuous high turnover of ammonium in surface waters of the shelf from spring through fall. This likely indicates that decomposition processes in the model are proceeding too quickly, particularly in the slow-sinking detrital group whose coupled remineralization rates and sinking rates result in this material only reaching about 50 m in depth before being completely converted to ammonium. Small phytoplankton mortality and excretion as well as excretion and egestion by microzooplankton are the primary sources for the slow-sinking detritus group. The model also produces a very robust zooplankton community that persists from the start of the spring bloom until well into the winter months (mid-December), and whose byproducts of respiration, excretion, and egestion feed into this rapid, continuous cycle of regen-erated primary production (see the Supplement). In general, the process equations concerning the sinking and remineralization of organic material, both in the water column and on the seafloor, are very simplistic (see Sect. A3.7 for details). A single timescale for remineralization is used for all detrital groups, with no distinction between the lability of different pools of organic material. There is also no mechanism available to account for aggregation of material, which could lead to faster sinking speeds and lower remineralization rates of organic matter. Because of the broad, shallow nature of the Eastern Bering Sea shelf, remineralization rates play a strong role in determining the concentrations of macronutrients and the strength of benthic-pelagic coupling. These processes should be a focus of future development in this model.
In contrast to the under-resolved detrital pools, the mesozooplankton groups included in the BESTNPZ model appear to be over-resolved in terms of functional differences capable of being differentiated from each other with this type of biomass box model. The prey preferences encoded into the feeding behaviors of each group produce only very small variations in their relative contribution to the overall biomass pool. Instead, the five mesozooplankton groups (small copepods, on-and offshore large copepods, and on-and offshore euphausiids) effectively function as a single herbivorousmicrozooplanktivorous zooplankton functional group. The only deviation from this synchrony is seen in the large copepod groups, whose populations fall when they enter diapause and cease grazing. The structure of the model also leads to plankton dynamics that are constrained primarily by the balance of instantaneous production and loss rates. While appropriate for simulating phytoplankton bloom dynamics, this style of model is not well suited for capturing the dynamics of zooplankton life stages (for example, the necessity of overwinter survival in order to spawn a new generation in the spring) or the nuanced differences between the survival strategies of various species. Mesozooplankton populations drop to a negligible level during the winter due to the absence of sufficient prey to balance continued respiratory and nonpredatory losses, and overwintering success (or lack thereof) has almost no effect on the resulting zooplankton populations in summer. In order to truly capture the gradients in relative success of different copepod and euphausiid groups throughout the domain, a model that better captures winter survival strategies (e.g., a life-stage-resolving model or another means of introducing latency between feeding, respiration, and non-predatory mortality) is likely necessary.
Given the deficiencies identified in this evaluation, future work will comprehensively reevaluate each component of the existing model. More accurate simulation of underice and near-ice phytoplankton blooms may be addressed by allowing seasonal plasticity in the parameters defining the photosynthesis-irradiance curve for each phytoplankton group; when used in a simple nutrients-phytoplanktonzooplankton-detritus (NPZD)-style model, this type of equation has been shown to better capture the magnitude and timing of Bering Sea blooms than constant parameters (Sloughter et al., 2019). For sea ice algae, Tedesco and Vichi (2014) note that models using a fixed-thickness skeletal ice layer tend to underestimate production in first-year ice; they suggest that varying the width of the sea ice layer in which algae is found as a function of sea ice permeability can help overcome this issue with minimal additional model complexity required. Issues related to excessive regenerated production on the eastern shelf may be addressed by more closely examining the detrital functional groups within the model and the remineralization timescales associated with each; the use of a single remineralization timescale for all detrital groups is out of step with most modern biogeochemical models (e.g., Moore et al., 2002;Aumont and Bopp, 2006;Dunne et al., 2012), and allowing for parameters that reflect the varying lability of different detrital pools may better capture the nutrient dynamics both on and off the shelf. Improving the EBS nutrient budget may also require a more complex representation of the benthic component of the ecosystem; the benthic module from a mature shelf model such as ERSEM (Butenschön et al., 2016) may offer a blueprint for future development related to benthic functional groups. Finally, we intend to reconsider the number of functional groups used to represent the planktonic consumers within this ecosystem. Banas et al. (2016) demonstrated that a much simpler six-box model was capable of capturing spring bloom dynamics representative of the M2 mooring location. However, Friedrichs et al. (2007) cautioned that though simple models are typically able to be tuned to better simulate the ecosystem dynamics of a single location, their portability is limited compared to their more complex counterparts. Given the rapidly changing conditions in the Bering Sea, and the wide range of applications for which this model was designed (ranging from hindcast-based process studies to longterm climate change forecasts), we must carefully consider the tradeoffs of parsimony versus complexity.

Conclusions
Overall, the BESTNPZ model coupled to the Bering10K regional ocean model demonstrates considerable skill in replicating observed horizontal and vertical patterns of water movement, mixing, and stratification, as well as the temperature and salinity signatures of various water masses throughout the Bering Sea. However, its ability to replicate largescale patterns in nutrient cycling, primary production, and zooplankton community composition, particularly with respect to the interannual variations that are important in a fisheries management context, is limited. In its current form, the Bering10K model can offer key insights into the physical processes that may affect higher-trophic-level species directly. In particular, it offers a useful supplement to examine physical features in areas and at times of the year that are difficult or impossible to survey due to sea ice cover or harsh weather. However, we caution that the use of the biological state variable output should be limited until the model is better able to capture observed characteristics of the Bering Sea phytoplankton and zooplankton communities.

A1 Summary and notation
This section provides a mathematical overview of processes through which biological state variables exchange material with each other in the BESTNPZ model.
The BESTNPZ model assumes a model geometry that includes N water column layers, a single benthic layer of unspecified depth, and a skeletal ice layer with a constant thickness h s ice . The skeletal ice layer refers to the base of an ice sheet; this very thin layer is characterized by a looser crystal structure than the more solid ice overlying it and is the site of the most rapid algal growth in ice (Arrigo et al., 1993;Jin et al., 2006). Within this geometry, BESTNPZ tracks the concentration of 19 biological state variables (Table A1, Fig. A1).
Exchange of material between these state variables, and across vertical layers within a single state variable, results from a variety of processes. In the code, and in this description, these processes are divided into three types.
The first type of process, described in Sect. A2, includes redistribution of state variables due to movement of the water or ice in which they reside. The majority of these calculations (e.g., advection and diffusion of water and ice) take place outside the biological module and follow the default ROMS behavior for biological tracer variables. The one exception is the exchange of NO 3 and IceNO3, NH 4 and IceNH4, and PhL and IcePhL due to the formation or loss of ice in a grid cell.
The second process type we term source-minus-sink processes (Sect. A3); these processes take place within a single depth layer and involve transfer of biomass from one state variable to another. For notation, each source-minussink flux process is represented in this document as a function of the source and sink state variables, respectively. For example, Abc(X, Y ) is the flux rate of material from group X to group Y via the Abc process.
The final category (Sect. A4) is vertical movement, where the concentration of state variables is redistributed within the water column due to the sinking or rising movement of the state variables within the water (note that this is separate from vertical advection of tracers due to movement of the water itself).
The three types of processes are calculated sequentially in the code, such that changes due to ice loss and formation are calculated first, followed by the total rate of change due to source-minus-sink processes, and finally redistribution due to vertical movement.
Several of the equations in this section rely on state or diagnostic variables that come from the physical model or from the ROMS grid geometry. See Table A2 for a description of these variables and their notation in this document. Addi-tional parameters derived from biological input parameters are listed in Table A3.
To express the concentration of a biological state variable X, we use the [X] notation, with units corresponding to those in Table A1. All intermediate fluxes are expressed in terms of mg C for simplicity. Conversions between units assume constant stoichiometry for all living and detrital groups.
Because many of the variable names used in these equations involve multi-letter and mixed-case notations, we have chosen to use dot notation for all instances of multiplication in this document. Please note that this indicates simple element-by-element multiplication in the BESTNPZ code, not a dot product.

A2 Ice formation and loss
Although the primary ROMS sea ice model tracks ice presence in terms of fraction grid cell coverage, the BESTNPZ biological module uses a simpler scheme where ice presence is treated as a binary condition. When the ice thickness (h ice ) in a grid cell is greater than the prescribed thickness of the skeletal ice layer (h s ice ) and the grid cell has at least 50 % ice cover (as indicated by the a ice variable), we assume the entirety of that grid cell now supports the ice-related biological processes in a thin layer of skeletal ice covering the entire grid cell and located just above the free surface of that grid cell.
If sea ice appears in a grid cell between the previous time step and the current time step, the large phytoplankton, nitrate, and ammonium in the top layer are redistributed such that the top water column layer and the skeletal ice layer receive equal concentrations of each tracer by volume.
Likewise, when ice disappears between the previous step and the current one, all material in the skeletal ice layer is moved to the top layer of the water. [

A3 Source-minus-sink processes
Unless otherwise specified, all processes detailed in this section are specific to a single layer. We have used the subscript k when defining each layer-specific flux rate; k can be either the index of a water column layer or "sice" to indicate the skeletal ice layer. This notation distinguishes between rates that are specific to a single layer and use volumetric units (mg C m −3 d −1 ) and those that exchange material between layers and are expressed in total flux across a boundary (mg C m −2 d −1 ). To avoid clutter, we have chosen not to apply these subscripts to any remaining layer-dependent variables (such as state variable concentrations), but these are to be assumed for all pelagic and ice layer components.

A3.1 Light attenuation in water
The model assumes that radiation is attenuated with depth as follows: where f PAR is the fraction of surface light that is photosynthetically available, I 0 is the surface irradiance, and K PAR is the light attenuation coefficient for photosynthetically active radiation (i.e., 400-700 nm). Incoming radiation is supplied by the physical model and converted to photon flux. The attenuation coefficient is itself the sum of attenuation from clear water, chlorophyll, and other sediment and organic material: The first two terms in Eq. (A14) are derived from Morel et al.'s (1988) analysis of light attenuation in Case I waters. The final terms (i.e., the K D portion plus K C constant) add additional attenuation based on the depth of the water column; this approximates the assumption that sediment and organic material is higher near the coastline than in open water. The power law formula was chosen based on a fit to satellitederived inherent optical properties across the Bering Sea domain (see Appendix B for further details).
The K PAR parameter is also used to calculate shortwave radiation decay in the physical model. By default, ROMS uses the equations of Paulson and , which consider the differing attenuation length scales for blue-green light versus shorter-and longer-length wavelengths that are primarily attenuated in the upper 5-10 m of the water column. When coupling to the BESTNPZ model, we modify this equation to substitute our custom PAR attenuation length scale for the blue-green portion of the spectrum: The values of a frac and a µ 1 correspond to R and ζ 2 in Paulson and , with our K PAR replacing Paulson et al.'s (1977) ζ 1 ; we use the parameter values for Case I waters.

A3.2 Gross primary production
Primary production for both small and large phytoplankton is governed by the same set of equations. The maximum photosynthetic growth rate per unit chlorophyll (mg C (mg Chl a) −1 d −1 ) is a function of temperature and defined in terms of each group's doubling rate D i and doubling rate exponent D p (Frost, 1987). The maximum uptake rate is calculated in both carbon-specific and chlorophyll-specific units: This rate is moderated by light and nutrient limitation. Light limitation uses a hyperbolic tangent function, after Jassby and Platt (1976): Nutrient limitation is based on the availability of nitrate, iron, and ammonium. Nitrate and ammonium limitation terms follow Frost and Franzen (1992), with nitrate uptake inhibited by ammonium when the latter is high relative to its halfsaturation parameter: Iron limitation follows a similar Michaelis-Menton form but with an additional term to enforce saturation at a critical threshold value: Nitrate uptake is controlled by the minimum limitation factor between light, nitrate, and iron, while ammonium uptake is limited by light or ammonium: Gpp(NO 3 , X) k = P max · [X] · min Lim NO 3 , Lim Fe , Lim I Primary production also occurs in the ice layer when ice is present. In the ice layer, production is a function of light, nutrient limitation, brine salinity, and temperature, following Jin et al. (2006). Light limitation uses the following photosynthesis-irradiance curve; unlike the pelagic production, this one includes strong photoinhibition at higher light levels: where I 0 ·f PAR c I is the photosynthetically active radiation converted to W m −2 .
As in the water column, nitrate limitation uses Michaelis-Menten uptake dynamics with ammonium inhibition (with f r denoting the f ratio between new (nitrate) and regenerated (ammonium) production): Brine salinity (S b ) is not tracked explicitly by the ice model, so instead it is estimated based on a piecewise polynomial fit to ice temperature (T i , tracked by the ice model), following Arrigo et al. (1993): The effect of salinity on ice algae growth rate is also a polynomial fit (Arrigo and Sullivan, 1992): When running in climatological ice mode (CLIM_ICE_1D), where no explicit ice temperature is modeled, ξ s b = 1.0.
The final primary production calculation is then In this case, T k=N is the temperature of the top water layer, used to approximate the temperature of the ice itself.

A3.3 Grazing and predation
Pelagic grazing and predation fluxes are a function of a grazer's or predator's maximum ingestion rate (e Y ), its total prey availability, prey-specific feeding preferences (fp XY ; see Table A7), and the water temperature, using the multiple resource Holling type 3 functional response of Ryabchenko et al. (1997): where Y refers to the predator group, X is a specific prey group, and Z refers to the set of all prey groups of that predator. Note that some of the pelagic groups can graze on ice algae; when preyed upon, the ice algae concentration is adjusted as though it were located in the surface layer of the water: The maximum ingestion rates, e Y , are constant for all groups except large-bodied copepods (NCaS and NCaO). These groups can be parameterized to perform seasonal diapause, and during periods of downward migration, their ingestion rates are dropped to e Y = 0 d −1 . See Sect. A4 for a description of the diapause time-of-year calculation. Benthic processes in BESTNPZ are based on a greatly simplified version of the European Regional Seas Ecosystem model (ERSEM) (Ebenhöh et al., 1995). Benthic infauna graze on pelagic detritus and phytoplankton located within a certain distance of the bottom (currently hardcoded to dw = 1.0 m). The feeding fluxes are defined as follows: As in the pelagic grazing equation, X refers to a single pelagic prey group (PhS, PhL, Det, or DetF), and Z refers to the full set of these four groups. A weight factor, w k,X , is calculated to distribute these losses proportionately across the water column (see Sect. A3.9): where z lo,k and z hi,k are the lower and upper depth limits of layer k. Benthic infauna also graze on benthic detritus, following the same equation but with different parameters for prey threshold and half-saturation values: Here, X refers to a single prey group, DetBen. Note that the water column grazing fluxes are in units of mg C m −3 d −1 , while the benthic feeding fluxes are in mg C m −2 d −1 .

A3.4 Egestion and excretion
Egestion fluxes associated with grazing and predation in the water column are a simple fraction of total prey eaten: Egestion fluxes from the microzooplankton group (MZL) go to the slow-sinking detrital pool (Det); all other egestion fluxes go to the fast-sinking detrital pool (DetF). Infauna egestion and excretion is a bit more complex; it is proportional to the prey eaten, with differing rates for detrital verus phytoplankton prey. The flux is split evenly, with half going to benthic detritus (DetBen) and half to NH 4 . Exc(Ben, DetBen) = 0.5 · ex D · X=det Gra(X, Ben) Exc (Ben, NH 4 ) = 0.5 · ex D · X=det Gra(X, Ben) As with benthic grazing, these benthic excretion fluxes are in units of mg C m −2 d −1 . The flux to NH 4 is assumed to return to the bottom water column layer and is converted to a volumetric flux based on the thickness of that layer (see Sect. A3.9).

A3.5 Respiration
All pelagic producers and consumers except jellyfish respire following the temperature-dependent formulation of Arhonditsis and Brett (2005). The phytoplankton and microzooplankton groups maintain a constant basal metabolic (bm) rate: The larger zooplankton groups substitute a basal metabolic rate that includes a starvation response when prey is scarce: where (The summation relates to the total available prey; see Sect. A3.3 for details.) The large copepod groups (NCaS and NCaO) also include a diapause adjustment, such that their bm rate is reduced to 10 % of the bm parameter value during periods of downward migration. See Sect. A4 for details of the time-of-year calculation for diapause.
Jellyfish respiration also follows a temperature-dependent formula, after Uye and Shimauchi (2005): Infaunal respiration includes terms for both basal metabolism and active metabolism proportional to grazing: Gra(X, Ben) Finally, ice algae respiration uses a metabolic rate linearly proportional to its maximum growth rate, after Jin et al. (2006): As with ice algae production, the surface water temperature is used as a proxy for ice temperature when calculating the temperature component of this rate.

A3.6 Mortality and senescence
Non-predatory mortality losses for phytoplankton groups are formulated as a linear closure term: Microzooplankton losses have the option of following either a linear closure as above (MZLM0LIN flag defined) or a quadratic closure: Note that when switching between the linear and quadratic formulations, the relevant input parameter for the MZL group switches between mMZL and mpredMZL. All larger zooplankton groups use a temperature-mediated quadratic closure term: Non-predatory mortality fluxes from the phytoplankton and microzooplankton groups go to the slow-sinking detritus, while all other non-predatory mortality losses go to the fastsinking detritus. The benthic infauna group includes both a linear and quadratic mortality function: the former to represent senescence and the latter as a predation closure term. Finally, ice algae use a linear mortality rate with a temperature dependence, following Jin et al. (2006): (A52)

A3.7 Remineralization and nitrification
Detrital remineralization is proportional to the temperature and the nitrogen content of detritus, after Kawamiya et al. (2000): The conversion from nitrogen content back to carbon content is done to maintain unit consistency with the other betweengroup fluxes, using the assumption that all living and detrital groups maintain identical C : N : Fe stoichiometry. Both fastand slow-sinking detritus use the same parameters for this process.
Nitrification rate in the water column is also influenced by temperature (Arhonditsis and Brett, 2005): Nitrification in the ice is a simple linear function of ammonium concentration (Jin et al., 2006): As with remineralization, the final nitrification rate values are converted to carbon units simply for bookkeeping purposes; they will be converted back to nitrogen units when used in the final rate-of-change equations (see Sect. A3.9.)

A3.8 Ice interface convective exchange
As an ice sheet grows, dense brine is released from the skeletal ice layer at its base and replaced with seawater; this results in a convective exchange of water and nutrients between the ice and surface water. The BESTNPZ model follows Jin et al. (2006) and sets the rate of this exchange using a polynomial function of ice growth rate: where dH dt is the rate of change of ice thickness (m s −1 ) between the current time step and the previous one. The resulting exchange rate, T wi , is expressed in m d −1 .
The exchange in nutrients then becomes a function of the difference in concentrations in the surface layer of water versus the ice layer. Phytoplankton can be washed out of the skeletal ice layer but not in, so the exchange of ice algae and large phytoplankton assumes a concentration of 0 in the surface water: This equation results in a rate of exchange of material across the boundary (mg C m −2 d −1 ) that can have either a positive value (net movement from ice to water) or a negative value (net movement from water to ice). As in previous sections, the nutrient transport is converted to carbon units here purely for bookkeeping purposes.

A3.9 Total rate of change
The total rate of change for each state variable due to sourceminus-sink processes is calculated as a sum of the rates detailed in the previous sections (Sect. A3.2-A3.8). Recall that in the previous sections, all flux rates taking place within the pelagic water column layers, or within the ice layer, were expressed in volumetric units (mg C m −3 d −1 ), while all processes in the benthos or across the water-ice or water-benthos boundaries were expressed in per-area units (mg C m −2 d −1 ). Terms wrapped in square brackets apply only to the top layer (k = N), while terms wrapped in curly brackets apply only to the bottom layer (k = 1).

A4 Vertical movement and exchanges
All vertical movement in the BESTNPZ model is calculated using a piecewise parabolic method and weighted nonoscillatory scheme, following the sediment settling code from a ROMS sediment model . This scheme allows for fast sinking speeds that may cause material to cross multiple layers, without being constrained by the Courant-Friedrichs-Lewy (CFL) criterion. We have modified this scheme slightly to allow a zero-flux boundary condition to be imposed at a specified depth. We also allow the use of a vertical velocity rather sinking speed; a negative velocity implies sinking, while a positive velocity indicates rising.

A4.1 Sinking of phytoplankton and detritus
Phytoplankton and detrital groups (PhS, PhL, Det, and DetF) are subject to vertical settling, with a constant sinking speed for each state variable.
When material crosses the water-benthic boundary, it is assumed that 20 % of the material becomes biologically unavailable (Walsh et al., 1981;Walsh and McRoy, 1986), and 1 % is lost to denitrification (personal communication with David Shull via Gibson and Spitz, 2011). The remaining 79 % of the flux across the boundary is transferred to the benthic detritus (DetBen). Note that the "denitrification" flux is not tracked explicitly but simply subtracted from the flux reaching benthic detritus; the biomass associated with both burial and denitrification is lost from the system.

A4.2 Copepod diapause
Copepod diapause is included in the BESTNPZ model by imposing seasonal movement through the water column on both large-bodied copepod groups (NCaS and NCaO), coupled with modifications to their feeding and respiration rates during periods of downward movement. The timing of copepod diapause is specified via input parameters for sinking start (S start ), sinking end (S end ), rising start (R start ), and rising end (R end ) day for each group, all specified as days of the year. These four parameters combine to define periods of downward-directed movement (S start ≤ t doy ≤ S end ), upward-directed movement (R start ≤ t doy ≤ R end ), and no directed movement (R end < t doy < S start and S end < t doy < R start ) (during all times, both groups are still subject to passive advection and diffusion).
Earlier versions allowed specification of S start , S end , R start , and R end for the offshore group (NCaO) only, with the onshore group automatically lagged by 30 d behind the offshore group. In the current version of the code, the 30 d lag is assumed when all four parameters are set to 0 for the onshore group (this maintains backward compatibility with older input files that do not include values for the onshore group; BESTNPZ sets missing input parameters to zero by default). Diapause can be turned off for either group by setting all four timing parameters to the same non-zero value.
Onshore copepods migrate to 200 m depth during their diapause period. During downward movement, a zero-flux condition is set at 200 m or at the bottom boundary, whichever is shallower. When migrating upward, a zero-flux condition is applied to the top of the surface layer.
Offshore copepods migrate to 400 m depth for their diapause period. A zero-flux condition is set at 400 m. In shallower waters, any biomass that crosses the bottom boundary is transferred to the benthic detritus (DetBen) group. During upward movement, a zero-flux condition is applied to the top of the surface layer.

A4.3 Euphausiid diel vertical migration
Diel vertical migration is currently implemented for on-shelf euphausiids through the (still experimental) EUPDIEL compilation flag. When defined, sinking and rising velocities are applied such that EupS moves at a hardcoded speed of 100 m d −1 toward a target depth, defined as the shallowest depth layer where photosynthetically active radiation (PAR · I 0 ) is less than 0.5 E m −2 d −1 . This option was turned off during the simulations detailed in this study.

A5 Analytical relaxation of state variables
Iron concentrations throughout the water column are initialized using a vertical profile that prescribes values above 50 m and below 300 m, with a linear interpolation between these two depths. The values of the shallow-and deep-water limits are a function of the bottom depth in a grid cell, with higher values in shallow water and lower values in deep water. The primary source of iron in this region is the sediment, and therefore this gradient between shallow and deep water is intended to capture the iron differences between on-shelf and off-shelf regions.
Iron-related biogeochemical processes are not included in this model. Instead, the iron state variable is continuously nudged towards these prescribed vertical profiles on an annual timescale. The nudging process is implemented using the generic ROMS framework for climatological nudging, with the TNUDG input parameter controlling the strength of the relaxation calculations. Figure A1. Schematic of the BESTNPZ model. Circular nodes represent state variables (gold indicates nutrient, green indicates producer, blue indicates consumer, brown indicates detritus). Edges (lines) represent fluxes between state variables and curve clockwise from source node to sink node. Edge colors indicate process type: green indicates primary production, blue indicates grazing and predation, brown indicates egestion, gold indicates respiration, red indicates remineralization, pink indicates nitrification, orange indicates non-predatory mortality, tan indicates excretion, purple indicates convective exchange, gray indicates sinking to seafloor, and navy indicates freezing/melting of ice.   for the 400-700 nm band assuming a light source of "Sun and sky, daylight" converted from surface heat flux to photon flux assuming seawater density is equal to 1025 kg m −3 , heat capacity is equal to 3985 J kg −1 • C −1 , and absorption wavelengths appropriate to chlorophyll (see c I , above); note that this represents below-ice irradiance when ice is present  Q10CopT 5 NCaS Q10NCaT 5 NCaO Q10NCaT 5 EupS Q10EupT 5 EupO Q10Eup 5 Jel Q10JelTe 10 Ben T0benr 5 L P threshold for benthos grazing mg C m −2 Ben (pelagic food) LupP 1 L D Ben (benthic food) LupD 292 Table A7. Feeding preferences matrix, indicating feeding preferences for each predator (columns) on each prey (rows). www.geosci-model-dev.net/13/597/2020/ Geosci. Model Dev., 13, 597-650, 2020  Table A9. Notation and description for input parameters related to respiration. (See Table A6 for pelagic prey preferences and infauna Q 10 parameters, Table A8 for infauna excretion fraction parameters, and Table A5

Appendix B: Light attenuation and limitation in BESTNPZ
Light attenuation in the Bering10K-BESTNPZ model has undergone a variety of modifications between the Hermann et al. (2013Hermann et al. ( , 2016 publications, and between the Hermann et al. (2016) publication and this one. In this section, we clarify the motivations behind these changes and their effects on both biological and physical state variables in the fully coupled model. Light attenuation formulations are used in a typical, physics-only implementation of ROMS in order to determine the attenuation of shortwave radiation in the water column and distribute surface heat fluxes appropriately. When a biological module is added, any light attenuation calculations necessary for photosynthesis (typically focused only on the photosynthetically active wavelengths) are coded separately, independent of the shortwave attenuation code in the physical core of ROMS, and with no direct feedback of the biology on the physics.
The Hermann et al. (2013) version of BESTNPZ followed this default setup; the attenuation coefficient of photosynthetically active radiation for the biological model followed Gibson and Spitz (2011): where C is the concentration of chlorophyll a (mg Chl a m −3 ), k ext is the clear-water attenuation coefficient, and the k A term is attenuation due to light absorption by chlorophyll; the k ext and k A values, as well as the C-term exponent, are derived from an empirical fit of open-ocean chlorophyll concentrations versus measured attenuation (Morel, 1988). The physical model relied on the ROMS default formula, which applies separate attenuation length scales to bluegreen wavelengths and other wavelengths following Paulson and : The values for a frac (unitless), a µ 1 (m), and a µ 2 (m) are set via an internal lookup table based on one of several water classification types. These water type classes account for the varying chlorophyll, sediment, and CDOM concentrations in different types of environments. Early versions of the Bering10K domain alternated between Class I (a frac = 0.58, a µ 1 = 0.35, a µ 2 = 23.0) and Class III (a frac = 0.78, a µ 1 = 1.4, a µ 2 = 7.9) parameters; the Hermann et al. (2013) study used the Class III option. Note that in this setup, the amount of chlorophyll in the water column assumed by the parameters of Eq. (B3) is independent from the chlorophyll levels modeled by the coupled biological model.
A more harmonious approach to light attenuation was adopted in the Hermann et al. (2016) version of the model. As one of several adjustments aimed at addressing a warm bias in heating of the water column, the attenuation equation in the physical model replaced the a µ 2 attenuation length scale parameter with the PAR-related length scale (K −1 ) from the biological model; this provided direct feedback of phytoplankton on heat absorption in the water column and better encompassed the varying light conditions expected between the low-productivity basin and high-productivity shelf region. A rough approximation of attenuation due to sediment was also added as a function of bottom depth (h) in both the biological and physical calculations for light attenuation: Hermann et al. (2016) also increased the value of the k A parameter (from 0.0518 to 0.121) and adjusted the values of a frac and a µ 1 to Class I values (now hardcoded within the source code).
The new biological feedback resulted in only a very small change in water column heat content, and the model warm bias was later addressed through other adjustments to surface boundary conditions. However, the addition of the sediment attenuation term strongly affected the biological calculations. Even in the absence of grazing and under nutrient-replete, peak-irradiance conditions, the Hermann et al. (2016) set of parameters raises the compensation depths for both large and small phytoplankton to very shallow depths; in the shallowest waters of our domain, primary production is only possible in the upper 4-5 m of the water column (Fig. B1). While high sediment attenuation is possible in a few specific locations in the Bering10K domain, near the mouths of the Yukon, Anadyr, and Kuskokwim rivers during peak streamflow, there is no evidence to support such strong sediment shading throughout the entire inner domain.
During early validation of the circa Hermann et al. (2016) model for this study (prior to any adjustments for bug fixes in the code), it was also noted that light was the key limiting factor in the deep basin region. The sediment attenuation term is negligible in this deep-water region, and the compensation depth is reasonable in this location. However, the coarse vertical resolution of the model domain resulted in a surface layer nearly 40 m thick. The discretization used by this model assumes that all processes are approximately linear within a given layer, and that the layer midpoint can be therefore be used to approximate characteristics of the layer. In the case of light levels in these deep basin surface layers, this assumption did not hold, and the resulting light levels at the midpoints of the surface layers were too low to support any significant growth of phytoplankton.
For this study, all parameters related to light were revisited. A new sediment attenuation term was derived empirically from satellite inherent optical property measurements from the region (Fig. B2). This remains a rough es-timate that does not consider the seasonal variability of sediment and detrital matter that contribute to this term but alleviates the previously excessive limitation in shallow water. Attenuation-related parameters for clear-water attenuation and chlorophyll-based attenuation were updated to values consistent with literature (k ext = 0.034 following Morel et al., 2007, and k A = 0.0518 and k B = 0.428 following Morel, 1988). It was noted that the value used for photosynthetically active radiation fraction (PAR frac = 0.5) in the Gibson and Spitz (2011) and Hermann et al. (2013Hermann et al. ( , 2016 publications was higher than most field-based and analytical estimates. We adjusted this parameter to a value of 0.42 based on examination of satellite-derived PAR versus our model's surface boundary condition shortwave radiation values. Finally, the vertical resolution in the physical model was increased from 10 layers to 30 layers to allow for better resolution of mixed layer dynamics and light attenuation in the basin. The updated light attenuation equations and parameters, coupled with the higher vertical resolution in the newer simulations, better reflect the true mechanisms controlling light levels in the Bering Sea; the bathymetry-following artifacts in phytoplankton spatial patterns are no longer present (Fig. B3). However, the lower light limitation in certain parts of the domain has exposed some previously overlooked deficiencies in micro-and macronutrient limitation in the BEST-NPZ model. Observations suggest that the deep basin should be a low-production region, a pattern that is present but for the wrong reasons in the Hermann et al. (2016) version of the model and absent in the updated, more mechanistically sound version of the model. For some applications, it may be more useful to have the correct gradient between on-and offshelf populations, even if it is for the incorrect reason, than to have mechanistically sound light limitation; however, users should be fully aware of this discrepancy before analyzing any biological output from either the earlier Hermann et al. (2016) simulations or this newer set of simulations using the updated code. Figure B1. Compensation depth (depth where photosynthesis balances phytoplankton loss terms) and euphotic depth (depth of 1 % PAR) in three versions of the BESTNPZ model, under low (light-colored) and high (dark-colored) temperature conditions, assuming a surface irradiance of 200 W m −2 and nutrient-replete waters. The "no Chl or grazing" scenario (blue) assumes losses due to respiration and nonpredatory mortality only, with no attenuation due to phytoplankton itself. The "low Chl and grazing" scenario (green) assumes attenuation due to 0.5 mmol Chl a m −3 throughout the water column and adds a constant grazing loss rate of 0.05 d −1 . Gray lines indicate boundaries of the ROMS depth levels in the 10-layer (dark) and 30-layer (light) versions of the model. (Note that values are very nearly identical for small and large phytoplankton due to the proportional scaling of rate parameters between the groups.) Figure B2. Attenuation due to sediment derives from a power law fit of satellite estimations of absorption due to gelbstoff and detrital material (entire mission composite from VIIRS, 2012-2018) versus bottom depth. Subpanels show the same data in linear and logarithmic space. Appendix C: Coupling BESTNPZ to ROMS: compilation options and output variables The following appendix provides instructions for setting up input for, compiling, and retrieving output from a simulation of ROMS coupled to the BESTNPZ biological model. At present, the BESTNPZ code has only been coupled to the Bering10K domain, and a few quirks of its internal coding prevent it from being coupled to other physical domains in its current form. Therefore, this user manual should be considered to be specific to the Bering10K-BESTNPZ ROMS model setup.
The primary input parameters specific to the BESTNPZ are described in Appendix A. The remaining setup options relate to the compilation flags and the indices of input and output state variables, described in the following sections.

C1 Compilation flags
Within the ROMS source code, C preprocessing flags are used to selectively compile the code. Several C preprocessing flags appear throughout the BESTNPZ code; those that currently exist in the master branch of the source code are detailed in the following table.
A few of these compilation options resulted from the gradual addition of new features (such as the addition of benthic and ice state variables) that later became, for all intents and purposes, permanent additions to the BESTNPZ model. These are noted in the table as "recommended to always define". While we have attempted to keep the source code flexible to all compilation options, we rarely test for compilation stability with these undefined, and all recent model validation work assumes these model features are present.

C2 Output variables and input parameter indices
The BESTNPZ module allows for a large number of variables to be added to the various ROMS output files (history, average, and station files).
Regardless of compilation options, the biological tracer variables are saved and available as output variables. Water column tracers are specified via the H/Sout(idTvar) input parameters. The exact tracer variables available, and their positions within the idTvar array, depend on whether the ICE_BIO, BENTHIC, IRON_LIMIT, and JELLY compilation flags are defined; see Table C2 for details. Benthic tracers can be specified for output through the H/Sout(idBvar) input; iBen= = 1 and iDetBen = 2 within this array. The ice variables are turned on and off with the variable-specific input parameters H/Sout(idIcePhL), H/Sout(idIceNO3), and H/Sout(idIceNH4).
If the STATIONARY flag is defined, many more intermediate diagnostic variables are saved internally and available for output. These are controlled by the idTSvar index array to Hout and Sout. Note that when running with a subdivided time step (BioIter > 1), these diagnostic variables will reflect the values calculated during the final subdivision only.
Older versions of the code included a second compilation flag, STATIONARY2 (with the corresponding index array idTS2var), to define two-dimensional stationary diagnostics. More recent versions of the code no longer include 2-D diagnostic variables. However, the code structure is still in place for this if it becomes necessary in the future.
See Table C3 for a comprehensive description of all output variables associated with the BESTNPZ model. The ice-specific use of this flag is not robust to also used in many parts of the ice code as other model domains with their own unique shorthand for "use variables from ice model" (as application flags opposed to analytical one-dimensional ice)

CLIM_ICE_1D
Use   Table C2. Indices in the idTvar array. Note that Hout(idTvar) appears separately in the ocean.in file and BPARNAM file, with the latter including only biological active tracers. The Sout(idTvar) input appears only once, in the STANAME file, and biological tracer indices begin at NAT + 1; NAT is here assumed to be 2, for temperature and salinity. In all cases, note the skip between iPhL and iMZL due to the now-deprecated small microzooplankton group.
Author contributions. KK wrote text, prepared figures, and ran and analyzed simulations for this paper; AH, WC, IO, and KA provided edits and review during manuscript preparation. AH and WC are responsible for initial development and analysis of the physical model (Bering10K) used in these simulations. AH, IO, and KA secured funding to support development of the Bering10K and BESTNPZ models.
Competing interests. The authors declare that they have no conflict of interest.