GNOM v1.0: An optimized steady-state model of the modern marine neodymium cycle

Spatially distant sources of neodymium (Nd) to the ocean that carry different isotopic signatures (εNd) have been shown to trace out major water masses, and have thus been extensively used to study large-scale features of the ocean circulation both past and current. While the global marine Nd cycle is qualitatively well understood, a complete quantitative determination of all its components and mechanisms, such as the magnitude of its sources and the paradoxical conservative behavior of εNd, 5 remains elusive. To make sense of the increasing collection of observational Nd and εNd data, in this model description paper we present and describe the Global Neodymium Ocean Model (GNOM) v1.0, the first inverse model of the global marine biogeochemical cycle of Nd. The GNOM is embedded in a data-constrained steady-state circulation that affords spectacular computational efficiency, which we leverage to perform systematic objective optimization, allowing us to make preliminary estimates of biogeochemical parameters. Owing to its matrix representation, the GNOM model is additionally amenable to 10 novel diagnostics that allow us to investigate open questions about the Nd cycle with unprecedented accuracy. This model is open-source and freely accessible, is written in Julia, and its code is easily understandable and modifiable for further community developments, refinements, and experiments.


Introduction
Rare earth elements (REEs) have long been recognized to provide unique insight into ocean circulation and biogeochemical 15 cycles (e.g., de Baar et al., 1983de Baar et al., , 1985Bertram and Elderfield, 1993;Elderfield, 1988;Elderfield and Greaves, 1982;German et al., 1995;Goldberg et al., 1963;Haley et al., 2014;Høgdahl et al., 1968;Jeandel, 2001, 2004;Piepgras and Jacobsen, 1992;Piper, 1974;Sholkovitz and Schneider, 1991;Zheng et al., 2016). Isotopic variations of neodymium (Nd), in particular, have been extensively used as a tracer of ocean circulation, which plays a fundamental role in Earth's climate over a wide range of timescales, from millennia to millions of years (e.g., Adkins, 2013;van de Flierdt et al., 2016;Frank, 2002; 20 Goldstein and Hemming, 2003;Piepgras and Wasserburg, 1980;Sigman et al., 2010;Tachikawa et al., 2017). isotopes in the ocean and how they impact broader marine biogeochemical cycles. The GEOTRACES Science Plan identified Nd isotopes as a "key trace element or isotope" that is expected to be measured on all GEOTRACES cruises because of its use as a paleoceanographic proxy. Thanks to this international effort, considerable amounts of new Nd concentration and isotope data have been generated in recent years, collected notably in the GEOTRACES Intermediate Data product 2017 (IDP17, Schlitzer et al., 2018), which increased the Nd data inventory by about 50 %, as well as post-IDP17 GEOTRACES data yet to 50 be released in future data products.
To gain the most useful and accurate information from these observations, however, it is paramount to understand their modern ocean biogeochemical cycles and tracer budgets. Neodymium and other REEs enter the ocean via rivers, submarine groundwater discharge, aeolian deposition, pore waters, and/or interaction with sediments ( Fig. 1). Once in the ocean, they are redistributed by the ocean circulation, scavenged by particulate matter, and exit the ocean via sedimentation and incorporation 55  Bertram and Elderfield (1993) Box model 7 boxes 10 2900 εNd only Tachikawa et al. (2003) 10 boxes 60 f 480 εNd only Du et al. (2020) 4 into authigenic ferromanganese oxides (Frank, 2002;Byrne and Kim, 1990;Elderfield, 1988;Elderfield et al., 1981;Elderfield and Sholkovitz, 1987;Sholkovitz et al., 1989Sholkovitz et al., , 1994Sholkovitz et al., , 1992Haley et al., 2004;Blaser et al., 2016;Du et al., 2016). While most sources of REEs to the ocean have likely been identified, there are still large uncertainties associated with the magnitudes of these different fluxes due to the inherent challenges of measuring sources that are temporally variable and globally widespread.
Models of the marine Nd cycle, in conjunction with seawater measurements, offer a way to constrain the magnitudes and 60 isotopic compositions of these various inputs to the ocean and identify the most important sources. Four distinct types of models have been used to simulate the modern ocean Nd cycle: simple box models, ocean general circulation models (OGCMs), steadystate circulation models, and boundary propagation models, each with their strengths and weaknesses (Table 1). Some of these models have explicitly tracked the concentrations of each Nd isotope ( 143 Nd and 144 Nd, thus allowing for estimation of Nd concentration as well as its isotopic composition), while other models have simply tracked ε Nd as a single conservative tracer.
v1.0 is ideal for Nd cycle investigations. Except for the GEOTRACES dataset which must be downloaded manually, the GNOM is self-contained and version-controlled, making it easy to reproduce simulations.
Additionally, the steady-state formulation of the GNOM is amenable to novel Green-function-based diagnostics that can 110 provide important new insights into major open questions on the marine Nd cycle. Green functions (sometimes spelled Green's functions) can be used for solving ordinary differential equations with an initial condition and/or boundary values (see, e.g., Morse et al., 1953). In our case, they can be thought of as the [Nd] responses to unit local sources of Nd and allow us to partition [Nd] or ε Nd into components of interest, such as Nd from a particular source or location. Here, we introduce new partitions of Atlantic Nd and ε Nd (following, e.g., Holzer et al., 2016;Holzer, 2017, 2018;Holzer et al., 2021) that are helpful 115 to disentangling the neodymium paradox (Siddall et al., 2008). We show that we can accurately partition [Nd] and ε Nd in the central Atlantic into contributions from northern-and southern-sourced waters. These preliminary diagnostics already reveal important information. They help quantify the conservativeness of ε Nd along water pathways and unveil underlying mechanisms by evaluating the effects of local sources and sinks. Detailed investigations of these diagnostics are out of the scope of this study and will be carried out in future work using a subsequent version of the GNOM with more finalized parameter values. 120 We invite paleoceanographers and modellers alike to use the GNOM v1.0 model, to improve its implementation, explore its capabilities, and thus contribute to quantitatively answering long-lasting questions on the Nd cycle.

The GNOM Model
Neodymium concentrations are controlled by the interplay between circulation, external sources, and reversible scavenging and burial in the sediments (Fig. 1). These components completely define the state of the Nd cycle in our Global Neodymium where χ mod Nd is the modelled Nd concentration vector, T circ is the OCIM v2.0 advection-diffusion operator or transport matrix,

130
T scav is the reversible-scavenging matrix, and the s k are the external sources of neodymium. Note that χ mod Nd and s k are 200160element column vectors and that T circ and T circ are sparse 200160×200160 matrices such that the linear system represented by Eq.
(2) can be solved in a few seconds on a modern laptop via LU factorization and forward and backward substitution (often referred to as "matrix inversion").
The global ε Nd distribution is determined by both the distribution of 143 Nd and 144 Nd. Following, e.g., John et al. (2020) Figure 3. Extent of the dust regions of origin (Kok et al., 2021a, b;Adebiyi et al., 2020) and their εNd values as optimized in GNOM v1.0.
See text for region names.

Aeolian dust
We assume that atmospheric dust deposition injects Nd in the surface ocean only. That is, soluble Nd from dust is instantly released as dissolved Nd in the top layer of the model grid. Although it can vary with location and mineralogy (Goldstein et al., 1984), for simplicity, we assume a constant dust Nd content of (Nd:dust) = 40 µg g −1 (which is within the 11.93 to 45.76 ppm 165 range of atmospheric dust observations of Goldstein et al. (1984)). The spatial pattern of the dust source is prescribed by an atmospheric model output (Scanza et al., 2018) and is shown in Fig. 2a.
The isotopic signature of atmospheric mineral dust deposited on the ocean surface is not homogeneous (Goldstein et al., 1984). Instead, dust ε Nd varies with composition and mineralogy, which derives from its land origin. It is also likely that Nd solubility varies with composition and mineralogy. Thus, the GNOM v1.0 uses nine separate annual mineral dust deposition 170 fields (dataset available from Adebiyi et al., 2020) from nine different regions. These dust deposition fields were generated by Kok et al. (2021a) and Kok et al. (2021b) who partitioned dust emissions according to nine different regions of origin, using the global climate model of Scanza et al. (2018). The nine regions we use are North-Western Africa (NWAf), North-Eastern Africa (NEAf), Southern Sahara and Sahel (Sahel), Middle East and Central Asia (MECA), East Asia (EAsia), North America (NAm), Australia (Aus), South America (SAm), and Southern Africa (SAf). Figure 3 shows the extent of these regions. 175 We assign a distinct Nd solubility and isotopic signature to each region of origin, controlled by the 2 × 9 corresponding parameters (denoted β r and ε r for each region r; see parameters Table 2). The dust source of Nd into the ocean is hence given by where φ dust,r is the dust deposition flux from region r taken from the Adebiyi et al. (2020) dataset and rearranged into a acts like a mask so that s dust only injects Nd in the top layer of the model grid.) Each isotopic signature parameter ε r uniquely defines the isotopic ratio of each region via R r = R CHUR (ε r + 1), which is then used to compute the dust source for the RNd tracer via This allows for the aeolian dust source to carry an elaborate and more realistic isotopic signature than previous models (Fig. 2g).
Figure 3 also shows the optimized ε r values of each region.

Volcanic ash
Despite a smaller atmospheric loading than mineral dust, we include volcanic ash as a separate, potentially important, aeolian 190 source of Nd because of its typically high reactivity and solubility compared to mineral dust. This reactivity partly reflects the high surface area of volcanic ash and the thermodynamic instability of volcanic glass (Gaillardet et al., 1999;Dessert et al., 2003). We use the geographic pattern of volcanic ash deposition as used in the work of Chien et al. (2016) and Brahney et al. (2015), which provides estimates of the global deposition fields of dust and soluble iron from different aerosol types (mineral dust, volcanic ash, combustion fire, and so on). Assuming a constant neodymium content identical to dust, the volcanic-ash 195 source of Nd into the ocean is thus given by where φ volc is the column vector of the volcanic ash deposition flux from the Chien et al. (2016) dataset and β volc is the Nd solubility in volcanic ash. Similarly to the dust-source formulation, the magnitude of the volcanic-ash source of RNd is controlled by the parameter ε volc , where R volc = R CHUR (ε volc + 1). (Note that R volc = R volc everywhere because the volcanic-ash source comprises a single term, unlike the region-of-origin-partitioned dust source.) The geographical patterns of the volcanic-ash source and its uniform isotopic signature are shown in Fig. 2b and h.

215
There is no established quantitative flux model for sedimentary Nd release that works on the global scale, especially given the limited spatial coverage of direct sedimentary flux measurements (which are almost entirely restricted to the coastal northwest Pacific). Therefore, the GNOM v1.0 implements the sedimentary Nd flux into the ocean as a flexible and optimizable function of depth z and local sedimentary ε Nd . The "base" sedimentary Nd flux, φ(z), is modelled as an exponential function of depth, 220 where φ 0 , φ ∞ , and z 0 are optimizable parameters. The rationale behind the parameterization of Eq. (7) is versatility. For φ ∞ < φ 0 and small z 0 , the flux profile is larger near the surface and smaller in the deepest parts of the ocean, while for φ ∞ > φ 0 and large z 0 , the sedimentary flux increases quasi-linearly with depth (as in, e.g., Du et al., 2020, Fig. 1C). The optimization only enforces weak direct constraints on these parameters, allowing for any such profile shape. The optimized base Nd flux profile as a function of depth is shown in Fig. 4c.

225
The base sedimentary flux is further scaled by a reactivity factor, α, which controls the effective sedimentary Nd release.
Sediment reactivity is modelled via a simple parameterized quadratic function of local sedimentary ε Nd , where the optimizable parameters a and c control the curvature and minimum of the quadratic, respectively, while ε 10 = 10 ‱ often associated with rather fresh, and thus reactive, detrital material. We emphasize that this enhancement can be turned "on" 235 or "off" depending on the choice of parameters (a = 0 turns it off). However, maybe coincidentally, extremely high ε Nd values are generally associated with relatively young volcanic Nd that is more reactive and readily soluble (Lacan and Jeandel, 2005;Pearce et al., 2013;Wilson et al., 2013;Blaser et al., 2016Blaser et al., , 2020 and previous model studies have resorted to different enhanced Nd release parameterizations to achieve a similar effect (see, e.g., Pöppelmeier et al., 2020). The same is not necessarily true for rocks with extremely low ε Nd values, however it so happens that much of the region around the Labrador Sea (Greenland 240 and northern Canada) is currently, or was previously, glaciated, which has resulted in a large amount of fine-grained crystalline (and thus labile) detritus with extremely negative ε Nd (von Blanckenburg and Nägler, 2001). The quadratic function α(ε Nd ) is shown in Fig. 4a and the resulting scaling factor for the global map is show in Fig. 4e.  Finally, to account for large glaciers that may produce fine-grained glacial flour from previously unexposed bedrock that likely contains reactive Nd (von Blanckenburg and Nägler, 2001), we additionally scale Nd release from the sedimentary 245 source along the coast of Greenland by a factor α GRL . (For simplicity, we did not account for potentially enhanced Nd release in Antarctic because we assume that extreme ε Nd released by sediments in the Antarctic would be relatively rapidly mixed along the circumpolar current.) Combined with the reactivity α, the resulting sedimentary source is given by where α GRL is the vector equal to α GRL for grid cells against the coast of Greenland and equal to 1 otherwise, ε sed is the column with a −5 ‱ minimum in the Pacific north of 40 • S). For grid cells with heterogeneous ε Nd in sediments, our quadratic implementation of the reactivity α as a function of ε Nd implies a statistical "shift" of the mean released ε Nd towards extreme values because extremely light or heavy Nd is released more efficiently. Assuming that ε sed represents the observed in situ mean of a normally distributed ε Nd sediment composition in each grid cell and a uniform global standard deviation σ ε within each grid cell, the ε Nd that is effectively released at any location, denoted ε eff sed , is given by The difference between ε eff sed and in situ ε sed is shown in Fig. 4b to shift ε Nd values by up to about ±0.03 ‱. In other words, for in situ ε Nd values lower than the minimum of α (dashed gray line in Fig. 4a, b), the released ε Nd value is pushed toward even lower values, and for in situ ε Nd values greater than the minimum of α, released ε Nd is pushed toward even larger values.
(The derivation of Eq. (10) is given in Appendix C). Applying R eff sed = R CHUR (ε eff sed + 1) gives the sedimentary source of RNd 265 as R eff sed s sed .

Rivers
For riverine sources, we use the Global River Flow and Continental Discharge Dataset (Dai, 2017), originally described by Dai and Trenberth (2002) and later updated by Dai et al. (2009) and Dai (2016). This dataset provides an estimate of the volumetric flow rate of the 200 largest rivers on Earth. As a simplification, and to reduce the total number of free parameters 270 in the model, we assume that all rivers share the same Nd concentration c river , which is the parameter that controls the global riverine source magnitude (see Table 2). (Future improvements of the GNOM could include optimizable [Nd] parameters for each individual major river, constrained by ranges based on observations.) Because the GNOM v1.0 does not resolve estuary removal processes, our c river is to be understood as an effective Nd concentration that implicitly accounts for Nd removal in estuaries and is thus the concentration that makes it into the ocean. Hence, the vector of the riverine source is given by where Q river is the vector of the volumetric flow rates of the rivers from the dataset of Dai (2017) gridded (cumulatively) onto the OCIM v2.0 grid and where v is the vector of the volumes of the model grid cells. Note that in order to prevent numerical noise from the large gradients caused by the discrete nature of their distribution, we additionally artificially spatially smooth out the riverine sources by spreading it over neighboring grid boxes (see Fig. 2d).

280
Riverine ε Nd values are taken from the global map of interpolated sedimentary ε Nd by Robinson et al. (2021), which we also use for the sedimentary source, so that the RNd riverine source is given by R eff sed s river . We note that the sedimentary ε Nd map of Robinson et al. (2021) overlays the nearest continental ε Nd signal where sediment thickness is more than 1 km, such that the ε Nd of the GNOM v1.0 riverine sources are mostly from continental measurements that lie within or close to the river drainage basins. Riverine ε Nd values are shown in Fig. 2j.

Groundwater
Neodymium also enters the oceans via coastal groundwater (Johannesson and Burdige, 2007). We use the coastal submarine and terrestrial groundwater discharge dataset of Luijendijk et al. (2019), described by Luijendijk et al. (2020), which provides the location and volumetric flow rate of 40,082 coastal watersheds. Similarly to the riverine sources, we assume that [Nd] is constant across river watersheds and implicitly accounts for local Nd removal processes. The single parameter c gw is thus 290 the effective groundwater concentration that makes it into the ocean and controls the global magnitude of the GNOM v1.0 groundwater Nd source (see Table 2). The groundwater Nd source is given by where Q gw is the groundwater volumetric flow rates from the dataset of Luijendijk et al. (2020), gridded cumulatively onto the GNOM grid. The pattern of s gw is shown in Fig. 2e.

Hydrothermal vents
A minor fraction of the marine neodymium budget presumably comes from hydrothermal vents, which deliver likely young Nd (high ε Nd ) along the mid-ocean ridges (Piepgras and Wasserburg, 1985;Stichel et al., 2018). Here, we assume that the release of hydrothermal Nd is proportional to that of helium. For consistency, we use the mantle helium source field that was used in the data assimilation of the OCIM v2.0 (DeVries and Holzer, 2019). The global magnitude and ε Nd of the hydrothermal Nd 305 source are set by parameters σ hydro and ε hydro , respectively (see Table 2), with where s He is the vector of the 3 He mantle source and v T s He is its global magnitude, i.e., its volume integral, used here for normalization. (One can easily check that v T s hydro = σ hydro .) The hydrothermal source of RNd is simply given by R hydro s hydro = Fig. 2f and l show the spatial distribution of the hydrothermal Nd source and its ε Nd , respectively.

310
Arguably, the hydrothermal system as a whole acts as a net sink of Nd in the ocean (Stichel et al., 2018). As described in Section 2.3, the GNOM v1.0 does not include a parameterization of scavenging due to hydrothermal particles. Future versions of the GNOM should attempt to include such a removal process in order to properly balance the hydrothermal source and allow the ε Nd signature to be modified along hydrothermal vents without increasing the [Nd] concentration at the same time.
Neodymium is removed from the system through scavenging onto particles. We follow, e.g., Bacon and Anderson (1982), Siddall et al. (2008), and Arsouze et al. (2009), and assume that dissolved and scavenged Nd are exchanged via a first-order kinetic reaction, where X is a given particle type. The rate of change of [Nd] in reaction (R1) can be written as with equilibrium constant We further assume that each scavenging particle type X has a constant settling velocity w X that dominates the transport rates of the ocean circulation. For each particle type X, we construct its flux divergence operator, denoted T X , such that T X x 325 is the discrete equivalent of ∇ · (w X [X]) (where x is the particulate concentration vector and w X is the downward threedimensional settling velocity vector). We use T X to compute the rate at which reversible scavenging adds or removes Nd in each grid box.
However, we avoid the explicit simulation of scavenged neodymium, XNd, by having a fraction of the dissolved neodymium pool sink to the box below as if it were adsorbed onto a falling particle. To do this in practice, we take advantage of the direct [X] (from the four particle fields included in the GNOM v1.0, described below). Consequently, the corresponding partial 335 downward flux of dissolved Nd is given by w X [XNd] where w X is the settling velocity of particle X. We further assume that a fraction f X of the scavenged Nd that reaches the seafloor is removed from the system, providing a net sink for our model.
(Note that this is the same implicit approach as in the AWESOME OCIM (John et al., 2020).) We consider four different particle types for scavenging Nd. This fourth scavenging mechanism effectively behaves like spontaneous precipitation, and, as such, will be referred to as "precipitation" throughout this study (subscript "prec"). Precipitation is implemented by using a spatially uniform fictitious particle concentration of 1 mol m −3 that settles with velocity w prec = 0.7 km yr −1 . We note that while this additional particleindependent scavenging sink could compensate for additional types of particles not currently implemented in the model, it 350 is likely that more scavenging particle types are required for an accurate representation of the Nd cycle. These include hydrothermal particles (which should result in hydrothermal systems being a net sink; Stichel et al., 2018) and iron-manganese oxides (which are potentially the most important scavenging particles; Lagarde et al., 2020;Schijf et al., 2015;Sholkovitz et al., 1994). Overall, the scavenging transport operator is thus defined by summing the flux divergence for all particle types and using Eq. (15), i.e., where D X is a diagonal matrix with diagonal x, the vector of the concentrations of particle type X. Hence, for each scavenging particle type X, the corresponding scavenging rate and downward transport is controlled by the concentration [X], the 360 equilibrium constant K X , the settling velocity w X , and the burial fraction f X .

Optimization
The output of our model is governed by a set of 43 free parameters that control the magnitude and ε Nd of each source as well as the reversible scavenging and burial rates. The computational speed afforded by the model implementation allows us to jointly optimize almost all of these parameters by minimizing the mismatch of modelled and observed [Nd] and ε Nd values. This is 365 done in practice by minimizing an objective functionf (p) that quantifies the mismatch between model and observations, for a given set of parameters p.

Objective function
The mismatch with each observation is quantified by the square of the difference between the observed value and the modelled value from the closest grid cell. Because we use observations of [Nd] and ε Nd , the mismatch function, denoted f , depends on 370 the 3D fields of the two modelled tracers (χ mod Nd and χ mod RNd ). We also include an additional cost for parameter values themselves. The mismatch function is defined by where the first term represents the normalized mismatch between modelled and observed [Nd], the second term represents the normalized mismatch between modelled and observed ε Nd , and the last term represents the inverse of the likelihood of the model parameters. We detail each term below.
In Eq. (17) (Technically, the location of the observed and modelled value being compared may not match, in which case we use the closest "wet" model grid cell using a nearest-neighbor algorithm.) We use the same approach for ε Nd , by comparing the modelled vector ε mod Nd to the observed vector ε obs Nd at the locations r ∈ O εNd of each ε Nd observation. The ω Nd and ω εNd values, fixed at 1 in this study, control the relative contributions of the mismatches in Nd concentrations and ε Nd values. Given these, an error of 1 ‱ in ε Nd weighs 385 the same as an error of about 4.5 pM.
The third term of Eq. (17) adds a direct penalty constraint on the parameters to prevent them from reaching unrealistic values.
If any parameter reaches a value close to the limits we impose in the model, this third term will grow large; since the algorithm tries to minimize Eq. (17), it will push that parameter back to a more acceptable value. For each parameter p i we prescribe a realistic domain D i and an initial guess (see Table 2) that we use to determine a reasonable prior probability distribution d i 390 and to randomize the initial parameter values. Specifically, each parameter with a semi-infinite range (from 0 to ∞) is given a log-normal prior of which the logarithm has mean equal to the logarithm of the initial guess and has variance equal to 1.
Each parameter with a finite range is given a logit-normal prior that is scaled and shifted such that its support matches the range D i exactly and such that the initial guess equals the median of d i . For example, the φ 0 parameter for the sedimentary flux at the surface is given the (0, ∞) range and an initial guess of 20 pmol cm −2 yr −1 . Taken as a random variable, the prior 395 distribution of φ 0 is the log-normal distribution such that log(φ 0 /(pmol cm −2 yr −1 )) ∼ N (log(20), 1). (N (µ, σ 2 ) denotes the normal distribution with mean µ and variance σ 2 .) For the performance and robustness of the optimization, we additionally perform a variable transform λ i on each parameter using a bijection from the parameter domain D i to (−∞, ∞). This variable transform prevents parameters from reaching beyond their prescribed ranges. We also carefully chose the bijection such that the prior distribution is normally 400 distributed in the transformed parameter space, and thus incurs an inverse log-likelihood that is quadratic, a property that benefits the performance of the optimization. In the case of the parameter φ 0 , which is transformed via the bijection λ i : φ 0 → log(φ 0 /(pmol cm −2 yr −1 )), the corresponding transformed random variable is normally distributed by construction.
For bounded parameters, such as β volc , a shifted and scaled logit transform is applied, which also yields a transformed random variable that is normally distributed by construction.

405
The ω p value, fixed at 10 −4 in this study, controls the relative size of the penalty for the parameters compared to the cost of Nd and ε Nd . It is chosen such that the [Nd] and ε Nd mismatch costs are generally about 2 to 3 orders of magnitude larger than the parameter penalty (although there is no bound on the parameter penalty for extreme parameter values). The primary role of this added parameter cost and the associated variable transform is to improve the convergence rate of the optimization and help prevent it from "getting stuck" in valleys of parameter space (see, e.g., Wright et al., 1999).
The objective function depends on p only and is defined bŷ where we have explicitly marked χ mod Nd (p) and ε mod Nd (p) as functions of the parameters p. That is, for any choice of parameters p, before evaluating the model mismatch as quantified by the objective function, we must first compute the vectors χ mod Nd and χ mod RNd by solving for the steady-state solution to Eq.

Dissolved neodymium and ε Nd data
The

Minimization algorithm
We use the Newton Trust Region algorithm from the Optim.jl package (Mogensen and Riseth, 2018) to minimize the objective 430 functionf (p). This requires, at every iteration, the objective function, its gradient, and its Hessian, which are evaluated using the F1Method.jl and AIBECS.jl packages (Pasquier, 2020a;Pasquier et al., 2022b;Pasquier, 2020b).
Thanks to the computationally efficient gradient optimization algorithm that leverages gradient and Hessian information, the entire optimization run takes a few hours on a modern laptop. In our experience, for comparison, using the more standard finite-differences approach or an optimization algorithm that does not have access to derivatives would likely take multiple 435 months.
Because the Newton Trust Region algorithm performs local rather than global optimization, we run multiple optimizations starting from randomized initial parameter values sampled from the parameter distributions d i (as shown in Fig. B1). Although not all optimization runs end up in the same state because of many local minima, we find that most of them converge towards similar solutions with a similarly small objective-function value, which we denote as our "best" estimate, and out of which all 3 Results

Parameter values
Our "best" estimate of the state of the Nd cycle is given by the set of parameters that minimizes the objective function defined 445 in Eqs. (17)-(18). We emphasize that our estimate is determined by a local minimum of a specific parameterization, such that "best" here is somewhat subjective. In all likelihood, there exist other models and other parameter choices which produce a similar fit to global observations, though we expect all such models to capture the same key features of global Nd biogeochemical cycling. Initial guesses and optimizable ranges for each parameter were determined from the literature and the expertise of authors. Initial guesses and final parameter values, along with unit, prescribed range, and a brief description, are given in 450 Table 2. Parameter prior distributions, their randomized initial values, and final optimized values are shown in Fig. B1.
In Table 2, parameters without a range indicate that they were not optimized and held fixed at given previous model or literature values. For example, in the case of scavenging by each particle type X, we only optimized K X and f X (not w X ).
As described in Section 2.3 above, the settling velocities for POC and bSi are not optimized and are instead fixed to match the values of their respective parent offline models. While there are no parent models for dust and precipitation, we do not optimize 455 the corresponding settling velocities for these particle types either, because K X and w X can perfectly compensate each other.
For example, doubling K X while halving w X has no effect on Nd distributions and the objective function. Only their product,  Darker colors indicate high density of data, such that n % of the modelled and observed data lie outside of the n-th percentile contour. The closer the darker contours are to the 1:1 black dashed line the better the fit.
K X w X , which sets the strength of the "scavenging pump" through the operator matrix T scav , appears in the tracer equations (see Eq. (16) or, e.g., John et al., 2020), such that these parameters cannot be easily optimized independently.

Fit to observations 460
The general fit to observations is illustrated in Fig. 7, which shows the percentiles of the cumulative joint probability distribution While statistics such as Fig. 7 provide important information at a quick glance, they do not retain any geographical infor-465 mation, so that a more detailed investigation is required to fully assess the model's skill. Indeed, the deviations shown by [Nd] and ε Nd clusters slightly off the 1:1 line (Fig. 7) likely reflect groups of geographically proximate data points that may be symptomatic of systematic biases, which must be analyzed in further detail.

Diagnostics
One of the biggest advances in the GNOM v1.0, compared to earlier models of the marine Nd cycle, is due to the steadystate matrix formulation of the model, which allows to compute advanced and detailed diagnostics that can directly address 515 fundamental questions about the distribution of tracers and better understand their cycle. In the following sections we showcase a few such diagnostics.

Source magnitudes
The optimized parameters determine the magnitude of the sources, which are collected in Table 3. In our best estimate, about 66 Mmol of Nd (or about 9, 500 metric tons) are injected into the global ocean every year. This falls slightly above the 38-520 57 Mmol yr −1 range of previous GCM models (see Table 1 for model references).  (Table 1) and than the 4.4 Mmol yr −1 estimate of Greaves et al. (1994). This is likely 525 due to our optimization procedure, during which Nd solubility is allowed to be adjusted within the whole 0-100 % range,   compared to previous GCM-based studies that typically use a fixed 2 % solubility (Arsouze et al., 2009;Gu et al., 2019;Pöppelmeier et al., 2020). This is despite the initial guesses for β r values for dust set at 5 %, which penalizes large solubilities more than low solubilities (see the increased probability densities for low solubilities in Fig. B1a and B1b). The high optimized solubility of volcanic ash β volc is also likely unrealistic, although the total contribution of volcanic ash is much smaller than 530 mineral dust. We note that generally worse fits to [Nd] and ε Nd observations have been achieved with some of our optimization runs ending in distinct local minima with significantly smaller dust solubilities. (We do not show these worse mismatches but we show the corresponding initial and final parameter values in Fig. B1.) Finally, we emphasize that it is not the goal of this manuscript to establish estimates of the GNOM parameters and that we welcome future GNOM users to apply narrower ranges for those parameters for which they have better constraints (for example, restricting dust Nd solubilities to values below 10 %).
The third largest source is riverine, with about 10 Mmol yr −1 , also in accord with the published 1.8-12.4 Mmol yr −1 range in previous models (Table 1) and similar to the 4.6-12 Mmol yr −1 values from Goldstein and Jacobsen (1987) and Greaves et al. (1994). The GNOM optimization did not favor a large source from submarine groundwater discharge, which has been estimated 540 between 29-81 Mmol yr −1 by Johannesson and Burdige (2007), but this source was not included in any other modelling studies so it is difficult to compare with other global model estimates. Hence, apart from a relatively elevated dust source, the GNOM optimization generally supports previous estimates of the magnitude of these sources.

Partition according to source type
We partition Nd concentration according to source type simply by removing all the other sources. (The superposition principle 545 applies directly to our model because it is linear in Nd; see, e.g., Holzer et al., 2016, who partitioned dissolved iron concentrations, first requiring the construction of a linear equivalent model.). With Nd k denoting neodymium that was injected by source type k, its corresponding column vector is thus computed by solving where H = T circ + T scav . Taking the global volume-weighted mean of each χ mod Nd k gives the contribution of each source type to 550 the total Nd inventory and are collected in Table 3.
As is the case for most global biogeochemical cycles, the relative source magnitudes and their relative contribution to the standing stock do not necessarily match. For instance, mineral dust, volcanic ash, and hydrothermal vents contribute more to the mean Nd concentration than their relative source magnitudes. These variations can be directly linked to the bulk residence time of Nd molecules, which vary with location of injection and consequently with source type. 555

Bulk residence times
The bulk residence time of Nd k is given by taking the ratio of its inventory to its source magnitude. Total Nd (i.e., from all sources) has a bulk residence time of roughly 440 yr, which is within the 350 to 2900 yr range of previous Nd-cycling models.
(The bulk residence time for GNOM falls slightly below the 560 to 750 yr range of GCM-based models; see Table 1).
Unlike in the real ocean, where each molecule of Nd is indistinguishable from the next (with no information about its initial 560 source), in a model we can track Nd coming from different sources and calculate source-partitioned residence times. We find that sedimentary-sourced Nd has the shortest residence time at 370 yr. This is because it is injected just above the seafloor and is thus buried more quickly (i.e., a molecule of Nd sourced from the sediments at 5000 m, which only has to fall a few meters to be scavenged back to the sediments, leaves the ocean quicker than a molecule near the surface sourced from dust, which has to fall 1000s of meters). In comparison, volcanic-ash Nd, most of which is deposited onto the surface of the Pacific, remains 565 on average 650 yr in the system (i.e., about 75 % longer than sedimentary-sourced Nd). Mineral-dust deposited Nd as well as riverine and surface groundwater Nd all show a residence time of about 550 yr.

Tracking Nd to investigate ε Nd conservativeness
Sediment-core records of ε Nd are of considerable importance for paleoceanography because they serve as a fingerprint of past ocean circulation and have been used, in particular, to infer changes in the Atlantic meridional overturning circulation (AMOC) 570 (e.g., Palmer and Elderfield, 1985;Rutberg et al., 2000;Piotrowski et al., 2004Piotrowski et al., , 2005van de Flierdt et al., 2006;Piotrowski et al., 2008;Pena et al., 2013;Pena and Goldstein, 2014;Huang et al., 2014;Kim et al., 2021;Pöppelmeier et al., 2021;Hines et al., in revision). Observations of modern ocean ε Nd have thus been extensively used to estimate Atlantic water mass fractions, typically inferred from North-South end-members and assuming quasi-conservative transport and mixing of ε Nd as an ocean circulation tracer (Hartman, 2015;Wu, 2019;Wu et al., submitted).

575
Our GNOM model-or, more precisely, the steady-state matrix schema with which it is built-allows for exact computations of these water-masses and the contributions of various sources and regions to the modern-ocean Nd and ε Nd distributions. We emphasize that by "exact", here, we do not mean exact for the real ocean, but instead for our given choice of ocean-circulation model (in this case the OCIM v2.0, DeVries and Holzer, 2019), which is arguably the best steady-state ocean-circulation model available for such climatological estimates (see, e.g. John et al., 2020). Here, we merely showcase these diagnostics within 580 GNOM, but in the future we plan to further explore the underlying scientific questions that these diagnostics can address. We chose two simple regions that cover the entire ocean except for the central Atlantic between 30 • S and 40 • N. We denote these regions by Ω N and Ω S such that Ω N borders the northern Atlantic and Ω S the southern Atlantic. The regions are shown in Fig. 12.
Firstly, we track Nd concentrations from each of these regions. Neodymium concentrations are not conservative, in part due 585 to reversible scavenging and in part due to external sources that inject new Nd along transport pathways. For example, we can track Nd that came from region Ω N by "tagging" Nd that comes into contact with Ω N and removing that tag when Nd enters Ω S .
Mathematically, we can perform this tagging/untagging by simulating a fictitious tracer, denoted Nd N-tag , for which we enforce a concentration equal to simulated [Nd] in region Ω N , a concentration equal to zero in Ω S , and allowing reversible scavenging and burial to remove Nd N-tag in between. In practice, we compute the corresponding column vector χ NdN-tag by solving  Figure 13 shows the Atlantic zonal averages of [Nd N-tag ], [Nd S-tag ], and "non-tagged" [Nd] (i.e., Nd that was injected in the central Atlantic betwen 30 • S and 40 • N). In the Ω N region (north of 40 • N), 100 % of Nd is tagged as Nd N-tag (Fig. 13a).
Similarly, 100 % of Nd is Nd S-tag in Ω S (Fig. 13b). Figure 13c shows the remaining non-tagged Nd and Fig. 13d combines panels (a-c) by showing only the dominant fraction. A clear signal of the influence of surface sources is visible down to 600 about 1500 m, confining the dominance of the N-and S-tagged fractions of Nd to deep waters and to high latitudes, close to their respective tagging regions. In the future, we intend to further explore the extent of the influence of high latitudes on the distribution of Nd. Gu et al. (2020) performed a North-South end-member partitioning using the CESM model and its POP2 circulation. They quantified water-mass fractions using dye injections at the surface and compared them with water-mass reconstructions from 605 deep ε Nd values. Here we present more detailed partitions that have never been estimated in previous modelling studies to our knowledge.
We now track ε Nd as if it were a conservative tracer, from the Ω N and Ω S regions. Technically, this is done by tracking water itself as it leaves either region. Water-mass fractions estimated with this approach can be used to directly "propagate" modelled ε Nd values from the boundaries of the North and South Atlantic regions to provide an exact end-member mixing estimate that 610 serves as a reference.
Water-mass fractions have been estimated using a Green-function boundary propagator in similar model contexts (e.g., Holzer and Hall, 2008;Primeau, 2005;Holzer and Primeau, 2008). They can be used to calculate the conservative transport of any tracer. As an illustrative example, here, we propagate modelled ε Nd simultaneously from both the Ω N and Ω S regions into the central Atlantic. This theoretical conservative value is denoted by ε Ω Nd (with Ω as a superscript to denote that its value is 615 entirely determined by the ε Nd inside the Ω N and Ω S regions) and the corresponding column vector ε Ω Nd is computed by solving Like Eq. (20), Eq. (21) effectively enforces that ε Ω Nd matches modelled ε Nd inside Ω N and Ω S but is conservatively propagated by the advective-diffusive transport operator T circ outside of Ω. (Note the different operators on both the left-hand side and 620 right-hand sides between Eqs. (20) and (21).) Atlantic zonal averages of modelled ε Nd , conservatively-propagated ε Nd , and their difference (∆(ε Nd )) are shown in Fig. 14. Figure 14c, in particular, shows where ε Nd behaves conservatively (white) and where it does not (green or pink), in our model. In accord with Fig. 13, ε Nd is the least conservative close to the surface, where most of the Nd was never in contact with Ω S or Ω N and was instead injected in the mid-latitude Atlantic. This surface overestimate is most pronounced away from Ω N and Ω S with ∆(ε Nd ) values of up to +10 ‱. Going from 200 to 1000 m depth, we find 625 ∆(ε Nd ) values decreasing from +5 to +1 ‱. Conservative ε Nd and true ε Nd remain within 1 ‱ of each other below 1000 m depth, despite a slight ∆(ε Nd ) overestimate near the seafloor (likely due to the effect of local sedimentary flux) and a slight underestimate around 1, 500 m (potentially due to reversible scavenging). We intend to investigate the distinct conservativeness of Nd and ε Nd (the "neodymium paradox") further in future work. Regions are shown in Fig. 12.
The most prominent caveat of GNOM v1.0 is the steady-state assumption, which we apply to both the circulation and the Nd cycle. Hence, by construction, daily, seasonal, decadal, or multi-decadal fluctuations that deviate from the climatological mean cannot be captured by our model. However, we trust that the circulation model used (the OCIM v2.0; DeVries and Holzer, 2019), which is data-assimilated with ventilation tracers, captures the predominant features and pathways of the modern ocean circulation and provides the most realistic steady-state transport to date (e.g., DeVries and Holzer, 2019;John et al., 2020). 635 We note that compared to previous modelling studies, the GNOM does not represent scavenging by calcium carbonate (CaCO 3 ) because there is no publicly available particulate CaCO 3 field to the best of our knowledge. Modelling scavenging is a challenging task that the GNOM model does not pretend to achieve with high accuracy. However, we deem the current implementation satisfactory considering the quality of the overall model-observation fit. Future versions of the GNOM could include CaCO 3 -particle scavenging or a generally improved scavenging parameterization. Our choice of a local Newton-Trust region optimizer also exposes our strategy to the risk of getting stuck in local minima.

655
To counter that risk, we have opted for the traditional ad-hoc counter mesaure of running the optimization from multiple, randomized initial guesses. While this offers the advantage of spanning a large number of cases, it offers no guarantees that the optimal set of parameters represents a global minimum. It is thus entirely likely that a "better" set of parameter values would reduce the objective-function value and improve the model skill, though again we note that the most important features of Nd biogeochemical cycling are likely to converge to similar values regardless of initial conditions.

660
Our specific choice of objective function gives a measure of the skill of the model for reproducing [Nd] and ε Nd observations.
Roughly speaking, despite our arbitrary choices for the weights ω Nd , ω εNd , and ω p involved in Eq. (17), the value of the objective function can also be interpreted as the negative log-likelihood of observing the measured ε Nd and [Nd] given the model and its parameters. In the future, we plan to determine the values of hyperparameters such as ω Nd , ω εNd , and ω p through more formal Bayesian approach.

665
Qualitatively, GNOM compares well to previous models that are embedded in ocean general circulation models and simulate two explicit Nd isotopes (Arsouze et al., 2009;Rempfer et al., 2011;Gu et al., 2019;Pöppelmeier et al., 2020). Given that these models were built with a different objective, less available data, and without a systematic optimization of all parameters, we expect GNOM v1.0 to perform better against the objective function used in this study. However, we emphasize that these previous models are more suited to specific experiments, including simulations of transient changes in circulation on millennial 670 timescales. In other words, these previous models are not all restricted by the caveats of a coarse steady-state circulation. We also note that the underlying circulation used in GNOM can be swapped with any other circulation available through the AIBECS.jl framework (see Pasquier et al., 2022b), although we reiterate that the OCIM v2.0 likely offers the best available representation of the current ocean circulation.
The main advantages of the GNOM are skill and computational efficiency. The GNOM v1.0 owes its low computational cost 675 and quick simulation time to the steady-state OCIM v2.0 circulation in which it is embedded and the linear representation that allows us to solve the system of tracer equations in a single matrix inversion. The model's skill comes from the optimization procedure and likely benefits from the quality of the OCIM v2.0 circulation. The GNOM is also versatile in many respects owing to its simplicity. Parameter values and acceptable ranges can be tuned, entire mechanisms can be turned off, eliminating free parameters, or added with a few changes of simple lines of code. This versatility is compounded by computational speed, 680 which makes GNOM ideally-suited for quick experimentation and further optimization. The GNOM model is also easily diagnosed, owing to the powerful tools of linear algebra. Novel diagnostics offer new insights by revealing features often hidden in standard model output. Finally, as it is available in a self-contained package (except for the GEOTRACES dataset, which is not programmatically accessible), the GNOM v1.0 offers unprecedented reproducibility, which is sorely lacking in advanced research (Peng, 2011;Irving, 2016). Thanks to the advantages listed above, the GNOM is well positioned for 685 answering long-lasting questions and exploring new ideas about the Nd cycle and the ε Nd distributions.
Code and data availability.

690
The code is written Julia, which is itself free and open-source (Bezanson et al., 2017). The Julia version that was used for this study is v1.6.2.
Except for the GEOTRACES IDP17 data, which must be downloaded manually (see www.geotraces.org), the entirety of the data used in this study can be programmatically downloaded by the GNOM code (also explained in the GNOM documentation) and are all available through the OCIM v2.0 circulations by DeVries and Holzer (2019) (with original files available at https://tdevries.eri.ucsb.edu/models-and-data-products/) the two-dimensional dust deposition fields partitioned according to region of origin from Kok et al. (2021a) and Kok et al. (2021b) (original files available from Adebiyi et al., 2020) the aerosol-type partitioned dust source that includes volcanic ash as used by Chien et al. (2016)  We emphasize that the goal here is only to generate a reasonable 3D field for particulate biogenic silica concentrations.
The Si-cycling model considered here explicitly tracks two tracers, DSi and PSi. We thus denote the modelled column vectors for DSi and PSi concentrations by χ mod DSi and χ mod PSi . Biological uptake of silicate in the euphotic zone is essentially modelled after the simple nutrient-restoring scheme of the OCIMP-2 protocol (Najjar et al., 2007) with some slight modification. Specifically, the uptake rate J up (χ mod DSi ) vector is defined by where (z < z 0 ) ensures that uptake only happens in the euphotic zone, and χ mod DSi − α up χ obs DSi + /τ up is the modified nutrientrestoring scheme. Technically, (z < z 0 ) is an abuse of notation and here is meant to be equal to 1 in the euphotic zone and 0 otherwise (z is the column vector depths and z 0 = 80 m is the depth of the base of the euphotic zone). The column vector χ obs DSi represents World Ocean Atlas silicate observations (WOA18 Garcia et al., 2019) regridded to the OCIM v2.0 grid. Here, (x) + is a shortcut notation for x(x > 0), such that (χ mod DSi − α up χ obs DSi ) + /τ up only "activates" when χ mod DSi > α up χ obs DSi . The difference with the standard OCMIP-2 protocol lies in the addition of the α up modifier, which is a scalar that scales the field of observed silicate, allowing the optimized model to better fit observations. With α up close to 1, this parameterization essentially allows for the model to take up silicate in the euphotic zone when the simulated DSi concentration exceeds the observations. All the silicate that is taken up in this model is converted to sinking particulate PSi, which gravitationally settles with 735 optimizable velocity parameter w Si . Particulate biogenic silica is assumed to remineralize and redissolve into DSi in the water column with optimzable timescale τ rem . The rate of remineralization, J rem (χ mod PSi ), is thus simply defined by which essentially closes the Si cycle.
Hence, the steady-state tracer equation for dissolved silicate (DSi) is where J geo (χ mod DSi ) is an added term used to constrain the global inventory of Si in the system, which is set on geological timescales. In practice, we use where T grav is the vertical (downward) transport operator representing the flux divergence of particles. Particulate Si reaching the seafloor is not buried and instead remains in the deepest grid cell until eventual remineralization and redissolution into DSi.
A similar optimization procedure as for the Nd-cycle is applied to optimize the five parameters collected in Table A1. The 750 joint probability distribution of the mismatch between modelled and observed silicate is shown in Fig. A1. While not perfect overall, this simple model achieves a good fit to observations with a root-mean-square error of about 11.7 mmol m −3 . Figure B1 shows the prior distributions and optimized values of the parameters listed in Table 2.   parameters (φ0, φ∞, and z0) (g) Reversible scavenging reaction equilibirum constants, KX , and fractions returned to water when reaching seafloor, fX , for each particle type X. See also Table 2. This appendix describes the effect that enhanced Nd release with extreme ε Nd values has on effectively released ε Nd . Let X be a random variable with normal distribution N (µ ε , σ ε ) denote the observable in situ ε Nd value, and let Y denote the random variable of the effectively released ε Nd value. The mean expected value for Y is given by

(C4)
Author contributions. SKVH, BP, SLG, and SGJ designed the study. HL wrote preliminary MATLAB code. BP wrote the Julia model code, performed simulations, and analyzed data with input from all authors. SKVH and BP wrote the original manuscript draft, and all authors contributed to revision and editing of the manuscript prior to submission.