The Measurement Error Proxy System Model: MEPSM v0.2

Fischer, Matt J.

doi:https://doi.org/10.5194/gmd-17-6745-2024

Articles | Volume 17, issue 17

https://doi.org/10.5194/gmd-17-6745-2024

Articles | Volume 17, issue 17

Model description paper

12 Sep 2024

Model description paper |

| 12 Sep 2024

The Measurement Error Proxy System Model: MEPSM v0.2

Matt J. Fischer

Abstract

Proxy system models (PSMs) are an essential component of paleoclimate data assimilation and for testing climate field reconstruction methods. Generally, current statistical PSMs consider the noise in the output (proxy) variable only and ignore the noise in the input (environmental) variables. This problem is exacerbated when there are several input variables. Here we develop a new PSM, the Measurement Error Proxy System Model (MEPSM), which includes noise in all variables, including noise auto- and cross-correlation. The MEPSM is calibrated using a quasi-Bayesian solution, which leverages Gaussian conjugacy to produce a fast solution. Another advantage of MEPSM is that the prior can be used to stabilize the solution between an informative prior (e.g., with a non-zero mean) and the maximum likelihood solution. MEPSM is illustrated by calibrating a proxy model for δ¹⁸O_coral with multiple inputs (marine temperature and salinity), including noise in all variables. MEPSM is applicable to many different climate proxies and will improve our understanding of the effects of predictor noise on PSMs, data assimilation, and climate reconstruction.

Download & links

Article (PDF, 3624 KB)

Supplement (599 KB)

Download & links

Received: 03 Apr 2023 – Discussion started: 06 Apr 2023 – Revised: 06 Jul 2024 – Accepted: 15 Jul 2024 – Published: 12 Sep 2024

1 Introduction

Proxy system models (PSMs) describe how biological, geological, or chemical archives are imprinted with environmental signals (Evans et al., 2013; Dee, 2015). They are an essential component of paleoclimate data assimilation (Steiger et al., 2018; Tardif et al., 2019; Sanchez et al., 2021; King et al., 2023) and of pseudoproxy experiments, e.g., testing the fidelity of climate reconstruction methods (Loope et al., 2020 a, b). For PSMs, input variables include one or more environmental variables (e.g., temperature, water salinity, rainfall), and output variables include variables which can be read from natural archives, e.g., the abundance of trace elements or isotopes in carbonate archives. The processes between input and output may be linear or nonlinear fitted relationships (statistical PSMs) or more detailed physiochemical models (physiochemical PSMs). Statistical PSMs can be frequentist (i.e., the PSM is calibrated using ordinary least squares, OLS) or Bayesian (PSMs that make use of Bayes' rule). Current paleoclimate data assimilation projects often use frequentist PSMs (e.g., Steiger et al., 2018; Tardif et al., 2019) (see also example 1 in King et al., 2023), but some paleodata assimilation projects have begun to incorporate Bayesian PSMs (King et al., 2023). Current Bayesian PSMs (e.g., Tierney and Tingley, 2014, 2018; Tierney et al., 2019; Malevich et al., 2019) typically have the following form:

\begin{matrix} (1) & ζ \sim N (b_{0} + f (x, θ), σ_{ζ}^{2}), \end{matrix}

where ζ is sampled from the normal distribution 𝒩(μ,σ²), where b₀ is the regression intercept, x and ζ are the input and output variables respectively (note x is a row vector of predictor matrix X), and θ is a set of parameters. These Bayesian models are based on Bayesian OLS rather than Bayesian TLS (total least squares). In comparison, an errors-in-variables (EIV) model (containing error in both x and y) may have the following form:

\begin{matrix} (2) & y \sim N (b_{0} + f (x, B^{*}), d f (x, B^{*})^{'} Σ_{a a} d f (x, B^{*}) + ε_{y}^{2}), \end{matrix}

where b₀ is the regression intercept, $d f (x, B^{*})$ is the derivative of $f (x, B^{*})$ for each observation vector [y x], and the prime symbol (^′) is the transpose operator. Appendix A contains further information about Eq. (2). The covariance matrix Σ_aa is the covariance of the noise a associated with $x = x^{*} + a$ , where x^* denotes the true but unobservable part of x (so x^* is unobservable because it cannot be measured without noise). See Table 1 for notation information (including B^* explained under B). The term measurement error is used here generally to describe any uncertainty associated with the value of the predictors and the response variable, which could include both measurement uncertainty (e.g., from a thermometer) and methodological uncertainty (from the method used to “combine” several measurements, which may be spatially and temporally apart). The model in Eq. (2) is not identifiable, meaning the parameters cannot be uniquely determined without additional information, such as information about Σ_aa. A full Bayesian solution to the problem may also require reformulating the model in terms of a subspace of [Y X]; see, e.g., Florens et al. (1974). The focus of this paper is on a quasi-Bayesian solution to Eq. (2), which leverages Gaussian conjugacy instead of a Markov chain Monte Carlo (MCMC) solution.

In this paper, EIV and WLS (weighted least squares) refer to general approaches. York (1966, 1968) introduced a root-finding solution for one-predictor WLS regression. Ludwig and Titterington (1994) presented a maximum likelihood (ML) solution for a straight line in 3-dimensional space, with heterogenous noise in all variables. Schneider (2001) briefly discussed the multivariate method of total least squares, which assumes homogenous and identically distributed noise for both predictor and response fields. Hannart et al. (2014) introduced a ML algorithm for weighted TLS (WTLS) that accounts for heterogeneity and temporal autocorrelation in the predictor noise and the response noise but lacks noise cross-correlation. Also ML solutions, by their very nature, do not incorporate prior information about the regression coefficients. This study expands the model of Hannart et al. (2014), to include both prior information and generalized noise (auto- and cross-correlation within and between the noise series), in the Measurement Error Proxy System Model (MEPSM). The general EIV model is discussed in Sect. 2.1, MEPSM is formulated in Sect. 2.2, and steps for practical implementation are given in Sect. 2.3. MEPSM is applied to a real example in Sect. 3. Further, Sect. 3.3 investigates the differences that PSM calibration using MEPSM can make to paleodata assimilation.

Table 1Notation used.

^a The operator vec(A) means to column stack the matrix A (Henderson and Searle, 1979). ^b Superscript ⁽ⁱ⁾ refers to an index, rather than subscripting (e.g., $_{z_{i} z_{i}}$ ). ^c For the duplication matrix, see Magnus and Neudecker (1986).

Download Print Version | Download XLSX

Fuller (1990)Amiri-Simkooei and Jazaeri (2012)Hannart et al. (2014)Fang et al. (2017)

Table 2Solutions for the EIV model. Σ_A and Σ_B are given in my notation.

^a The matrix direct sum operator ⊕ is notation to build a block diagonal matrix, e.g., $Diag ([Σ_{a_{1} a_{1}}, \dots, Σ_{a_{n} a_{n}}])$ . ^b Note though that technically $\oplus_{i} Σ_{a_{i} a_{i}} = Σ_{A^{'}}$ , not Σ_A, so ⊕ is used abstractly in this case. ^c Full: Σ_A is not constrained to a particular type.

Download Print Version | Download XLSX

2 Methods

2.1 The EIV model

The general EIV model can be expressed as

\begin{array}{l} (3a) & z_{t} = z_{t}^{*} + q_{t} \\ (3b) & z_{t}^{*} = [y_{t}^{*} x_{t}^{*}] \\ (3c) & q_{t} = [v_{t} a_{t}] \\ (3d) & y_{t}^{*} = b_{0} + x_{t}^{*} B^{*} + ϵ_{t} . \end{array}

Note that, for X^* and B^*, the superscript star also has a second use: it denotes that these entities do not include an intercept term, e.g., $B = {[b_{0} {B^{*}}^{'}]}^{'}$ . From Eq. (3), the relevant covariance matrices are

\begin{matrix} (4) & Σ_{z^{*} z^{*}} = [\begin{array}{cc} σ_{Y^{*} Y^{*}}^{2} & Σ_{Y^{*} X^{*}} \\ Σ_{X^{*} Y^{*}} & Σ_{X^{*} X^{*}} \end{array}], Σ_{Q} = [\begin{array}{cc} Ω_{V} & 0 \\ 0 & Σ_{A} \end{array}], \end{matrix}

and

\begin{matrix} (5) & Σ_{A} = [\begin{array}{cccc} Ω_{A_{1}} & Ω_{A_{1} A_{2}} & \dots & Ω_{A_{1} A_{p}} \\ Ω_{A_{2} A_{1}} & Ω_{A_{2}} & \dots & Ω_{A_{2} A_{p}} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ Ω_{A_{p} A_{1}} & Ω_{A_{p} A_{2}} & \dots & Ω_{A_{p}} \end{array}] . \end{matrix}

See Appendix B for further information on covariance notation. These matrices describe a statistical stationary model. For example, the covariance matrix Σ_A fully specifies the lag autocovariance and lag cross-covariances of the noise variables in A, but (Σ_A)_t=Σ_A. Note this does not mean within Σ_A that $Σ_{a_{i} a_{i}} = Σ_{a_{j} a_{j}}$ or $Ω_{A_{i}} = Ω_{A_{j}}$ (those are specific cases of Σ_A). Also note that in this paper Ω_V=Σ_V. More will be said about Σ_Q in Sect. 3.1.

Table 2 summarizes the main differences between previous solutions of the EIV model. First, these methods all assume that the equation noise ϵ_t (Eq. 3d) is either zero or not separate from v_t. Earlier work by Fuller (1987) shows a way of including the equation noise separate from the response noise, but the predictor noise is limited to the p×p matrices $Σ_{a_{t} a_{t}}$ . The solutions in Table 2 relate to generalized least squares (GLS), weighted total least squares (WTLS), maximum likelihood (ML), and maximum a posteriori (MAP) methods. A general estimator is of the following form:

\begin{array}{l} (6a) & Σ_{Z Z} & = (Z - μ_{Z})^{'} J_{n}^{'} {[\begin{array}{cc} Σ_{Y} & Σ_{Y X} \\ Σ_{X Y} & Σ_{X} \end{array}]}^{- 1} J_{n} (Z - μ_{Z}) \\ (6b) & Σ_{Z^{*} Z^{*}} & = Σ_{Z Z} - Σ_{Q Q} = [\begin{array}{cc} σ_{Y^{*} Y^{*}}^{2} & Σ_{Y^{*} X^{*}} \\ Σ_{X^{*} Y^{*}} & Σ_{X^{*} X^{*}} \end{array}] \\ (6c) & B^{*} & = Σ_{X^{*} X^{*}}^{- 1} Σ_{X^{*} Y^{*}} \\ (6d) & b_{0} & = μ_{Z} \cdot {[\begin{array}{cc} 1 & - {B^{*}}^{'} \end{array}]}^{'}, \end{array}

where μ_Z is the GLS mean, $Σ_{Y} = Σ_{Y^{*}} + Ω_{V}$ , and $Σ_{X} = Σ_{X^{*}} + Σ_{A}$ . The covariance matrix $Σ_{X^{*} X^{*}}$ is a weighted matrix, e.g., $X^{*}^{'} Σ^{- 1} X^{*}$ , but different ways of solving for B allow for different weightings. The assumption is then made that $Σ_{B^{*}} = Σ_{X^{*} X^{*}}^{- 1}$ (e.g., Amiri-Simkooei and Jazaeri, 2012; Fang et al., 2017), despite the fact that different possible weightings exist. The ML method can be used to obtain a theoretically better estimation of Σ_B, but it is computationally expensive if Σ_A is not constrained. For example, in Hannart et al. (2014) Σ_A is a block diagonal matrix: $Σ_{A} = \oplus_{i = 1}^{p} Ω_{A_{i}}$ , where ⊕ is the direct sum operator for matrices (see Table 2 footnotes). The aim of the present study is to provide a solution for the EIV model that includes a fast theory-derived approximation of Σ_B, a prior for Σ_B, and a full Σ_A, including a way to estimate Σ_A with limited initial information.

2.2 MEPSM

As in previous studies, I also ignore the equation error, so the model in Eq. (3) can be re-expressed:

\begin{matrix} (7) & [\begin{array}{c} Z_{11} \\ Z_{21} \\ ⋮ \\ Z_{n 1} \end{array}] = [\begin{array}{cc} 1 & x_{1} \\ 1 & x_{2} \\ ⋮ & ⋮ \\ 1 & x_{n} \end{array}] [\begin{array}{c} b_{0} \\ B^{*} \end{array}] + [\begin{array}{cc} I_{n} & - ({B^{*}}^{'} \otimes I_{n}) \end{array}] [\begin{array}{c} V \\ vec (A) \end{array}], \end{matrix}

where $V \sim N (0_{n}, Ω_{V})$ , and $vec (A) \sim N (0_{n \cdot p}, Σ_{A})$ (see Table 1 footnotes for the operator vec). The response variable is the first column of Z, denoted as Z₁ or more generally as Y. Note also the two possible definitions of response noise for the model: $V = Y - b_{0} - X^{*} B^{*}$ and $e = Y - X B$ (where X includes an intercept term, to match with B). The difference is that

\begin{matrix} (8) & V = Y - X B + A B^{*} = e + A B^{*} . \end{matrix}

From $e = V - A B^{*}$ , the covariance matrix of model prediction can be expressed as

\begin{matrix} (9) & W = Ω_{V} + ({B^{*}}^{'} \otimes I_{n}) Σ_{A} (B^{*} \otimes I_{n}) . \end{matrix}

Bayes' rule states that the posterior density function p(θ|𝒟) is proportional to the product of the likelihood function L(θ) and the prior density function p(θ):

\begin{matrix} (10) & p (θ | D) \propto L (θ) \times p (θ) . \end{matrix}

Note that it does not matter if the likelihood function is a scaled distribution, e.g., a normal density function multiplied by a scalar (so it integrates to a number other than unity).

Assuming that the right-side distributions are Gaussian (or scaled Gaussian), then

\begin{matrix} (11) & N (B, Σ_{B}) \propto N (B_{ml}, Σ_{B}^{ml}) \times N (B_{0}, Σ_{0}), \end{matrix}

where ml refers to the maximum likelihood solution. The posterior 𝒩(B, Σ_B) integrates to unity, as expected. First I show an example by using the GLS likelihood. Let the GLS likelihood be

\begin{matrix} (12) & N (B_{ml}, Σ_{B}^{ml}) = N ((X^{'} Σ^{- 1} X)^{- 1} X^{'} Σ^{- 1} Y, (X^{'} Σ^{- 1} X)^{- 1}) . \end{matrix}

Now apply the Gaussian multiplication identity:

\begin{matrix} (13) & \begin{aligned} N & (C (Σ_{B}^{ml - 1} B_{ml} + Σ_{0}^{- 1} B_{0}), C) \propto N (B_{ml}, Σ_{B}^{ml}) \\ \times N (B_{0}, Σ_{0}), \end{aligned} \end{matrix}

where $C = {(Σ_{B}^{ml - 1} + Σ_{0}^{- 1})}^{- 1}$ . Thus the mean and variance of p(θ|𝒟) are

\begin{array}{l} (14a) & B & = (X^{'} Σ^{- 1} X + Σ_{0}^{- 1})^{- 1} (X^{'} Σ^{- 1} Y + Σ_{0}^{- 1} B_{0}) \\ (14b) & Σ_{B} & = (X^{'} Σ^{- 1} X + Σ_{0}^{- 1})^{- 1} . \end{array}

The above solution is a simple example using the GLS likelihood. To obtain the solution for the EIV problem, the GLS likelihood is replaced with the WTLS likelihood $N (B_{wtls}, Σ_{B}^{wtls})$ , where B_wtls and $Σ_{B}^{wtls}$ are derived in Appendix C and D. Using the WTLS likelihood, simplifying the product $Σ_{B}^{ml - 1} B_{ml}$ (from Eq. 13) is no longer feasible, so instead we write the mean and variance of p(θ|𝒟) simply as

\begin{array}{l} (15) & B & = {({Σ_{B}^{wtls}}^{- 1} + Σ_{0}^{- 1})}^{- 1} ({Σ_{B}^{wtls}}^{- 1} B_{wtls} + Σ_{0}^{- 1} B_{0}) \\ (16) & Σ_{B} & = {({Σ_{B}^{wtls}}^{- 1} + Σ_{0}^{- 1})}^{- 1} . \end{array}

2.3 Implementation

An application to calibrate a real PSM is provided in Sect. 3. Here general aspects of the solution and implementation are discussed.

The predictor noise and response noise, A and V, can be estimated as

\begin{array}{l} (17) & A & = - (B^{*^{'}} \otimes I_{n}) Σ_{A} (I_{p} \otimes W^{- 1} e) \\ (18) & V & = Ω_{V} W^{- 1} e . \end{array}

Matrix A is needed for the calculation of both B_wtls and $Σ_{B}^{wtls}$ , while V is needed to re-estimate Ω_V if relevant (see next paragraph).

The n×n covariance matrices $Ω_{A_{i}}$ (or $Ω_{A_{i} A_{j}}$ ) and Ω_V ideally need information about the cross-correlation within A and autocorrelation of A and V. If the ACF (autocorrelation function) and CCF (cross-correlation function) are not available, these can potentially be estimated using an iterative procedure. The procedure begins with vectors $σ_{A}^{2}$ and $σ_{V}^{2}$ , which are the heteroscedastic error variances (the diagonals of Σ_A and Ω_V).

Initialize $Σ_{A} = Diag (σ_{A} ⊙ σ_{A})$ and $Ω_{V} = Diag (σ_{V} ⊙ σ_{V})$ , where ⊙ denotes the Hadamard product.
Estimate the first posterior, 𝒩(B, Σ_B), and the predictor and response noise, A and V (Eqs. 15–18).
Calculate the ACF (for A and V) and the CCF (for A).
Estimate Σ_A and Ω_V as in Appendix E, and recalculate the posterior distribution.

3 Application

3.1 A proxy system model, uncertainties, and the prior

A bivariate (i.e., two-predictor) PSM for coral δ¹⁸O is

\begin{matrix} (19) & δ^{18} O_{coral} = b_{0} + b_{1} SST + b_{2} SSS, \end{matrix}

where SST is sea surface temperature and SSS is sea surface salinity. Note that in the coral literature, the coefficients b₁ and b₂ are commonly referred to as a₁ and a₂, but here I use the former symbols for generality. This bivariate PSM has been used in data assimilation (Sanchez et al., 2021) and Monte Carlo experiments (Thompson et al., 2022). The bivariate PSM can be extended by rewriting it in the form of Eq. (7). The MEPSM requires estimates of the time-varying uncertainty of all variables and a prior distribution for B (see below). Here the time-varying uncertainties for the predictors are obtained from recent SST and SSS products that provide the complete uncertainty field $ε^{2} (long, lat, t)$ , i.e., the uncertainty variance for each grid cell, at each time point.

For ERSSTv5, for each grid cell and time point, the SST total uncertainty $ε_{T}^{2} (long, lat, t)$ is the sum of the parametric uncertainty and reconstruction uncertainty. These uncertainties are derived from a large reconstruction ensemble (Huang et al., 2020). The parametric uncertainty is based on the difference between the ensemble average and each ensemble member, given various internal parameters which affect SST uncertainty (including the uncertainty of the ship, buoy, or float measurement). The reconstruction uncertainty is based on the difference between each ensemble member and a pseudo-observation dataset on which the reconstruction method is trained, e.g., OISST (optimally interpolated SST). Note that the calculation of reconstruction uncertainty includes grid cells that have observational or no observational data, while the calculation of parametric uncertainty includes only grid cells that have observational data.

For HadEN4, the SSS total uncertainty $ε_{S}^{2} (long, lat, t)$ is a combination of background uncertainty and observational uncertainty (Good et al., 2013). The observational uncertainty is obtained from previous studies, while the background uncertainty is the uncertainty associated with a persistent-based forecast.

For δ¹⁸O_coral, the main known uncertainty is the analytical uncertainty, which is ∼0.05 ‰ for δ¹⁸O_coral (Osborne et al., 2013). Time variation could be included by expressing the analytical uncertainty as a relative standard deviation of the mean δ¹⁸O (the effect of this will be tested elsewhere). Other uncertainties, which are unknown at many sites, include the intra- and inter-colony noise (Sayani et al., 2019). Note also that detrending adds time-varying uncertainty, because the trend uncertainty is larger at the ends of the time series (this is applicable to all detrended variables).

For MEPSM, the matrix Σ_A can be initially constructed using Diag(σ_A⊙σ_A), where $σ_{A}^{'} = [ε_{T} (site, \cdot), ε_{S} (site, \cdot)]$ , “site” refers to the grid cell of a particular site, and the $(site, \cdot)$ notation means over all time points. In the SST and SSS products, no information is given on the noise serial correlation, and no information exists on the noise cross-correlation between SST and SSS products. Seasonal peaks in the autocorrelation function (ACF) of SST noise could arise from, for example, seasonal changes in shipping tracks, which would affect the parametric uncertainty. Cross-correlation between SST and SSS noise might arise from sampling these variables at the same points within a particular grid cell (i.e., from the same ship) or due to seasonal variation in shipping tracks. Future work should investigate the ACF and CCF (cross-correlation function) for the SST and SSS ensembles (this is beyond the scope of the current paper). Finally, since the uncertainties in the predictors and response are fundamentally different, then Σ_Q is assumed to be a block diagonal matrix, with Ω_V and Σ_A on the block diagonal and zero-filled blocks on the block anti-diagonal (Eq. 4).

For Bayesian methods, informative prior distributions are thought to be more useful than noninformative priors (Lemoine, 2019). For example, when using a fully noninformative prior distribution, the posterior distribution will be close to the maximum likelihood distribution, which would make using the fully noninformative prior somewhat obsolete.

The prior distribution for B is obtained from previous values that were used in Monte Carlo PSM experiments (Thompson et al., 2022; Watanbe and Pfeiffer, 2022). Those previous studies did not seek to calculate Bayesian posterior distributions, but the values in those studies represent current beliefs about the coefficients and their uncertainty and can be used as a prior. The prior distribution obtained from those studies is

\begin{matrix} (20) & \begin{aligned} p (B) & = N (B_{0}, Σ_{0}) \\ = N (\begin{array}{c} [\begin{array}{c} b_{0} \\ - 0.22 \\ 0.27 s \end{array}], [\begin{array}{ccc} σ_{b_{0}}^{2} & 0 & 0 \\ 0 & {0.02}^{2} & 0 \\ 0 & 0 & {0.15}^{2} \end{array}] \end{array}), \end{aligned} \end{matrix}

where s=0.97 is a scaling factor between δ¹⁸O_seawater (in VSMOW) and δ¹⁸O_coral (in VPDB). The values of $b_{1} = - 0.22$ and $σ_{b_{1}}^{2} = {0.02}^{2}$ are from Thompson et al. (2022) and Watanbe and Pfeiffer (2022) respectively. The value of $s^{- 1} b_{2} = 0.27$ ‰ psu⁻¹ originates from LeGrande and Schmidt (2006) and is an average ‰ per salinity value for the tropical Pacific Ocean. Regional variations in this value will probably be revealed by new datasets (DeLong et al., 2022), but for the purpose of this example of MEPSM, the value of 0.27 ‰ psu⁻¹ is used. Thompson et al. (2022) considered two values for $σ_{b_{2}}^{2}$ (i.e., $σ_{b_{2}}^{2} = {{0.1}^{2}, {0.2}^{2}}$ ), so for this example an ”average” value is used (0.15²). The value of the intercept b₀ and its uncertainty $σ_{b_{0}}^{2}$ are unknown in the sense that they are site-dependent values. A prior for these unknown values can be estimated by using the other prior coefficients and SST and SSS data for a particular site. For a particular site, the prior value of b₀ can be calculated as

\begin{matrix} (21) & b_{0} = μ_{y} - μ_{X} [\begin{array}{c} - 0.22 \\ 0.27 s \end{array}], \end{matrix}

where $μ_{X} = [μ_{X_{1}} μ_{X_{2}}]$ and μ_y are the mean values of the predictors and response. From ordinary least squares theory, the diagonal elements of the covariance of B are typically given as

\begin{matrix} (22) & σ_{b_{j}}^{2} = σ_{e}^{2} / {RSS}_{j}, \end{matrix}

where $σ_{e}^{2}$ is the error variance, and RSS_j is the residual sum of squares after regressing X_j on X_−j, where X_−j is the matrix X omitting column j. So for $σ_{b_{0}}^{2}$ ,

\begin{matrix} (23) & {RSS}_{0} = n - n^{2} μ_{X}^{'} (X_{- 0}^{'} X_{- 0})^{- 1} μ_{X}, \end{matrix}

where X₋₀ is the observed part of X not including an intercept term (different than X^*, which is the unobserved part of X). Preliminary analysis showed that the prior value for $σ_{b_{0}}^{2}$ had a large influence on the posterior distribution of B, so $σ_{b_{0}}^{2}$ only was scaled to be uninformative, i.e., $10^{3} σ_{b_{0}}^{2}$ .

3.2 Coral example

Rock Islands (7.27° N, 134.38° E) is a site on the western barrier reef of Palau, western Pacific Ocean. A number of coral cores were sampled from Rock Islands (RI) and the surrounding area, but the focus here is on one core labeled RI6, which is a monthly resolved record spanning 1899–2008. The core was sampled from a narrow-entrance lagoon, in shallow water (∼2 m depth). Osborne et al. (2013) compared the coral δ¹⁸O record with temperature data from the NCEP/NCAR Reanalysis (Kalnay et al., 1996) and salinity data from LEGOS (Delcroix et al., 2011). Using the raw (not detrended or deseasonalized) RI record from 1970–2008 and ordinary least squares, Osborne et al. (2013) regressed the coral δ¹⁸O against surface temperature and salinity, producing the following equation:

\begin{matrix} (24) & δ^{18} O_{coral} = - 0.06 SST + 0.43 SSS - 18.70; \end{matrix}

see Table A4 in Osborne et al. (2013). Osborne et al. (2013) attributed the small SST coefficient to the small seasonal SST range at Palau (∼1.5 °C), presumably because the small SST range makes the data scatter with δ¹⁸O more spherical than ellipsoid. They did not directly compare their raw SSS coefficient with any other data. Hence the RI δ¹⁸O record was chosen for this example, in order to further investigate the SST and SSS effects on coral δ¹⁸O at this site. In the following analysis the main differences with the analysis of Osborne et al. (2013) are that here the SST and SSS data are extracted from ERSSTv5 and HadEN4, because these products provide the uncertainty fields (Sect. 3.1). Secondly, the time period examined is from 1950–2008. Thirdly, for this study, all variables were detrended (using linear detrending and without removing the intercept), in order to ensure stationarity in the mean. For all variables, the trends at this site were weak. For example, for δ¹⁸O_coral the trend from 1950–2008 was $- 0.00135 \pm 0.004$ ‰ per year. A Jupyter Notebook with these preprocessing steps is available in the Supplement. Using this updated (and detrended) data, the OLS regression was recalculated as

\begin{matrix} (25) & δ^{18} O_{coral} = - 0.13 SST + 0.42 SSS - 16.25 . \end{matrix}

Next MEPSM was applied to the same data, following the steps in Sect. 2.3. Except, for the application presented here, only Σ_A was updated in step 4. Ω_V was set as $Ω_{V} = Diag (σ_{V} ⊙ σ_{V})$ throughout, because here σ_V is thought to be mostly analytical noise, which should be approximately white (the uncertainty due to detrending is time-varying as well as white). For Σ_A, for the auto- and cross-correlation functions of the predictor noise, up to 100 lags (months) were retained to construct the submatrices $R_{A_{i}}$ and $R_{A_{i} A_{j}}$ (see Appendix E). Experiments with 50 and 150 lags showed little difference in the final posterior. In any application of MEPSM, the user should consider the source of the predictor and response noise in constructing Σ_A and Ω_V (as in Sect. 3.1).

Figure 1 shows the prior and the first and final posterior distributions for the SST and SSS coefficients. The first posterior distribution (dashed line) is obtained after step 2 (Sect. 2.3), while the final posterior is obtained after updating Σ_A for auto- and cross-correlation. The (final) mean SST and SSS coefficients are given in Fig. 2: $b_{1} = - 0.25$ ‰ °C⁻¹ and b₂=0.49 ‰ psu⁻¹. Figure 2 shows the marginal plots of the 3-dimensional scatter plot (SST, SSS, δ¹⁸O), together with the prior and posterior regression lines. For Rock Island, the SST and SSS coefficients are both steeper for the final posterior, compared to the OLS equation (Eq. 25).

https://gmd.copernicus.org/articles/17/6745/2024/gmd-17-6745-2024-f01

Figure 1The prior and posterior distributions for the SST (b₁) and SSS (b₂) coefficients. The final posterior is wider than the first posterior, because the final posterior accounts for auto- and cross-correlation in Σ_A. In Figs. 1–3 the first posterior is the dashed line.

Download

https://gmd.copernicus.org/articles/17/6745/2024/gmd-17-6745-2024-f02

Figure 2The regression lines of the prior and posterior distributions. In 3 dimensions, these regression lines are the sides of a plane. Each shaded point corresponds to a monthly value from 1950–2008.

Download

https://gmd.copernicus.org/articles/17/6745/2024/gmd-17-6745-2024-f03

Figure 3The effect of the regularization parameter λ on the SST (b₁, top) and SSS (b₂, bottom) coefficients. The left and right side of the plots correspond to the prior distribution p(B) and likelihood distribution L(B) respectively.

Download

Another advantage of Bayesian analysis is that attaching a ridge parameter λ to the prior variances means that the method can be used to regularize the solution, in a way that is different than traditional methods of regularization, such as principal component analysis, where components are retained by hard thresholding. Here, the ridge parameter λ is attached to the prior and allowed to vary over the range 10⁻⁴ to 10²:

\begin{matrix} (26) & p (B) = N (\begin{array}{c} [\begin{array}{c} b_{0} \\ - 0.22 \\ 0.27 s \end{array}], [\begin{array}{ccc} 10^{3} σ_{b_{0}}^{2} & 0 & 0 \\ 0 & λ {0.02}^{2} & 0 \\ 0 & 0 & λ {0.15}^{2} \end{array}] \end{array}) . \end{matrix}

Figure 3 shows the effect of λ for both the first and final posterior. As λ increases, B moves from the prior distribution to the likelihood distribution. The difference between the solid and dashed lines (for each coefficient) is because of the effect of adjusting Σ_A for auto- and cross-correlation, which becomes less relevant for small λ ( $λ < 10^{- 2}$ ). More will be said about serial dependence in A below. This type of prior regularization will likely be useful for sites where the (p+1)-dimensional scatter of data (predictors plus response) is more spherical than ellipsoid. For example, a more spherical scatter of data could occur at sites where the SST range is small. In practice, the value of λ could be estimated using cross-validation methods.

Table 3Regression parameters and their standard deviation, from four different solutions.

Download Print Version | Download XLSX

Table 3 compares the regression coefficients (and variance) from four different solutions: a ML solution (Appendix F), the WTLS method (Appendix C and D), and the MEPSM solution for the first and final posterior. The maximum likelihood estimate (MLE) and WTLS solution are basic implementations, i.e., no prior and no adjustment for autocorrelation or cross-correlation. The MLE and WTLS solutions are similar, with the variance being generally more conservative for WTLS, owing either to assumptions in the two methods or differences in method implementation, e.g., optimization for MLE versus analytical solution (with Gaussian assumptions) for WTLS. For the first posterior of MEPSM, both the mean and the variance move away from the WTLS solution and towards a tight prior: in Fig. 3 the MEPSM first posterior (on the dashed line) is the same as the λ=10⁰ solution, and the basic WTLS solution is the same as λ≥10². For the final posterior (Table 3), the SST and SSS coefficients change steepness: b₁ becomes steeper, while b₂ becomes a little less steep compared to the WTLS solution and the first posterior (also seen in Fig. 3 as the difference between the solid and dashed lines). For both coefficients (Table 3), the variance increases for the final posterior (relative to the first posterior), because the final posterior adjusts for the auto- and cross-correlation in Σ_A.

Figure 4 shows the autocorrelation of the noise in each variable (A₁ is the SST noise, and A₂ is the SSS noise), for the first and final posterior distribution. For the first posterior solution, the noise covariance in Σ_A and Ω_V is assumed to be heteroscedastic but white. This assumption is clearly not true, because all variables appear to have seasonally varying noise as shown by their autocorrelation functions (Fig. 4, left column). However, when this seasonally varying noise is included in Σ_A (but not in Ω_V), the response noise V essentially becomes white, as all the seasonal dependence moves into the predictor noise (Fig. 4, right column). Whether or not the response noise V should have seasonal variation (see Sect. 3.1) requires expert knowledge. This example is simply to illustrate the effect of including serially correlated noise in a multipredictor model. The implications of seasonally varying noise for coral PSMs will be addressed in another paper, with the application of MEPSM to a large coral database (Walter et al., 2023).

https://gmd.copernicus.org/articles/17/6745/2024/gmd-17-6745-2024-f04

Figure 4The autocorrelation function of the response noise (V), the SST noise A₁, and the SSS noise A₂, for the first posterior (left column) and the final posterior (right column). The predictor and response noise were calculated using Eqs. (17)–(18).

Download

Table 4Notation for Sect. 3.3 (unless otherwise stated in text).

^∗ In the experiments, p=5 and m=10 because there are two state variables and five grid points.

Download Print Version | Download XLSX

3.3 Data assimilation experiments

Two data assimilation (DA) experiments were performed using the Palau coral example. The purpose of these experiments is to investigate the effect of different H on data assimilation (Appendix G introduces DA and describes potential biases caused by misspecified H). For the experiments, the model prior covariance Σ_εε was obtained from iCESM1 (Brady et al., 2019), using a preindustrial run, and data from the last 400 years (1450–1850 CE, which is “closer” to the coral example period 1950–2007). Model SST and SSS data were extracted from a line of five grid cells, centered on the RI6 location. Although there is only one observational coral record, for these experiments, the same observational record is repeated for the five grid cells (Y^o in DA). All data were deseasonalized but kept at monthly resolution. A line of grid cells was used, so the covariance Σ_εε includes some spatial correlation. The reasons for using covariance matrices in these experiments will become apparent. The covariances Σ_εε and Σ_rr were the same for both experiments and were constructed as follows:

$Σ_{ε ε} = Var (X - {\tilde{U}}_{k} D_{k} V_{k}^{'})$ , where $X = {UDV}^{'}$ is the singular value decomposition of X, k=1:3 (means the three leading eigencomponents), and $\tilde{U}$ denotes components filtered with a bandpass filter (2–20 years). Here interannual variability (as in U_k) is considered to be the state of interest. Filtering U better separates signal and noise.
$Σ_{r r} = FH Σ_{ε ε} H^{'} F$ , where $F = Diag ([\frac{σ_{V}}{σ_{h_{1}}}, \dots, \frac{σ_{V}}{σ_{h_{p}}}])$ and $Diag (H Σ_{ε ε} H^{'}) = [σ_{h_{1}}^{2}, \dots, σ_{h_{p}}^{2}]$ . This means that Σ_rr is a covariance matrix with the same correlation structure as HΣ_εεH^′ but with variance scaling provided by $σ_{V}^{2} = Var (V)$ , where V is from Eq. (18) (for DA, Var(V) is just the total variance of V). Also here $H = B_{post2}^{*^{'}} \otimes I_{p}$ , where B_post2 denotes the values of B for the final posterior in Table 3. In the experiments below, keeping Σ_rr constant simplifies the comparisons.

To understand the experiment results, it is useful to cast data assimilation into observation space:

\begin{matrix} (27) & \begin{aligned} y^{a} & = H x^{a} = HK y^{o} + (I_{p} - HK) H x^{b} \\ = \frac{(H Σ_{ε ε} H^{'}) y^{o} + Σ_{r r} y^{b}}{(H Σ_{ε ε} H^{'} + Σ_{r r})}, \end{aligned} \end{matrix}

where $K = Σ_{ε ε} H^{'} (H Σ_{ε ε} H^{'} + Σ_{r r})^{- 1}$ , and x^b, y^o, and y^b are treated like column vectors (to simplify the syntax). As expected, the analysis y^a is a weighting between the prior $N (y^{b}, H Σ_{ε ε} H^{'})$ and some likelihood 𝒩(y^o, Σ_rr).

In both experiments that follow, the value of H in Eq. (27) changes (but not the value of Σ_rr, as defined above). Three different $H = {B^{*}}^{'} \otimes I_{p}$ are tested, which are constructed from three values of B^*: $B_{ols}^{*} = [- 0.13, 0.42]$ , $B_{post1}^{*} = [- 0.189, 0.635]$ , and $B_{post2}^{*} = [- 0.248, 0.492]$ (see Eq. 25 and Table 3). The difference between the two experiments is the value of y^b=Hx^b.

The data assimilation was implemented using a reduced-rank Kalman filter (Elbern et al., 2014) (rank reduction of Σ_εε is only used in Experiment 2). A notebook and code module for the experiments is available (Fischer, 2024). Note in the following subsections that V only refers to eigenvectors (see Table 4) and not the noise from MEPSM.

3.3.1 Experiment 1

In Experiment 1, x^b (and thus y^b) is set to zero. This mimics an offline DA situation where the model prior might be centered (i.e., zero mean) and constant. In addition, a scaling parameter λ is added to the prior, i.e., $N (y^{b}, H (λ Σ_{ε ε}) H^{'})$ . If the prior covariance is large, then the prior is noninformative, and y^a≈y^o. Here a tighter prior is of interest, so λ is set to λ=0.1. Henceforth Σ_εε refers to a scaled prior covariance matrix, but the λ term is dropped.

https://gmd.copernicus.org/articles/17/6745/2024/gmd-17-6745-2024-f05

Figure 5For Experiment 1 ( $λ = 0.1, y^{b} = 0_{p}$ ), (a) the time series Y^a (solid lines, color key as in b) and (b) the eigenvalue shrinkage of the projection matrix (see text), for three different values of H. These experiments used five model grid points, but here Y^a is plotted for the central grid point. In (a) the dashed line is Y^o.

Download

The DA posterior time series Y^a is shown for the three different values of H in Fig. 5a, and the eigenvalue shrinkage (explained below) is shown in Fig. 5b. For λ=0.1, the Y^a resulting from H_ols shows the most shrinkage towards the zero-mean prior (Fig. 5a).

To understand the results from theory, two pieces of information are needed (see Table 4 for notation):

Λ(ΣC)=Λ(Σ^0.5CΣ^0.5), where Σ and C are real symmetric square matrices.
HΣ_εεH^′ and Σ_rr have similar eigenvectors (because here Diag(F) is near constant) but different eigenvalues.

Setting y^b to zero removes the Σ_rry^b term in Eq. (27), leaving $y^{a} = \frac{(H Σ_{ε ε} H^{'})}{(H Σ_{ε ε} H^{'} + Σ_{r r})} y^{o}$ . The eigenvalues of the p×p projection matrix $\frac{(H Σ_{ε ε} H^{'})}{(H Σ_{ε ε} H^{'} + Σ_{r r})}$ are of interest here. It follows (from the points above) that the projection matrix has the same eigenvalues as

\begin{matrix} (28) & \begin{aligned} (H Σ_{ε ε} H^{'})^{0.5} (H Σ_{ε ε} H^{'} + Σ_{r r})^{- 1} (H Σ_{ε ε} H^{'})^{0.5} \\ = {VDV}^{'} ({VD}^{2} V^{'} + {VS}^{2} V^{'})^{- 1} {VDV}^{'} \\ = VD (D^{2} + S^{2})^{- 1} {DV}^{'}, \end{aligned} \end{matrix}

where D² and S² are the eigenvalue matrices of HΣ_εεH^′ and Σ_rr respectively. So in this example, the shrinkage factors of the projection matrix can be expressed as $d_{j}^{2} / (d_{j}^{2} + s_{j}^{2})$ . Figure 5b shows that using H_ols shrinks the eigenvalues more than for H from MEPSM. In this introductory experiment, the results have been examined in observation space; the next experiment will consider both x^a and y^a.

3.3.2 Experiment 2

In Experiment 2, model SST and SSS data were obtained for 1950–2005 from the iCESM historical run (which ended in 2005), for x^b. The model data were deseasonalized and filtered using a low-pass (>2-year) filter (so here x^b is a smooth prior, not a zero vector). This setup mimics data assimilation with a transient prior, e.g., offline data assimilation with time windows (windows are a type of prior smoothing). Note, for this experiment, that the model trajectory does not need to strongly match the observed historic trajectory. Further, Σ_εε is not a full rank matrix (the first three eigenvalues span 98% of the total variance), so for Experiment 2 only the first three leading components of Σ_εε are used (which helps to regularize X^a). Figure 6 shows, for λ=0.1, the three time series Y^o, Y^b, and Y^a for each different H (named $H_{ols}, H_{post1}, H_{post2}$ , where the latter two refer to the first and final posterior for MEPSM). Figure 6 shows the advantage of H from MEPSM is that, for example, y^b=H_post1x^b has a greater range and thus allows y^a to better capture the peaks and troughs in y^o than can y^b=H_olsx_b. To further compare the results for the different H, (X^a, Y^a) was calculated for $λ = [10^{- 2}, \dots, 10^{2}]$ and compared with $(X^{*}, Y^{*})$ using the mean absolute error metric:

\begin{matrix} (29) & mae (Y^{a}, Y^{*}) = n^{- 1} \sum_{i = 1}^{n} | y_{i}^{a} - y_{i}^{*} | . \end{matrix}

This was done similarly for $mae (X^{a}, X^{*})$ . Note that $(X^{*}, Y^{*})$ , the estimated state from MEPSM, has no direct part in the DA.

Figure 7 shows that for this site, for H_post2 (and H_post1) the analysis $X_{sst}^{a}$ and Y^a typically have a smaller mae compared to H_ols. Also, mae decreases as λ increases. For $mae (Y^{a}, Y^{*})$ , the decrease flattens when y^a≈y^o (for λ>1, Fig. 7b), but note that Y^o is not Y^*. So in this example, it seems that a solution (X^a) better than the likelihood (y^o=Hx^a) cannot be achieved, which means that H from MEPSM wins (lower mae for $X_{sst}^{a}$ ), because H_ols biases $X_{sst}^{a}$ (Fig. 7a). For H_ols, there is less bias for $X_{sss}^{a}$ , perhaps owing to the relative difference between OLS and MEPSM with respect to b₂ (the SSS coefficient). Performing similar data assimilation experiments on global climate proxy datasets will better inform such issues and regional variation.

https://gmd.copernicus.org/articles/17/6745/2024/gmd-17-6745-2024-f06

Figure 6For Experiment 2 ( $λ = 0.1, y^{b} = H x^{b}$ ), the time series of Y^o, Y^b, and Y^a (color key right) for three different H (on left axis). As in Fig. 5 Y^a and Y^b refer to the central grid point. In the panels, Y^o is the same, but Y^b and Y^a depend on H. Note how y^a is always between y^b and y^o.

Download

https://gmd.copernicus.org/articles/17/6745/2024/gmd-17-6745-2024-f07

Figure 7For Experiment 2 (y^b=Hx^b), the mean absolute error for different λ for (a) X^a and (b) Y^a (using the central grid point series) and for the three different H (color key as in b). Panel (a) shows the mae for $X_{sst}^{a}$ (solid lines) and $X_{sss}^{a}$ (long dashed lines), while panel (b) is the mae for Y^a (see Eq. 29). Small values of λ mean y^a≈y^b, while large λ values mean y^a≈y^o.

Download

4 Conclusions

MEPSM is a new type of proxy system model which incorporates both prior information and generalized noise in all variables, such as cross-correlated noise within the predictor variables. DA theory and two experiments suggest that, in data assimilation, the MEPSM should have an advantage over conventional OLS-calibrated PSMs (the latter are widely employed in current paleo-DA projects). In the work presented here for MEPSM, the response noise is assumed to be independent of the predictor noise (the zero matrices in Σ_Q), but cross-correlation between the predictor and response noise is possible (if needed), by complete quadratic multiplication of Σ_Q, rather than treating Ω_V and Σ_A separately, in Eq. (9). However, for many PSMs the predictor and response noise are fundamentally different and therefore should be independent. The next step is to apply MEPSM to calibrate multiple-input proxy system models for different climate proxies (e.g., isotopes in corals, tree rings, and speleothems) and to incorporate MEPSM into current data assimilation projects.

Appendix A: Basis of Eq. (2)

Let

\begin{matrix} (A1) & y = f (x_{1} - a_{1}, \dots, x_{p} - a_{p}) . \end{matrix}

A first-order Taylor expansion gives

\begin{matrix} (A2) & y \approx f (x_{1}, \dots, x_{p}) - \sum_{k = 1}^{p} [\frac{\partial f (x_{1}, \dots, x_{p})}{\partial x_{k}} a_{k}] \end{matrix}

such that the variance of f(x) is $Var (f (x)) \approx d f (x)^{'} Σ_{a a} d f (x)$ . The first-order Taylor expansion will be exact when f(x) is a linear function.

Appendix B: Notation for covariance

Covariance can be given in plain or scaled format, e.g.,

\begin{matrix} (B1) & Σ = σ^{2} S . \end{matrix}

So, just as an example, a naive covariance matrix for B could be written as follows:

\begin{matrix} (B2) & Σ_{B} = (X^{*'} Σ^{- 1} X^{*})^{- 1} = σ^{2} (X^{*'} S^{- 1} X^{*})^{- 1} . \end{matrix}

In this paper I use the plain format because it simplifies many expressions.

Appendix C: The WTLS likelihood and its mean B_wtls

The target function for WTLS essentially consists of the quadratic component of the log-likelihood function:

\begin{matrix} (C1) & \begin{aligned} ϕ_{wtls} ([B, X^{*}]) & = {(\begin{array}{c} vec (Z) - [\begin{array}{c} Y^{*} \\ vec (X^{*}) \end{array}] \end{array})}^{'} \\ \cdot Σ_{Q}^{- 1} (\begin{array}{c} vec (Z) - [\begin{array}{c} Y^{*} \\ vec (X^{*}) \end{array}] \end{array}), \end{aligned} \end{matrix}

where $Y^{*} = [\begin{array}{cc} 1_{n} & X^{*} \end{array}] B$ . For $\partial ϕ / \partial B$ , Eq. (C1) reduces to

\begin{array}{l} (C2a) & \frac{\partial ϕ_{wtls}}{\partial B} & = \frac{\partial [(Y - b_{0} - X^{*} B^{*})^{'} Ω_{V}^{- 1} (Y - b_{0} - X^{*} B^{*})]}{\partial B} \\ (C2b) & = 2 {\tilde{X}}^{'} Ω_{V}^{- 1} V, \end{array}

where $\tilde{X} = [\begin{array}{cc} 1_{n} & X^{*} \end{array}]$ . Next substitute V with Eq. (18):

\begin{matrix} (C3) & \frac{\partial ϕ_{wtls}}{\partial B} = 2 {\tilde{X}}^{'} W^{- 1} e . \end{matrix}

The residual vector e can be written as

\begin{matrix} (C4) & e = (Y - b_{0} - A B^{*}) - X^{*} B^{*} \end{matrix}

\begin{matrix} (C5) & e = Y - X B . \end{matrix}

Then setting Eq. (C3) to zero, substituting for e, and rearranging gives

\begin{matrix} (C6) & \begin{aligned} B_{wtls} & = {[(X - A)^{'} W^{- 1} (X - A)]}^{- 1} \\ \cdot [(X - A)^{'} W^{- 1} (Y - A B)] \end{aligned} \end{matrix}

\begin{matrix} (C7) & B_{wtls} = {[(X - A)^{'} W^{- 1} X]}^{- 1} [(X - A) W^{- 1} Y], \end{matrix}

where A now may include a column of zeros (corresponding to noiseless predictors in X).

Equation (C7) expands to

\begin{matrix} (C8) & B_{wtls} = {[X^{'} W^{- 1} X - A^{'} W^{- 1} X]}^{- 1} [X^{'} W^{- 1} Y - A^{'} W^{- 1} Y], \end{matrix}

where ( $A^{'} W^{- 1} X$ ) is the weighted covariance between A and the predictors, and ( $A^{'} W^{- 1} Y$ ) is the weighted covariance between A and the response vector. This shows that WTLS can be expressed as an adjusted least squares problem. Hence, it can also be expressed as an augmented linear regression:

\begin{matrix} (C9) & Y - A B = (X - A) B + (V - A B), \end{matrix}

which is equivalent to Eq. (7).

Appendix D: Covariance of B_wtls

One approximation for Σ_B is as follows. Let

\begin{matrix} (D1) & B = K Y = K (X B + e) . \end{matrix}

If KX=𝕀, then

\begin{matrix} (D2) & Σ_{B} = K Σ K^{'}, \end{matrix}

where $Σ = Diag (e \otimes e)$ . For B_wtls, from Eqs. (D1) and (C7) it follows that $K = {[(X - A)^{'} W^{- 1} X]}^{- 1} (X - A)^{'} W^{- 1}$ .
Therefore,

\begin{matrix} (D3) & Σ_{B}^{wtls} = {[(X - A)^{'} W^{- 1} X]}^{- 1} M {[(X - A)^{'} W^{- 1} X]}^{- 1^{'}}, \end{matrix}

where $M = (X - A)^{'} W^{- 1} Σ W^{- 1} (X - A)$ .

Appendix E: Construction of Σ_A and Ω_V

For two predictors, Σ_A is

\begin{matrix} (E1) & Σ_{A} = [\begin{array}{cc} Ω_{A_{1}} & Ω_{A_{1} A_{2}} \\ Ω_{A_{2} A_{1}} & Ω_{A_{2}} \end{array}] . \end{matrix}

Σ_A can be constructed as

\begin{matrix} (E2) & Σ_{A} = Diag (σ_{A})^{'} [\begin{array}{cc} R_{A_{1}} & R_{A_{1} A_{2}} \\ R_{A_{2} A_{1}} & R_{A_{2}} \end{array}] Diag (σ_{A}), \end{matrix}

where Diag(⋅) is a diagonal matrix, and σ_A is a (n⋅p)-length vector containing the heteroscedastic “variance” of the predictor noise for each predictor (stacked vertically). The central block matrix, made up of submatrices R, will be labeled $\tilde{R}$ . The submatrices of $\tilde{R}$ are n×n correlation matrices, containing the autocorrelation or cross-correlation information for the predictor noise A:

\begin{matrix} (E3) & R = [\begin{array}{cccccc} ρ_{0} & ρ_{1} & ρ_{2} & ρ_{3} & \dots & ρ_{n - 1} \\ ρ_{1} & ρ_{0} & ρ_{1} & ρ_{2} & \dots & ρ_{n - 2} \\ ρ_{2} & ρ_{1} & ρ_{0} & ρ_{1} & \dots & ρ_{n - 3} \\ ρ_{3} & ρ_{2} & ρ_{1} & ρ_{0} & \dots & ρ_{n - 3} \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ ρ_{n - 1} & ρ_{n - 2} & ρ_{n - 3} & ρ_{n - 4} & \dots & ρ_{0} \end{array}], \end{matrix}

where $(ρ_{0}, ρ_{1}, \dots, ρ_{n - 1})$ is the auto- or cross-correlation function.

If R is an autocorrelation matrix, then ρ₀=1. Also, because the auto- or cross-correlation may become ∼0 at long lags, then ρ_>k may be set to 0, for a chosen lag k. To ensure that $\tilde{R}$ is positive definite, the algorithm of Rebonato and Jaeckel (1999) is computationally fast.

Ω_V can be similarly constructed:

\begin{matrix} (E4) & Ω_{V} = Diag (σ_{V})^{'} R_{V} Diag (σ_{V}) . \end{matrix}

Appendix F: Maximum likelihood solution of Hannart et al. (2014)

Hannart et al. (2014) defined the log-likelihood of the EIV model as (here written using the notation in this paper, rather than the notation of Hannart et al., 2014)

\begin{matrix} (F1) & \begin{aligned} ℓ ([B^{*}, X^{*}]) & = (Y - b_{0} - X^{*} B^{*})^{'} Ω_{V}^{- 1} (Y - b_{0} - X^{*} B^{*}) \\ + \sum_{i} (X_{i} - X_{i}^{*})^{'} Ω_{A_{i}}^{- 1} (X_{i} - X_{i}^{*}), \end{aligned} \end{matrix}

which is a reduced form of Eq. (C1), because the off-diagonal matrices $Ω_{A_{i} A_{j}}$ in Σ_A are not included. Hannart et al. (2014) wrote an iterative (fixed-point) solution in terms of X^* and B^*.

\begin{array}{l} (F2a) & X_{i}^{*} & = (Ω_{A_{i}}^{- 1} + b_{i}^{2} Ω_{V}^{- 1})^{- 1} (b_{i} Ω_{V}^{- 1} {\bar{y}}_{i} + Ω_{A_{i}}^{- 1} X_{i}) \\ (F2b) & B^{*} & = (X^{*^{'}} Ω_{V}^{- 1} X^{*})^{- 1} X^{*^{'}} Ω_{V}^{- 1} (Y - b_{0}), \end{array}

where here $B^{*} = [b_{1} b_{2}]^{'}$ , and ${\bar{y}}_{i} = Y - b_{0} - X_{- i}^{*} B_{- i}^{*}$ . (I have included the intercept b₀ explicitly.) Confidence intervals for each b_i were calculated from the likelihood profiles (see Sect. 3 in Hannart et al., 2014).

The ML solution I provide here differs from Hannart et al. (2014), but I adopt a similar form (what follows is only for the ML estimate in the first row of Table 3). I express the log-likelihood in Eq. (F1) as a function of V and A:

\begin{matrix} (F3) & ℓ (B) = V^{'} Ω_{V}^{- 1} V + \sum_{i} A_{i} Ω_{A_{i}}^{- 1} A_{i}, \end{matrix}

where A and V are calculated using Eqs. (17)–(18). The difference with Hannart et al. (2014) is that the likelihood here is rewritten with respect to $e = Y - X B$ (through Eqs. 17–18), whereas the procedure of Hannart et al. (2014) is expressed with respect to $Y - X^{*} B^{*}$ , i.e., the main difference being X or X^*. The point here is merely to provide a basic ML solution in Table 3, in order to compare with the other solutions. Equation (F3) should work with many standard ML packages, because there is no iterative dependence on X^*.

The first row in Table 3 was calculated using Eq. (F3), and the Julia package ProfileLikelihoods.jl (VandenHeuvel, 2022).

Appendix G: MEPSM and paleodata assimilation

This appendix explores the potential advantages of MEPSM for paleodata assimilation (PDA). This is a stand-alone appendix, and notation may differ from the main text. The following notation is used for the data assimilation components:

x^b is the climate model data, at m grid points.
y is the proxy data, at p grid points.
θ is the unobserved state vector, with m grid points.
H is the operator which converts model space to proxy space.

Matrix H is p×m. (Note that if a proxy system model has multiple inputs, then H can be formed by concatenation, e.g., H=[H_sst H_sss] in which case m above becomes 2m.).

Data assimilation can be expressed in GLS format:

\begin{array}{l} (G1a) & y & = H θ + r \\ (G1b) & x^{b} & = θ + ε, \end{array}

where $r \sim N (0, Σ_{r r})$ and $ε \sim N (0, Σ_{ε ε})$ . In the typical offline PDA, Σ_εε is an m×m background covariance matrix. In MEPSM, the predictor noise matrix is Σ_A. Out of interest, a time-constant noise matrix could be specified as $Σ_{A} = Σ_{a a} \otimes I_{n}$ . Note the difference between Σ_εε and Σ_aa: Σ_εε is estimated from climate model ensembles, whereas Σ_aa is from ensemble climate data reconstructions (e.g., an ensemble SST product). Also, PDA is about estimating θ, whereas MEPSM is about calibrating H (in MEPSM B contains the non-zero elements of H). Note that H is typically calibrated using observed modern climate data and proxies, over a coeval period (e.g., the 20th century). I will say more about H below.

Next concatenate the two equations above into one:

\begin{matrix} (G2) & z = [\begin{array}{c} y \\ x^{b} \end{array}] = \tilde{H} θ + q, \end{matrix}

where $\tilde{H} = [\begin{array}{c} H \\ I_{m} \end{array}]$ , and let $Σ_{q q} = [\begin{array}{cc} Σ_{r r} & 0 \\ 0 & Σ_{ε ε} \end{array}]$ . The GLS solution for θ is

\begin{matrix} (G3) & \begin{aligned} θ^{a} & = {({\tilde{H}}^{'} Σ_{q q}^{- 1} \tilde{H})}^{- 1} {\tilde{H}}^{'} Σ_{q q}^{- 1} z \\ = {(H^{'} Σ_{r r}^{- 1} H + Σ_{ε ε}^{- 1})}^{- 1} (H^{'} Σ_{r r}^{- 1} y + Σ_{ε ε}^{- 1} x^{b}) . \end{aligned} \end{matrix}

For the analysis error covariance Σ^a, it is

\begin{matrix} (G4) & Σ^{a} = {(H^{'} Σ_{r r}^{- 1} H + Σ_{ε ε}^{- 1})}^{- 1} . \end{matrix}

Equation (G4) can also be expanded as

\begin{matrix} (G5) & \begin{aligned} Σ^{a} & = Σ^{a} (Σ^{a})^{- 1} Σ^{a} = Σ^{a} (H^{'} Σ_{r r}^{- 1} H + Σ_{ε ε}^{- 1}) Σ^{a} \\ = K Σ_{r r} K^{'} + Σ^{a} Σ_{ε ε}^{- 1} Σ^{a}, \end{aligned} \end{matrix}

where $K = Σ^{a} H^{'} Σ_{r r}^{- 1}$ . The question here is what happens if H is calibrated using a biased method (e.g., OLS). Equation (G4) suggests that Σ^a will also be biased. Ensemble filtering methods are used for their computational advantage: replacing a large m×m Σ_εε with a smaller ensemble of model x^b vectors. But the model x^b vectors are filtered in a way that ensures the resulting ensemble of δθ^a (i.e., deviation from ${\bar{θ}}^{a}$ ) agrees with Eq. (G5). So ensemble filtering methods do not help if H is biased. Equation (G1) shows that the operator H is applied to the unobserved state θ, not to the “observed” x^b, so using OLS for calibrating H seems flawed. OLS calibrates a PSM on observed (but not true) data, whereas MEPSM calibrates a PSM on the unobserved state x^*.

Code and data availability

Intuitive Julia code for MEPSM is available on GitHub (https://github.com/Mattriks/MeasurementErrorModels.jl, last access: 2 April 2023) and Zenodo (https://doi.org/10.5281/zenodo.7793741, Fischer, 2023 a). A manual for MEPSM is available at https://mattriks.github.io/MeasurementErrorModels.jl/dev/ (Fischer, 2023 b). The manual contains three examples which reproduce results from this paper.

The ERSSTv5 sea surface temperature data are available at https://psl.noaa.gov/data/gridded/data.noaa.ersst.v5.html (last access: 16 December 2019) (metadata: https://doi.org/10.7289/V5T72FNM, Huang et al., 2017). The ERSSTv5 uncertainty data were obtained from Boyin Huang, NOAA (22 October 2020).

The HadEN4 ocean salinity data (and uncertainty) are available at https://apdrc.soest.hawaii.edu/las/v6/dataset?catitem=16652 (Gouretski and Reseghetti, 2020). These data are from version EN.4.2.1 and are the G10 analyses, i.e., with the corrections of Gouretski and Reseghetti (2010).

The coral δ¹⁸O data from core RI6 were extracted from the CoralHydro2k database (Walter et al., 2023). A Jupyter Notebook (as a PDF) showing the preprocessing of these data is available in the Supplement. The dataset stored for the code examples in the GitHub repository contains all three detrended variables (detrended without removing the intercept).

Some input variables for PSMs, from the iCESM1 last millennium simulation (Brady et al., 2019), are available at the University of Washington (https://atmos.washington.edu/~rtardif/LMR/prior, Tardif, 2024). The variables ocean surface temperature (tos_sfc) and ocean surface salinity (sos_sfc) are used in Sect. 3.3.

A notebook and code module for Sect. 3.3 are available on Zenodo (https://doi.org/10.5281/zenodo.12660485, Fischer, 2024).

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/gmd-17-6745-2024-supplement.

Competing interests

The author has declared that there are no competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

I thank the reviewers for useful suggestions that helped to improve the manuscript.

Review statement

This paper was edited by Marko Scholze and reviewed by two anonymous referees.

References

Amiri-Simkooei, A. and Jazaeri, S.: Weighted total least squares formulated by standard least squares theory, Journal of Geodetic Science, 2, 113–124, https://doi.org/10.2478/v10156-011-0036-5, 2012. a, b

Brady, E., Stevenson, S., Bailey, D., Liu, Z., Noone, D., Nusbaumer, J., Otto-Bliesner, B. L., Tabor, C., Tomas, R., Wong, T., Zhang, J., and Zhu, J.: The Connected Isotopic Water Cycle in the Community Earth System Model Version 1, J. Adv. Model. Earth Sy., 11, 2547–2566, https://doi.org/10.1029/2019MS001663, 2019. a, b

Dee, S.: PRYSM: An open-source framework for PRoxY System Modeling, with applications to oxygen-isotope systems, Adv. Model. Earth Sy., 7, 1220–1247, https://doi.org/10.1002/2015MS000447, 2015. a

Delcroix, T., Alory, G., Cravatte, S., Corrège, T., and McPhaden, M. J.: A gridded sea surface salinity data set for the tropical Pacific with sample applications (1950–2008), Deep-Sea Res. Pt. I, 58, 38–48, https://doi.org/10.1016/j.dsr.2010.11.002, 2011. a

DeLong, K. L., Atwood, A., Moore, A., and Sanchez, S.: Clues from the sea paint a picture of Earth's water cycle, Eos, 103, https://doi.org/10.1029/2022EO220231, 2022. a

Elbern, H., Friese, E., Nieradzik, L., and Schwinger, J.: Data assimilation in atmospheric chemistry and air quality, in: Advanced Data Assimilation for Geosciences: Lecture Notes of the Les Houches School of Physics: Special Issue, June 2012, Oxford University Press, https://doi.org/10.1093/acprof:oso/9780198723844.003.0022, 2014. a

Evans, M. N., Tolwinski-Ward, S. E., Thompson, D. M., and Anchukaitis, K. J.: Applications of proxy system modeling in high resolution paleoclimatology, Quaternary Sci. Rev., 76, 16–28, https://doi.org/10.1016/j.quascirev.2013.05.024, 2013. a

Fang, X., Li, B., Alkhatib, H., Zeng, W., and Yao, Y.: Bayesian Inference for the Errors-In-Variables model, Stud. Geophys. Geod., 61, 35–52, https://doi.org/10.1007/s11200-015-6107-9, 2017. a, b

Fischer, M.: Mattriks/MeasurementErrorModels.jl: MEPSM v0.2.0, Zenodo [code], https://doi.org/10.5281/zenodo.7793741, 2023a. a

Fischer, M.: Measurement Error Proxy System Models: MEPSM v0.2, Model documentation, https://mattriks.github.io/MeasurementErrorModels.jl/dev/ (last access: 2 April 2023), 2023b. a

Fischer, M.: Data assimilation notebook and code module, Zenodo [code], https://doi.org/10.5281/zenodo.12660485, 2024. a, b

Florens, J.-P., Mouchart, M., and Richard, J.-F.: Bayesian inference in Errors-in-Variables Models, J. Multivariate Anal., 4, 419–452, https://doi.org/10.1007/s11200-015-6107-9, 1974. a

Fuller, W. A.: Chapter 3: Extensions of the single relation model, in: Measurement Error Models, John Wiley & Sons, https://doi.org/10.1002/9780470316665.ch3, 1987. a

Fuller, W. A.: Prediction of true values for the measurement error model, in: Statistical Analysis of Measurement Error Models and Applications, edited by: Brown, P. J. and Fuller, W. A., American Mathematical Society, 41–58, https://doi.org/10.1090/conm/112/1087098, 1990. a

Good, S. A., Martin, M. J., and Rayner, N. A.: EN4: Quality controlled ocean temperature and salinity profiles and monthly objective analyses with uncertainty estimates, J. Geophys. Res.-Oceans, 118, 6704–6716, https://doi.org/10.1002/2013JC009067, 2013. a

Gouretski, V. and Reseghetti, F.: On depth and temperature biases in bathythermograph data: Development of a new correction scheme based on analysis of a global ocean database, Deep-Sea Res. Pt. I, 57, 812–833, https://doi.org/10.1016/j.dsr.2010.03.011, 2010. a

Gouretski, V. and Reseghetti, F.: HadEN4 salinity and salinity error standard deviation data, Asia-Pacific Data-Research Center [data set], https://apdrc.soest.hawaii.edu/las/v6/dataset?catitem=16652, last access: 11 November 2020. a

Hannart, A., Ribes, A., and Naveau, P.: Optimal fingerprinting under multiple sources of uncertainty, Geophys. Res. Lett., 41, 1261–1268, https://doi.org/10.1002/2013GL058653, 2014. a, b, c, d, e, f, g, h, i, j, k, l

Henderson, H. V. and Searle, S. R.: Vec and vech operators for matrices, with some uses in jacobians and multivariate statistics, Can. J. Stat., 7, 65–81, https://doi.org/10.2307/3315017, 1979. a

Huang, B., Thorne, P. W., Banzon, V. F., Boyer, T., Chepurin, G., Lawrimore, J. H., Menne, M. J., Smith, T. M., Vose, R. S., and Zhang, H.-M.: NOAA Extended Reconstructed Sea Surface Temperature (ERSST), Version 5, NOAA National Centers for Environmental Information [data set], https://doi.org/10.7289/V5T72FNM, 2017. a

Huang, B., Menne1, M. J., Boyer, T., Freeman, E., Gleason, B. E., Lawrimore, J. H., Liu, C., Rennie III, J. J., C. J. S., Sun, F., Vose1, R., Williams, C. N., Yin, X., and Zhang, H.-M.: Uncertainty Estimates for Sea Surface Temperature and Land Surface Air Temperature in NOAAGlobalTemp Version 5, J. Climate, 33, 1351–1379, https://doi.org/10.1175/JCLI-D-19-0395.1, 2020. a

Kalnay, E., Kanamitsu, M., Kistler, R., Collins, W., Deaven, D., Gandin, L., Iredell, M., Saha, S., White, G., Woollen, J., Zhu, Y., Chelliah, M., Ebisuzaki, W., Higgins, W., Janowiak, J., Mo, K. C., Ropelewski, C., Wang, J., Leetmaa, A., Reynolds, R., Jenne, R., and Joseph, D.: The NCEP/NCAR 40-Year Reanalysis Project, B. Am. Meteorol. Soc., 77, 437–471, https://doi.org/10.1175/1520-0477(1996)077<0437:TNYRP>2.0.CO;2, 1996. a

King, J., Tierney, J., Osman, M., Judd, E. J., and Anchukaitis, K. J.: DASH: a MATLAB toolbox for paleoclimate data assimilation, Geosci. Model Dev., 16, 5653–5683, https://doi.org/10.5194/gmd-16-5653-2023, 2023. a, b, c

LeGrande, A. N. and Schmidt, G. A.: Global gridded data set of the oxygen isotopic composition in seawater, Geophys. Res. Lett., 33, L12604, https://doi.org/10.1029/2006GL026011, 2006. a

Lemoine, N. P.: Moving beyond noninformative priors: why and how to choose weakly informative priors in Bayesian analyses, Oikos, 128, 912–928, https://doi.org/10.1111/oik.05985, 2019. a

Loope, G., Thompson, D. M., Cole, J., and Overpeck, J.: Is there a low-frequency bias in multiproxy reconstructions of tropical paciﬁc SST variability?, Quaternary Sci. Rev., 246, 106530, https://doi.org/10.1016/j.quascirev.2020.106530, 2020a. a

Loope, G., Thompson, D. M., and Overpeck, J.: The spectrum of Asian Monsoon variability: a proxy system model approach to the hydroclimate scaling mismatch, Quaternary Sci. Rev., 240, 106362, https://doi.org/10.1016/j.quascirev.2020.106362, 2020b. a

Ludwig, K. R. and Titterington, D. M.: Calculation of ²³⁰ThU isochrons, ages, and errors, Geochim. Cosmochim. Ac., 58, 5031–5042, https://doi.org/10.1016/0016-7037(94)90229-1, 1994. a

Magnus, J. R. and Neudecker, H.: Symmetry, 0-1 Matrices and Jacobians, Econometric Theory, 2, 157–190, https://doi.org/10.1017/S0266466600011476, 1986. a

Malevich, S. B., Vetter, L., and Tierney, J. E.: Global Core Top Calibration of δ18O in Planktic Foraminifera to Sea Surface Temperature, Paleoceanogr. Paleocl., 34, 1292–1315, https://doi.org/10.1029/2019PA003576, 2019. a

Osborne, M. C., Dunbar, R. B., and Mucciarone, D. A.: Regional calibration of coral-based climate reconstructions from Palau, West Pacific Warm Pool, Palaeogeogr. Palaeocl., 386, 308–320, https://doi.org/10.1016/j.palaeo.2013.06.001, 2013. a, b, c, d, e, f

Rebonato, R. and Jaeckel, P.: The most general methodology to create a valid correlation matrix for risk management and option pricing purposes, J. Risk, 2, 17–27, https://doi.org/10.21314/JOR.2000.023, 1999. a

Sanchez, S. C., Hakim, G. J., and Sanger, C. P.: Climate Model Teleconnection Patterns Govern the Niño-3.4 Response to Early Nineteenth-Century Volcanism in Coral-Based Data Assimilation Reconstructions, J. Climate, 34, 1863–1880, https://doi.org/10.1175/JCLI-D-20-0549.1, 2021. a, b

Sayani, H. R., Cobb, K. M., DeLong, K., Hitt, N. T., and Druffel, E. R. M.: Intercolony δ¹⁸O and Sr/Ca variability among Porites spp. corals at Palmyra Atoll: Toward more robust coral-based estimates of climate, Geochem. Geophys. Geosyst., 20, 5270–5284, https://doi.org/10.1029/2019GC008420, 2019. a

Schneider, T.: Analysis of Incomplete Climate Data: Estimation of Mean Values and Covariance Matrices and Imputation of Missing Values, J. Climate, 14, 853–871, https://doi.org/10.1175/1520-0442(2001)014<0853:AOICDE>2.0.CO;2, 2001. a

Steiger, N. J., Smerdon, J. E., Cook, E. R., and Cook, B. I.: A reconstruction of global hydroclimate and dynamical variables over the Common Era, Sci. Data, 5, 180086, https://doi.org/10.1038/sdata.2018.86, 2018. a, b

Tardif, R.: The post-processed “iCESM1” last millennium simulation, atmos.washington [data set], https://atmos.washington.edu/~rtardif/LMR/prior, last access: 14 June 2024. a

Tardif, R., Hakim, G. J., Perkins, W. A., Horlick, K. A., Erb, M. P., Emile-Geay, J., Anderson, D. M., Steig, E. J., and Noone, D.: Last Millennium Reanalysis with an expanded proxy database and seasonal proxy modeling, Clim. Past, 15, 1251–1273, https://doi.org/10.5194/cp-15-1251-2019, 2019. a, b

Thompson, D. M., Conroy, J. L., Konecky, B. L., Stevenson, S., DeLong, K. L., McKay, N., Reed, E., Jonkers, L., and Carré, M.: Identifying Hydro-Sensitive Coral δ¹⁸O Records for Improved High-Resolution Temperature and Salinity Reconstructions, Geophys. Res. Lett., 49, e2021GL096153, https://doi.org/10.1029/2021GL096153, 2022. a, b, c, d

Tierney, J. E. and Tingley, M. P.: A Bayesian, spatially-varying calibration model for the TEX₈6 proxy, Geochim. Cosmochim. Ac., 127, 83–106, https://doi.org/10.1016/j.gca.2013.11.026, 2014. a

Tierney, J. E. and Tingley, M. P.: BAYSPLINE: A New Calibration for the Alkenone Paleothermometer, Paleoceanogr. Paleocl., 33, 281–301, https://doi.org/10.1002/2017PA003201, 2018. a

Tierney, J. E., Malevich, S. B., Gray, W., Vetter, L., and Thirumalai, K.: Bayesian Calibration of the Mg/Ca Paleothermometer in Planktic Foraminifera, Paleoceanogr. Paleocl., 34, 2005–2030, https://doi.org/10.1029/2019PA003744, 2019. a

VandenHeuvel, D.: DanielVandH/ProfileLikelihood.jl: v0.2.1, Zenodo [code], https://doi.org/10.5281/zenodo.7493449, 2022. a

Walter, R. M., Sayani, H. R., Felis, T., Cobb, K. M., Abram, N. J., Arzey, A. K., Atwood, A. R., Brenner, L. D., Dassié, É. P., DeLong, K. L., Ellis, B., Emile-Geay, J., Fischer, M. J., Goodkin, N. F., Hargreaves, J. A., Kilbourne, K. H., Krawczyk, H., McKay, N. P., Moore, A. L., Murty, S. A., Ong, M. R., Ramos, R. D., Reed, E. V., Samanta, D., Sanchez, S. C., Zinke, J., and the PAGES CoralHydro2k Project Members: The CoralHydro2k database: a global, actively curated compilation of coral δ18O and $Sr / Ca$ proxy records of tropical ocean hydrology and temperature for the Common Era, Earth Syst. Sci. Data, 15, 2081–2116, https://doi.org/10.5194/essd-15-2081-2023, 2023. a, b

Watanbe, T. K. and Pfeiffer, M.: A Simple Monte Carlo Approach to Estimate the Uncertainties of SST and δ¹⁸Osw Inferred From Coral Proxies, Geochem. Geophys. Geosyst., 23, e2021GC009813, https://doi.org/10.1029/2021GC009813, 2022. a, b

York, D.: Least-squares fitting of a straight line, Can. J. Phys., 44, 1079–1086, https://doi.org/10.1139/p66-090, 1966. a

York, D.: Least-squares fitting of a straight line with correlated errors, Earth Planet. Sci. Lett., 5, 320–324, https://doi.org/10.1016/S0012-821X(68)80059-7, 1968. a