The synthesis of model and observational information using data assimilation
can improve our understanding of the terrestrial carbon cycle, a key
component of the Earth's climate–carbon system. Here we provide a data
assimilation framework for combining observations of solar-induced
chlorophyll fluorescence (SIF) and a process-based model to improve estimates
of terrestrial carbon uptake or gross primary production (GPP). We then
quantify and assess the constraint SIF provides on the uncertainty in global
GPP through model process parameters in an error propagation study. By
incorporating 1 year of SIF observations from the GOSAT satellite, we find
that the parametric uncertainty in global annual GPP is reduced by 73 %
from

The productivity of the terrestrial biosphere forms a key component of
Earth's climate–carbon system. Estimates show that the terrestrial biosphere
has removed about one quarter of all anthropogenic

A key challenge is disaggregating the observable net

At the leaf scale chlorophyll fluorescence is emitted from photosystems I and
II during the light reactions of photosynthesis. These photosystems are
pigment–protein complexes that form the reaction centres for converting light
energy into chemical energy. It is in photosystem II (PSII) where
photochemistry, the process initiating photosynthetic electron transport and
leading to

At the canopy scale and beyond, the link appears simpler, exhibiting
ecosystem-dependent linear relationships

Data assimilation enables the use of observations and model information
together to produce a best estimate of the state and function of the system.
In the case of mechanistic models this is done by constraining the simulated
processes and their parameters. Such an approach has been applied to
terrestrial biosphere models to optimize model parameters and constrain the
uncertainty in carbon flux estimates in a number of studies

In this paper, we assess the ability of satellite SIF observations to
constrain the parametric uncertainty in simulated GPP in a terrestrial
biosphere model within a data assimilation system. This is termed an error
propagation study and is similar in concept to an observing system simulation
experiment or quantitative network design study

Under the linear Gaussian assumption, the uncertainty in a target quantity
(here, GPP) following the assimilation of the measured data (here, SIF) is
conditional only on the prior uncertainty, the uncertainty in the measured
data and the sensitivity of simulated quantities (SIF and GPP) to changes in
the parameters

We formulate this error propagation study in two stages: (i) optimization of parameter uncertainties and (ii) projection of the parameter uncertainties onto uncertainty in diagnostic GPP. Here, we outline the model used to simulate the observation (SIF) and the target quantity (GPP). We also outline the model parameter set describing these processes, the uncertainty in the observations and model forcing, and the general experimental set-up.
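The two stages can be illustrated numerically under the linear Gaussian assumption. The following is a minimal sketch, not the BETHY-SCOPE implementation. The sensitivity matrices `H` (simulated SIF with respect to parameters) and `J` (GPP with respect to parameters), the function names, and all numerical values are hypothetical. The posterior covariance expression is the standard linear Gaussian form, and the effective constraint is assumed to be one minus the ratio of posterior to prior standard deviation.

```python
import numpy as np


def posterior_parameter_covariance(H, C_d, C_prior):
    """Stage (i): C_post = (H^T C_d^-1 H + C_prior^-1)^-1 (assumed standard form)."""
    information = H.T @ np.linalg.solve(C_d, H) + np.linalg.inv(C_prior)
    return np.linalg.inv(information)


def project_onto_target(J, C_x):
    """Stage (ii): propagate parameter covariance onto the target, C_y = J C_x J^T."""
    return J @ C_x @ J.T


def effective_constraint(sigma_prior, sigma_post):
    """Relative uncertainty reduction, assumed here to be 1 - sigma_post / sigma_prior."""
    return 1.0 - sigma_post / sigma_prior


# Hypothetical toy problem: 20 SIF observations, 4 parameters, 1 target (GPP).
rng = np.random.default_rng(0)
n_obs, n_par = 20, 4
H = rng.normal(size=(n_obs, n_par))      # d(SIF)/d(parameters), illustrative only
J = rng.normal(size=(1, n_par))          # d(GPP)/d(parameters), illustrative only
C_d = np.diag(np.full(n_obs, 0.5 ** 2))  # uncorrelated data errors
C_prior = np.eye(n_par)                  # uncorrelated prior parameter errors

C_post = posterior_parameter_covariance(H, C_d, C_prior)
sigma_gpp_prior = float(np.sqrt(project_onto_target(J, C_prior)[0, 0]))
sigma_gpp_post = float(np.sqrt(project_onto_target(J, C_post)[0, 0]))
print(effective_constraint(sigma_gpp_prior, sigma_gpp_post))
```

Under the assumed definition, `effective_constraint(19.0, 5.2)` evaluates to about 0.73, which matches the reduction in global annual GPP uncertainty reported in this paper.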

In order to incorporate an observation into a data assimilation system, we require
a model or “observation operator” that can simulate SIF, ideally providing a
process-based relationship between SIF and GPP. There are a few ways one
might formulate the observation operator. Evidence shows a strong linear
relationship between SIF and GPP at large spatial scales and relatively long
temporal scales

In this section we describe the newly developed terrestrial biosphere model
for simulating and assimilating SIF. The model is an integration of the
existing models BETHY (Biosphere Energy Transfer HYdrology)

BETHY is a process-based terrestrial biosphere model at the core of the
Carbon Cycle Data Assimilation System (CCDAS)

PFTs defined in BETHY and their abbreviations.

SCOPE is a vertical (1-D) integrated radiative transfer and energy balance
model with modules for photosynthesis and chlorophyll fluorescence

The canopy radiative transfer and photosynthesis schemes of BETHY have been
replaced by the corresponding schemes in SCOPE, including the components
required for the calculation of chlorophyll fluorescence at leaf and canopy
scales. The spatial resolution, vegetation (PFT) characteristics, leaf
growth, and carbon balance are handled by BETHY. SCOPE therefore takes in
climate forcing (meteorological and radiation data) and LAI from BETHY and
returns GPP. BETHY calculates the canopy water balance, leaf growth, and net
carbon fluxes, which will prove useful in future when assimilating other data
streams (e.g. atmospheric

In this error propagation study, information from the SIF observations is
used to constrain the uncertainty in the model process parameters. Parameters
can either be global or differentiated by PFT. Global parameters apply to
plants or soils everywhere, while PFT-dependent parameters enable
differentiation between physiological and leaf growth traits. Some key
parameters for this study such as the maximum carboxylation capacity
(

We expose 53 parameters from BETHY-SCOPE to the error propagation system (see
Table

There are 12 SCOPE parameters exposed, two of which are PFT-dependent
(

To calculate the uncertainty in parameter values following the constraint
provided by the observational information of SIF (i.e. the posterior
uncertainty), we propagate uncertainty from the observations onto the
parameters. To do this, we use a probabilistic framework
where the state of information on parameters and observations is expressed by
their corresponding PDFs

For linear and weakly non-linear problems we can assume that Gaussian
probability densities propagate forward through to Gaussian distributed
simulated quantities

To calculate the posterior parameter covariance matrix
(

Formally,

The observational constraint introduces correlations into the posterior
parameter distributions; thus, posterior parameter uncertainties are not
wholly independent. Strong correlations in

Using the parameter covariance matrix we can assess how parameter
uncertainties propagate forward through the model onto uncertainty in GPP
using the Jacobian rule of probabilities, the same method outlined in

The uncertainty in the measured data (hereafter, data) is a critical
component in assessing the potential impact of an observing system on the
estimation of carbon fluxes. Data uncertainties in SIF used here are
calculated from the GOSAT satellite observations for 2010. These data are
obtained from the ACOS (Atmospheric

We assume that the observations are independent and have uncorrelated errors, that
is, they are distributed randomly. Assuming uncorrelated errors is, however,
likely to overestimate the information content, particularly if using the
standard error as the uncertainty. Although it has been used in recent
studies with satellite SIF

Through the aggregation of GOSAT grid cells to the model grid resolution, the
number of independent measurements is reduced. To account for this and
preserve the information content of the original GOSAT observations, the
uncertainty in a given model grid cell is, approximately, divided by the
square root of the number of GOSAT grid cells with SIF data that fall within
that model grid cell (

Therefore, the calculation of the SIF data uncertainties used here is
approximated by Eq. (

Uncertainty in SIF observations may also have a systematic component. A
known, potential systematic error in SIF stems from the zero-level offset
calculated during the retrieval. Any error in the calculated zero-level
offset will add to the measurement error. This radiometric correction is done
to prevent biases in the SIF retrieval

An additional source of uncertainty in model estimates of GPP is climate
forcing. As mentioned by

In this study BETHY-SCOPE is run for the year 2010 on the computationally
efficient, low-resolution spatial grid (7.5

SIF is simulated at 755 nm, the wavelength of the GOSAT retrieval and close to that of the OCO-2 retrieval (757 nm). We focus on the constraint from SIF measurements at 13:00 local time, as this closely corresponds to the local overpass time of the SIF-observing satellites GOSAT and OCO-2. However, we also investigate how using alternative SIF-observing times (e.g. the GOME-2 satellite overpass time), and multiple observing times simultaneously, affects the constraint on GPP.

First, we present the results from Eq. (

As described in the “Methods” section, a key metric for assessing the relative
uncertainty reduction, or effective constraint, is defined as

Parameters describing leaf composition (

Effective constraint of BETHY-SCOPE model process parameters from SIF observations.
Only the parameter numbers are given; for the corresponding descriptions, see
Table

Correlation coefficients (

Varied effective constraint is seen for the leaf growth parameters
(parameters 18–34 in Table

Leaf physiological parameters (parameters 1–17 in
Table

Global canopy structure parameters (parameters 50–53 in
Table

With the observational constraint, correlations are introduced into the
posterior parameter distributions. We assess these correlations using
Eq. (

To assess the effect of incorporating a systematic error from the
observations into this analysis, we apply a seasonal

To assess the constraint imposed by SIF on simulated GPP, we compare the prior
and posterior uncertainty in GPP as calculated using Eq. (

Global GPP from the prior model is approximately 164

Annual observational uncertainty in SIF interpolated from GOSAT observations for 2010.

Prior parametric uncertainty in annual GPP.

To assess which parameters contribute to the uncertainty in GPP for the prior
and posterior, we can conduct a linear analysis of the uncertainty
contributions. Typically this technique can only be used for the prior as the
correlations in posterior parameter uncertainties, excluded from the linear
analysis, also contribute toward the overall constraint. However, we can
assess the contribution of these correlations to the constraint of GPP by
setting the off-diagonal elements in

Posterior parametric uncertainty in annual GPP.

Relative uncertainty reduction (i.e. effective constraint) of parametric uncertainty in annual GPP from prior to posterior.

Using a linear analysis of the uncertainty, we find that uncertainty in global
annual GPP in the prior and posterior stems from different processes. For the
prior we find that the uncertainty in GPP is dominated, at 89 %, by
parameters describing leaf growth processes. Of these, a single parameter,

For the posterior, which has a lower overall uncertainty in GPP, uncertainty
is dominated by parameters representing physiological processes.
Physiological parameters account for 67 % of the uncertainty in posterior
annual GPP, with

Regionally, we split the land into three regions, the Boreal region (above
45

During the start of the growing season leaf physiology, in particular
photosynthetic rate constants (

For the Tropics, the uncertainty reduction in GPP is about 80 % across the
year. Uncertainty in the prior is dominated by the leaf growth parameters and
in particular the

Contribution of parameter classes to parametric uncertainty in
monthly GPP for three regions (see Table

With this set-up we can test how the SIF constraint on GPP changes when
assimilating observations of SIF from alternative times of the day. We assume
the same number of observations and the same observational uncertainty as
used above. We find that
different observing times yield differences in the posterior uncertainty and
the effective constraint of GPP (see
Fig.

We also test the effect of using SIF measurements at multiple times of the day simultaneously. We select the times 08:00, 12:00, and 16:00, emulating a hypothetical geostationary satellite. For this experiment we first test the effect of increasing the number of observations by a factor of 3, assuming the same uncertainty for the three observation times. Second, we also increase the number of observations by a factor of 3, but scale the variance of these observations by one third. Using this second test we can assess whether differences in the parameter sensitivities of SIF and GPP at the different times of the day add value to the overall constraint.
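The multi-time experiment amounts to stacking the sensitivities for each observing time into a single system. The sketch below is illustrative only: the Jacobians are random stand-ins rather than BETHY-SCOPE output, and the names `stack_observing_times` and `variance_scale` are hypothetical. The factor applied to the data variance is left as an argument rather than fixed to the value used in this study.

```python
import numpy as np


def posterior_covariance(H, data_var, C_prior):
    """Linear Gaussian posterior covariance for uncorrelated data errors."""
    information = H.T @ (H / data_var[:, None]) + np.linalg.inv(C_prior)
    return np.linalg.inv(information)


def stack_observing_times(jacobians, data_vars, variance_scale=1.0):
    """Combine several observing times into one Jacobian and one variance vector."""
    H = np.vstack(jacobians)
    var = variance_scale * np.concatenate(data_vars)
    return H, var


# Hypothetical sensitivities of SIF to 3 parameters at three times of day.
rng = np.random.default_rng(1)
n_obs, n_par = 10, 3
C_prior = np.eye(n_par)
var_single = np.full(n_obs, 0.25)
H_noon = rng.normal(size=(n_obs, n_par))
H_morning = 0.6 * H_noon + 0.2 * rng.normal(size=(n_obs, n_par))
H_afternoon = 0.7 * H_noon + 0.2 * rng.normal(size=(n_obs, n_par))

# Single observing time.
C_single = posterior_covariance(H_noon, var_single, C_prior)

# Three observing times, same uncertainty at each time.
H3, var3 = stack_observing_times([H_morning, H_noon, H_afternoon], [var_single] * 3)
C_three = posterior_covariance(H3, var3, C_prior)

# Algebraic check: identical sensitivities at all times, with each variance
# multiplied by 3, carry exactly the information of a single observing time.
H_same, var_same = stack_observing_times([H_noon] * 3, [var_single] * 3, variance_scale=3.0)
C_same = posterior_covariance(H_same, var_same, C_prior)

print(np.trace(C_single), np.trace(C_three), np.trace(C_same))
```

The final check is a property of the linear Gaussian algebra rather than a result from this study. It shows why a rescaled-variance configuration can isolate the value added by differing sensitivities: when the total weight of the data is held fixed, any change in the posterior must come from differences between the Jacobians.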

Effective constraint on global annual GPP for different observing
times and the two diurnal cycle configurations. Values at the top of the bars
correspond to the posterior uncertainty (

Using a diurnal cycle of observations results in a posterior uncertainty of
4.6

In order to assess the effects of incorporating uncertainty in SWRad, we
conduct three experiments. First is a control run, equivalent to using SIF at
13:00 as before. The second includes uncertainty in SWRad by adding it into
the posterior uncertainty calculation; this might be done normally when
accounting for uncertainty in forcing. The third experiment incorporates
uncertainty in SWRad into the
error propagation system with SIF, such that the uncertainty in SWRad may be constrained.
This third experiment effectively treats SWRad as a model parameter by adding
an extra row and column to

Including the uncertainty in SWRad in the calculation of posterior
uncertainty in GPP results in an additional 0.03

Parametric uncertainty and effective constraint of global annual GPP
for each of the SWRad experiments. Prior and posterior values shown are the
1 standard deviation (

By assessing the prior and posterior uncertainty in SWRad in

The results presented show that with 1 year of satellite SIF data observed at
the GOSAT and OCO-2 satellite overpass time and SIF retrieval wavelength, we
can constrain a large portion of the BETHY-SCOPE parameter space and
ultimately yield a parametric uncertainty in global annual GPP of

We note that this analysis is likely to underestimate the constraint that SIF
could provide on GPP as it is performed with uncertainties calculated from
the GOSAT SIF 3

This error propagation analysis does not assess how model SIF compares with
observed SIF. However, our finding that the

We also find that incorporating the error in the zero-level offset correction
in the SIF observations has a negligible effect on posterior parametric
uncertainties. This may be because, for a given season, this
systematic uncertainty applies across all data points; thus, it scales all of
the SIF values and therefore the sensitivities as well. In any case, the
systematic error in the zero-level offset-corrected data assessed here
(Fig.

The constraint on global GPP is similar when assimilating SIF at any time between 09:00 and 15:00. Assimilating observations at the daily maximum of SIF and GPP provides the strongest constraint as both quantities exhibit the strongest parameter sensitivities at these times. Depending upon the state of the vegetation and the environmental stress conditions, maximum SIF and GPP may occur anywhere between mid-morning and early afternoon. Therefore, we expect that the effective use of different satellite-retrieved SIF observations for assimilation studies will depend not so much on their observing time but more on the spatiotemporal resolution, measurement precision, and subsequent uncertainty.

A confounding factor in this expectation is the uncertain role of
physiological stress on the diurnal cycle of SIF and GPP and on modelling
capabilities of these processes. Multiple studies have shown that various
forms of environmental stress result in the downregulation of PSII and changes in
the fluorescence yield, particularly evident across the diurnal cycle

The constraint of SIF on GPP occurs via multiple processes including leaf
growth, leaf composition, physiology, and canopy structure. For the prior,
uncertainty in global GPP is dominated by leaf growth processes. There is a
clear and direct link between leaf growth processes and GPP

Of particular importance is the parameter describing water limitation on leaf
growth (

At the global scale,

The constraint SIF provides on leaf growth processes is also perhaps
achievable from other remote-sensing products such as FAPAR

The strong constraint SIF provides on leaf growth processes indicates that it
is likely to provide improved monitoring of key phenological processes such
as the timing of leaf onset, leaf senescence, and growing season length as
also suggested by

Beyond observing LAI dynamics, SIF can also provide critical insights into
physiological processes

Chlorophyll content here constitutes a classic nuisance variable. A nuisance
variable is one that is not perfectly known and impacts the observations we
wish to use but not the target variable

Almost all terrestrial carbon cycle models use down-welling radiation at the
Earth's surface as an input variable. Any uncertainty in this forcing will
translate into uncertainty in carbon fluxes including GPP, yet few studies
consider such uncertainties. A known systematic error (i.e. bias) in forcing
variables

The results presented here demonstrate how SIF observations may be utilized
to optimize a process-based terrestrial biosphere model and constrain
uncertainty in simulated GPP. These results are, however, model dependent.
The assumption is that the model simulates the most important processes
driving SIF and GPP. Some key, remaining unknowns include how processes such
as environmental stress, three-dimensional canopy structure effects, or
nitrogen cycling may affect the SIF signal. As understanding of the role
these processes play improves, modelling capabilities will improve too.
Additionally, a different set of prior parameter values
will alter the results due to changes in

We assessed the ability of satellite SIF observations to constrain
uncertainty in model parameters and uncertainty in spatiotemporal patterns of
simulated GPP using a process-based terrestrial biosphere model. The results
show that there is a strong constraint of parametric uncertainties across a
wide range of processes including leaf growth dynamics and leaf physiology
when assimilating just 1 year of SIF observations. Combined, the SIF
constraint on parametric uncertainties propagates through to a strong
reduction in uncertainty in GPP. The prior uncertainty in global annual GPP
is reduced by 73 % from 19.0 to 5.2

The BETHY-SCOPE model code is available upon request
from the authors. The GOSAT satellite SIF data used in this paper are from
the ACOS project (version B3.5)

BETHY-SCOPE process parameters along with
their prior and optimized uncertainties following SIF constraint, represented
as 1 standard deviation. Relative uncertainty reduction (i.e. effective
constraint) is reported for the error propagation with low-resolution and
high-resolution SIF observations. Units are as follows:

To obtain the variance of a target grid cell at the model grid resolution
(

An example of the GOSAT SIF data and uncertainty calculations over a
low-resolution model grid cell centred over the Amazon forest at
3.75

grid cell sizes considering that SIF is in physical units per unit
area. We then sum the area-weighted variances and scale this uncertainty by
the square root of 2 (see Eq.

Analysis of systematic errors in the GOSAT SIF observations. We
assess the zero-level offset-corrected GOSAT SIF soundings over two
ice-covered and therefore non-fluorescent regions. The first is Antarctica in
January, between latitudes 70 and 80

The authors declare that they have no conflict of interest.

Alexander Norton was partly supported by an Australian Postgraduate Award provided by the Australian Government and a CSIRO OCE Scholarship. The research was funded, in part, by the ARC Centre of Excellence for Climate System Science (grant CE110001028).

Edited by: Tomomichi Kato
Reviewed by: Sylvain Kuppel and one anonymous referee