Estimating methane (CH

The detailed process model that we analyze contains descriptions for CH

The uncertainties related to the parameters and the modeled processes are
described quantitatively. At the process level, the flux measurement data are
able to constrain the CH

The hierarchical modeling allows us to assess the effects of some of the parameters on an annual basis. The results of the calibration and the cross validation suggest that the early spring net primary production could be used to predict parameters affecting the annual methane production.

Even though the calibration is specific to the Siikaneva site, the hierarchical modeling approach is well suited for larger-scale studies, and the results of the estimation pave the way for a regional or global-scale Bayesian calibration of wetland emission models.

Methane is the third most important gas in the atmosphere in terms of its
capacity to warm the climate, after water vapor and carbon dioxide, currently
with the radiative forcing of 0.97 W m

The sources of CH

The methane from wetlands is produced by prokaryotic archaea under anaerobic
conditions. The main sink for atmospheric CH

The wetlands in the boreal zone are a significant contributor to the total
CH

The need for improved wetland methane emission modeling is amplified by the
fact that although annual mean precipitation is projected to increase in the
boreal zone

Changes to hydrological conditions such as draining or recurring low water
table depth can alter the balance of greenhouse gas emissions

The HelsinkI Model of MEthane buiLd-up and emIssion for peatlands (HIMMELI)
is a relatively full-featured wetland/peatland CH

Even well-constructed computer models describing environmental processes
accumulate error at many levels

Several current CH

Methane models typically use measured values from field campaigns and
parameters estimated from those studies where applicable

Due to these uncertainties, parameter values vary widely from study to
study. For instance, for the

Calibration done for the models is usually quite basic.

Using hierarchical modeling to estimate annually varying parameters is
sensible, since the flux measurement site has both properties that change
from year to year (e.g., small changes in vegetation, plant roots, and microbe
populations) and properties that are more permanent (e.g., peat quality and
plant species). With fixed parameter values for all years, the model
sometimes does not accurately and appropriately describe the observations. On
the other hand, with different parameters for all the years, the parameters
are easily overfitted, meaning that while the resulting model fits the data
well, it does not accurately predict future fluxes

In the present study, the sqHIMMELI model is calibrated using adaptive Markov
chain Monte Carlo (MCMC) and importance resampling techniques to evaluate a
hierarchical statistical model for the model parameters. The calibration is
done for the boreal Siikaneva site. This study complements the work in

Merely optimizing model parameters may lead to misleading results due to the
presence of several local minima in the objective function; for example,

The main objective of this work is to analyze the capabilities and
limitations of a modern feature-filled wetland CH

Methane and carbon dioxide flux measurements were needed for estimating the
model parameters, and for that purpose observational data from the Siikaneva
peatland flux measurement site in southern Finland (61

Description of the data used.

Measurement of ecosystem-scale gas fluxes started in 2005, and in this work
eddy covariance (EC) CH

For this study, daily means of CH

For using these carbon dioxide data with the cost function, the CO

The required inputs for sqHIMMELI are daily soil temperatures, water table
depths (WTDs), NPP, and leaf area indices (LAIs). The
soil temperature profile for the grid used was generated by interpolating
from measurement data between the measurement depths (

The HIMMELI model

The model simulates the processes in a discretized peat column. The number
and thickness of the peat layers can be varied, but in this work six 10 cm
layers are used, similarly to, e.g.,

At present, the model does not contain descriptions for processes related to
snow pack or ice such as diffusion through snow, or release of accumulated
gas bubbles under ice in springtime as described by, e.g.,

HIMMELI itself, as presented in

For each modeled process in sqHIMMELI, there are parameters regulating the
process, affecting the concentrations of CH

Methanogens prefer recently assimilated fresh carbon as their energy source,
for instance, the root exudates of vascular plants

The modified sqHIMMELI model contains an exudate pool description, from which
it produces methane (Eqs.

The second source of anaerobic respiration, the anaerobic peat decomposition,
is modeled in sqHIMMELI with a simple

The sqHIMMELI model differs from HIMMELI in the details regarding the root
distribution model. Compared to measurement data of root distributions of
aerenchymatous sedges from

The different root distribution descriptions. The original
description is shown as the decaying exponential, and the graph with discrete
steps shows measurement data from

With the Gaussian shape, the new root density decreases faster with depth.
Without this change, the optimization process calibrates the model to have
very high root masses below 50 cm underground. The other difference between
the models is that in the original model there are vanishingly few roots
below the depth of 1 m, but according to

Before starting the optimization, the parameters

Parameters that were not calibrated. Based on an initial sensitivity
analysis, the Michaelis–Menten parameters

Methane is produced from anaerobic peat decomposition at all peat depths in
the sqHIMMELI model, and its transport and oxidation affect the modeled
CH

The parameters for the optimization were chosen to constrain the processes
most important for the CH

where

Here,

Here,

where

where

where the terms are analogous to the ones in Eq. (

The version of HIMMELI presented here describes processes for CH

The gas concentrations of CH

The equations presented in this section are specific to the version of
HIMMELI used in this study. The version in

The carbon for methane production in this model version comes from two
sources: root exudates and anaerobic peat decomposition. The methane
production from anaerobic respiration of that carbon is given by the terms

Peat respiration (aerobic respiration) is described with an equation of the
Michaelis–Menten form:

The transport term

Residual histograms and autocorrelation functions of the error terms

Parameter limits and prior distribution parameters. The priors are
truncated Gaussian, with mean values

The model calibration consists of several steps but can be summarized as
first estimating the posterior with MCMC and then, based on those results,
recalibrating the objective function and using this new formulation for
importance resampling. Importance resampling is typically used for obtaining
posterior distributions from minor changes to the objective function
descriptions
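The reweighting idea behind importance resampling can be illustrated with a minimal Python sketch. It assumes that draws from an "old" posterior are already available and that both the old and the modified target can be evaluated up to a constant. The function name `importance_resample`, the two toy Gaussian log densities, and the sample sizes are hypothetical choices for this example and are not taken from the calibration code of this study.

```python
import math
import random


def importance_resample(samples, log_old, log_new, n_draws, rng):
    """Resample draws from exp(log_old) so that they approximate exp(log_new)."""
    # Log importance weights: ratio of the new to the old unnormalized density.
    log_w = [log_new(s) - log_old(s) for s in samples]
    # Subtract the maximum before exponentiating for numerical stability.
    max_log_w = max(log_w)
    weights = [math.exp(lw - max_log_w) for lw in log_w]
    # Draw with replacement, with probability proportional to the weights.
    return rng.choices(samples, weights=weights, k=n_draws)


def log_old(x):
    """Toy 'old' target: unnormalized log density of N(0, 1) (assumption)."""
    return -0.5 * x * x


def log_new(x):
    """Toy 'new' target: unnormalized log density of N(0.5, 1) (assumption)."""
    return -0.5 * (x - 0.5) ** 2


rng = random.Random(0)
old_samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]
new_samples = importance_resample(old_samples, log_old, log_new, 20000, rng)
new_mean = sum(new_samples) / len(new_samples)
```

In this toy setting the resampled mean `new_mean` should lie close to 0.5, the mean of the modified target. The approach works best when the two targets overlap strongly, which is consistent with its use for minor changes to the objective function.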

In more detail, a posterior estimate was first drawn by running
500 000 iterations of sqHIMMELI simulations with the adaptive Metropolis
Markov chain Monte Carlo algorithm with a Laplace-distributed error
description and a first-order autoregressive model, AR(1), for the residuals.
Second, for defining the more refined cost function for importance
resampling, the optimal order for an autoregressive moving average (ARMA) time series
model for the model residuals was identified from the maximum a posteriori
estimate by minimizing the Akaike and Bayesian information criteria with
respect to the model order. The third step was drawing a random sample of
size 50 from the posterior estimate obtained with MCMC, with which the
error model parameters

The need for the importance resampling arises from the fact that the
error-model-transformed methane and carbon dioxide residuals emerging from the maximum a
posteriori and posterior mean estimates from the calibration with the AR(1)
model are not fully independent and identically distributed. The
recalibration of the error model, and resampling from the simulated posterior
using importance resampling, remedies this problem, as can be seen in the
residual histogram and autocorrelation functions in Fig.
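The role of the autoregressive error model can be illustrated with a small Python sketch that removes first-order autocorrelation from a residual series and checks the result. The sketch is only an illustration under simplifying assumptions: the residuals are synthetic, the innovations are Gaussian rather than Laplace-distributed, and only the AR(1) case is shown. The helper names `autocorr` and `ar1_whiten` are hypothetical.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return cov / var


def ar1_whiten(residuals, phi):
    """Remove AR(1) structure: e_t = r_t - phi * r_(t-1)."""
    return [residuals[t] - phi * residuals[t - 1]
            for t in range(1, len(residuals))]


# Synthetic residual series with AR(1) coefficient 0.8 (assumed value).
rng = random.Random(1)
true_phi = 0.8
r = [0.0]
for _ in range(5000):
    r.append(true_phi * r[-1] + rng.gauss(0.0, 1.0))

# For an AR(1) process the lag-1 autocorrelation estimates the coefficient.
phi_hat = autocorr(r, 1)
e = ar1_whiten(r, phi_hat)

raw_ac = autocorr(r, 1)    # strongly positive before whitening
white_ac = autocorr(e, 1)  # near zero after whitening
```

If the transformed residuals still show clear autocorrelation at higher lags, a first-order model is not sufficient, which is the kind of situation that motivates a higher-order error model.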

In order to be able to assess the annual parameter and CH

The “hyperparameters” are the means and variances defining the Gaussian
priors of the hierarchical parameters

Technically, a “Metropolis-within-Gibbs” method

MCMC chains showing a thinned sample of the half million values in
the chain. The first 70 % was discarded from the analyses as warm-up and
is grayed out in the figures. The hierarchical parameters in panels

Our empirical data for the hierarchical model were the 9 years from 2006
to 2014, meaning that for each of these years there were corresponding

As in many practical uncertainty quantification applications, a major part of
the parameter estimation problem is the proper definition of the objective
function. For MCMC, it is defined here based on a priori information about the
measurement uncertainties, based on information from the model residuals, and
based on additional prior information. For the importance resampling, we
modify the error model for the CO

The form of the objective function is the same for both MCMC and importance
resampling. The first two components of the objective function contain the
contributions from the modeled differences to the daily CH

When determining the parameters

Since the primary interest is in the methane fluxes, the carbon dioxide
residuals are scaled down to a fifth in the importance resampling
cost function, which is enough to guide the parameter values since several
years of CO

Posterior distributions of the parameters from the importance
sampling. The two-dimensional marginal distributions of the posterior
distribution are shown in the triangle on the lower left (labels on the left
and at the bottom), and the correlations between parameters are shown in the
upper triangle on the right (labels on the left and at the top). The images
in the lower left triangle show the 90 % (black), 50 % (red), and
10 % (blue) contours, and points from a random sample of the posterior
(black dots). On the upper right, each plot shows correlation coefficients
between parameters, color coded to show negative correlations in blue and
positive in red. The units are listed in
Table

Posterior distributions and correlations of the annual means of the output from the modeled processes for the year 2012. The dynamics for the other years are mostly similar, but the strengths of the correlations vary somewhat. The results shown are based on 1000 random samples from the parameter posterior distribution. The two-dimensional marginal distributions in the triangle on the lower left have their labels on the left and at the bottom, and the correlations between the processes in the upper triangle on the right have their labels on the left and at the top. The images in the lower left triangle show the 90 % (black), 50 % (red), and 10 % (blue) contours. The all-ebullition and diffusion fluxes correlate almost fully, showing that the “diffusion” flux has a strong contribution from underground ebullition.

The parameters affecting the CH

The parameter priors are set to zero outside prescribed bounds. Within these
bounds, the parameters are assigned Gaussian priors, with the exception of
one parameter whose prior is set to be flat. The prior values are based on
both literature and expert knowledge, and the information regarding the
parameter values is summarized in Table
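A bounded prior of this kind can be written as an unnormalized log density, as in the following minimal Python sketch. The function name `log_prior` and the numerical bounds, mean, and standard deviation in the usage lines are hypothetical and do not correspond to any parameter of this study.

```python
import math


def log_prior(value, lower, upper, mean=None, std=None):
    """Unnormalized log prior that is zero-density outside [lower, upper].

    Inside the bounds the prior is Gaussian when mean and std are given,
    and flat otherwise.
    """
    if value < lower or value > upper:
        return -math.inf  # prior density is zero outside the bounds
    if mean is None or std is None:
        return 0.0  # flat prior within the bounds
    return -0.5 * ((value - mean) / std) ** 2


# Hypothetical parameter bounded to [0, 1] with prior mean 0.5 and std 0.2.
inside = log_prior(0.7, 0.0, 1.0, mean=0.5, std=0.2)
outside = log_prior(2.0, 0.0, 1.0, mean=0.5, std=0.2)
flat = log_prior(0.7, 0.0, 1.0)
```

Because MCMC acceptance ratios depend only on differences of log densities, the normalizing constant of the truncated Gaussian can be left out in a sketch like this.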

The “objective function” for the parameter optimization,

Parameter values obtained in the optimization of the sqHIMMELI model
with importance resampling. The maximum a posteriori, posterior mean,
non-hierarchical mean (mean values used for hierarchically varying
parameters), and values from

The Markov chain Monte Carlo simulations yielded a chain of 500 000 samples.
Of these, the first 70 % of the chain was discarded as warm-up
(Fig.

Three different parameter estimates obtained from the posterior distribution
were used to look at its features and fluxes: the maximum a posteriori (MAP)
estimate, posterior mean estimate, and a non-hierarchical posterior mean
estimate, where the mean values of the parameters

The parameter values used in the analyses are shown in
Table

Contrary to this, the air diffusion rate coefficient,

The root distribution parameter,

Posterior marginal and prior distributions from MCMC and importance
resampling for all parameters: panels

The values of the exudate pool turnover time

The non-hierarchically optimized parameter,

Table

Fractions of the annual diffusive fluxes of the total fluxes. Means
and 1

Output CH

Output CO

Means of total CH

Diffusion, plant transport, ebullition, CH

Annual CH

Posterior marginal distributions of the hierarchical parameters from
both MCMC and importance sampling, along with the hyperpriors.
Panels

The parameter

The interannual variability of

Table

A cross validation of the regression modeling in terms of the annual errors
is shown in Figs.

The positive bias in the CO

All years of hierarchically optimized experiments show at least a small
negative annual bias in the methane flux when compared to the available
observations. This can be due to the high day-to-day variability of the
summertime fluxes, which dominate year-round total fluxes, and the fact that
the model cannot, without data about the fine structure and heterogeneity of
the wetland, match the highly variable fluxes. The proportional model–data
residual error component

Another reason is that the model overestimates the carbon dioxide fluxes, so the two fluxes need to be balanced against each other. Because methane production in the wetland also produces carbon dioxide, the optimization algorithm finds a middle ground between the conflicting needs of minimizing carbon dioxide and maximizing methane production.

Additionally, the wintertime methane fluxes are underestimated
systematically, and the emissions start slightly late in early summer, which
produces a negative bias in the total flux even though visually the fit is
good, as can be seen in Fig.

The carbon dioxide time series against flux observations are shown in
Fig.

The input data affect the model fit to the observations, and since NPP is a
modeled quantity, that modeling introduces some additional uncertainty. For
LAI, we note that even though in reality it is not
identical every year, in the model, it follows the same pattern (see
Appendix

The sqHIMMELI model produces the CH

In the following, “all ebullition” refers to any ebullition in the peat
column regardless of whether the bubbles reach the peat column surface.
“Ebullition” refers to the part of all ebullition which reaches the
surface. Most of the time, the water table is under the peat surface, and at
those times ebullition is zero, although all ebullition can be
substantial. In that case, the ebullition flux does not go directly into the
atmosphere, but into the first air-filled peat layer above the water table
level, and continues from there via other pathways. The reason for this
separation comes from implementation details of HIMMELI. In all experiments,
ebullition reaching the surface is a minor fraction of the total CH

For the posterior mean estimate, the flux components and oxidation are shown
as time series in Fig.

Comparing results from simulations with optimized parameters to results using
the default parameter values (shown in Table

Figures

The NPP-based CH

The parameter

The year-to-year variation of the posterior distributions of the

The methane produced by the action of

The exudate pool size follows the net primary production in
Fig.

The methane production from decomposition of peat in anaerobic conditions is
aided by the rather strongly correlated parameters

Methane oxidation is quite steady between the different estimates as can be
seen in Fig.

The stronger oxidation with the default parameter values can also be partly
linked to the larger

The process correlation figure (Fig.

The hard prior bounds of

The parameter

The amount of plant transport in the calibrated models, shown in
Fig.

Since the parameters

The masses of the diffusion coefficient parameters

Ebullition is very strongly tied to diffusion in the flux estimates with
parameters from the posterior, as is shown in Fig.

Contrasting with this, in the simulations with the non-hierarchically
optimized parameters, a major part of the diffusive flux, which comprises
around 30 % of the total flux for most years, is transported by
ebullition (Fig.

The priors of the hierarchical CH

The posterior distributions of

Whereas the fraction of plant transport is stable and high, but still
constrained, not all the parameters affecting root conductivity are
constrained by the data, as the root tortuosity posterior distribution
follows very closely the prior form. The root-ending cross-sectional area,
however, is constrained to its lower side despite there being mass also above
the prior mean value. For this parameter, the importance resampling resulted
in a changed posterior in that there is a lot more mass at the higher end of
the distribution, as can be seen in Fig.

The transport pathways are well identified, as can be seen in the ranges of
variation in the transport characteristics in Fig.

The calibrated sqHIMMELI model is able to describe the CH

Modeled CH

Compared to the estimate with the optimized annual variations of the
methane-production-related parameters, the non-hierarchical posterior mean estimate
produces reasonable flux estimates over the assessment period, with twice the
variability in fluxes compared to the posterior mean estimate, even though
the average of the errors is closer to zero. The variability is seen in
Fig.

In order to be able to utilize the information regarding the annual
variability in the posterior mean estimate for the future prediction of
CH

The analysis revealed that the mean soil temperature of the first 10 weeks
(70 days) of the year at the depth of 30–40 cm, denoted here by

The

The

For the other interannually changing parameter,

A leave-one-out cross validation (LOO-CV; see, e.g.,
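As an illustration of the procedure, leave-one-out cross validation of a simple linear regression can be sketched in Python as follows. The data are synthetic: nine hypothetical predictor values stand in for an annual predictor and nine hypothetical responses stand in for an annually varying parameter. The function names `fit_line` and `loo_cv_errors` are assumptions for this example only.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_cv_errors(x, y):
    """Prediction error for each point when it is left out of the fit."""
    errors = []
    for i in range(len(x)):
        # Fit on all points except the i-th one.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        # Predict the held-out point and record the error.
        errors.append(y[i] - (a + b * x[i]))
    return errors


# Synthetic example with nine "years" (values are made up for illustration).
x = [0.5, 0.8, 1.1, 0.9, 1.4, 0.7, 1.2, 1.0, 0.6]
y = [1.6, 2.3, 3.4, 2.6, 4.1, 2.2, 3.5, 3.1, 1.7]
errors = loo_cv_errors(x, y)
rmse = (sum(err ** 2 for err in errors) / len(errors)) ** 0.5
```

Each error is computed for a year that the regression has not seen, so the root mean square of these errors gives a more honest measure of predictive skill than the in-sample residuals.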

As Fig.

In this study, Bayesian calibration of a new process-based wetland CH

For future studies, combining observations from several sites and optimizing them together with the methods presented here in conjunction with independent validation can provide valuable information about the uncertainties related to wetland emission modeling and about how to best improve the quality of predicting wetland methane emissions in land surface schemes of climate models.

The HIMMELI source code is available as a supplement to
the publication of

The model input data and the flux measurement data are available upon reasonable request from the lead author.

In Sect.

For both CO

The MCMC experiment was performed with a cost function that permissively
allowed for exploration of the parameter space. The

The sum of the absolute values of the

For choosing the order of the autoregressive moving average model (the
ARMA(

The scaling of the model residuals for choosing the ARMA parameters and the
values for

The ARMA(2,1) model parameters and the parameters

MCMC methods are a class of Bayesian methods that
can be used for obtaining the probability distribution

MCMC sampling starts by taking some starting value

According to Markov chain theory, the sampled parameter values will
eventually follow the “target distribution”

In practice, this means that with MCMC it is possible to find a good
approximation of the probability density function of the parameter vector

For efficient convergence of the chain to the posterior distribution, a good
estimate of

The hierarchical parameters

In Gibbs sampling, the full conditional posterior distributions of the
hyperparameters and the parameters

Draw

draw

draw the parameters

The means and variances obtained this way describe the interannual variability of the parameters. Not including them as parameters in the MCMC sampling reduces the dimension of the space that the MCMC sampler needs to explore, which speeds up convergence to the posterior distribution.
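The Gibbs draws of the hyperparameters can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the sampler used in this study: it assumes a Gaussian hyperprior N(m0, s0_sq) for the mean and an inverse-gamma hyperprior InvGamma(a0, b0) for the variance, so that both full conditionals are available in closed form. The annual parameter values `thetas` are held fixed here, whereas in a Metropolis-within-Gibbs scheme they would be updated with a Metropolis step between the Gibbs draws. All names and numerical values are hypothetical.

```python
import random


def gibbs_hyper_step(thetas, mu, var, m0, s0_sq, a0, b0, rng):
    """One Gibbs sweep over the hyperparameters (mu, var) given annual values."""
    n = len(thetas)
    # mu | var, thetas: conjugate Gaussian update with prior N(m0, s0_sq).
    precision = 1.0 / s0_sq + n / var
    cond_mean = (m0 / s0_sq + sum(thetas) / var) / precision
    mu = rng.gauss(cond_mean, (1.0 / precision) ** 0.5)
    # var | mu, thetas: conjugate inverse-gamma update with prior InvGamma(a0, b0).
    shape = a0 + 0.5 * n
    rate = b0 + 0.5 * sum((t - mu) ** 2 for t in thetas)
    # gammavariate takes (shape, scale); the reciprocal of a gamma draw
    # with scale 1/rate is an inverse-gamma draw.
    var = 1.0 / rng.gammavariate(shape, 1.0 / rate)
    return mu, var


# Nine hypothetical annual parameter values (one per year).
thetas = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.0]

rng = random.Random(42)
mu, var = 0.0, 1.0
mu_draws = []
for sweep in range(5000):
    mu, var = gibbs_hyper_step(thetas, mu, var,
                               m0=0.0, s0_sq=100.0, a0=2.0, b0=0.1, rng=rng)
    if sweep >= 500:  # discard the first sweeps as warm-up
        mu_draws.append(mu)
mu_mean = sum(mu_draws) / len(mu_draws)
```

With a weak hyperprior on the mean, `mu_mean` should end up close to the average of the annual values, while the sampled variance reflects how much those values differ from year to year.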

Importance resampling is a method for obtaining samples from a desired
(unnormalized) distribution

The samples

We estimated the net photosynthesis rate,

Both

The daily averages of

JS designed the study with help from the co-authors, programmed the algorithms, performed the model simulations, analyzed the results, and prepared the manuscript and the figures. MR provided and validated the input data and helped with the interpretation of the results. LB contributed several model subroutines and helped to interpret the results. ML provided assistance with getting the technical aspects of the Bayesian analysis right. OP provided insight into the data used. JM, TV, and TA provided helpful critical comments and suggestions that helped to improve the manuscript substantially.

The authors declare that they have no conflict of interest.

We would like to thank the University of Helsinki researchers Pavel Alekseychik and Ivan Mammarella, and Janne Rinne from Lund University for valuable comments and input regarding the Siikaneva measurement site data. We would also like to thank Heikki Haario from Lappeenranta University of Technology, Janne Hakkarainen from Finnish Meteorological Institute, and Samuli Siltanen from University of Helsinki for comments regarding the mathematical aspects of the study.

This work has been supported by the EU LIFE+ project MONIMET LIFE12 ENV/FI/000409, and the EU FP7 project EMBRACE. We additionally acknowledge funding from the RED platform of the Lappeenranta University of Technology and thank the Academy of Finland Center of Excellence (272041), The Strategic Research Council at the Academy of Finland project (312932), CARB-ARC (285630), ICOS Finland (281255), ICOS-ERIC (281250), NCoE eSTICC (57001), EU-H2020 CRESCENDO (641816), EU-H2020 VERIFY (776810), and Academy Professor projects (284701 and 282842).

Edited by: David Lawrence

Reviewed by: two anonymous referees