Articles | Volume 9, issue 10
Methods for assessment of models
04 Oct 2016

Consistent assimilation of multiple data streams in a carbon cycle data assimilation system

Natasha MacBean, Philippe Peylin, Frédéric Chevallier, Marko Scholze, and Gregor Schürmann

Abstract. Data assimilation (DA) methods provide a rigorous statistical framework for constraining parametric uncertainty in land surface models (LSMs), which in turn helps to improve their predictive capability and to identify areas in which the representation of physical processes is inadequate. The increase in the number of available datasets in recent years allows us to address different aspects of the model at a variety of spatial and temporal scales. However, combining data streams in a DA system is not a trivial task. In this study we highlight some of the challenges surrounding multiple data stream assimilation for the carbon cycle component of LSMs. We give particular consideration to the assumptions associated with the types of inversion algorithm that are typically used when optimising global LSMs, namely Gaussian error distributions and linearity in the model dynamics. We explore the effect of biases and inconsistencies between the observations and the model (resulting in non-Gaussian error distributions), and we examine the difference between a simultaneous assimilation (in which all data streams are included in one optimisation) and a step-wise approach (in which each data stream is assimilated sequentially) in the presence of non-linear model dynamics. In addition, we perform a preliminary investigation into the impact of correlated errors between two data streams for two cases: when the correlated observation errors are included in the prior observation error covariance matrix, and when they are ignored. We demonstrate these challenges by assimilating synthetic observations into two simple models: the first a simplified version of the carbon cycle processes represented in many LSMs and the second a non-linear toy model. Finally, we provide some perspectives and advice to other land surface modellers wishing to use multiple data streams to constrain their model parameters.
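
The distinction between simultaneous and step-wise assimilation can be illustrated with a minimal sketch. It is not taken from the paper: the scalar parameter, the linear observation model y = h * x, the independent Gaussian errors, and all numerical values and names (`gaussian_update`, `stream_a`, `stream_b`) are assumptions made purely for illustration. Under these idealised assumptions the two approaches should give the same posterior, provided the posterior variance from the first step is carried into the second. The abstract examines how the two approaches differ once the model dynamics are non-linear.

```python
def gaussian_update(mean, var, streams):
    """Bayesian update of one scalar parameter x for a linear model y = h * x.

    Assumes a Gaussian prior and independent Gaussian observation errors
    (illustrative assumptions, not the paper's setup).
    """
    precision = 1.0 / var   # prior precision (inverse variance)
    info = mean / var       # precision-weighted prior mean
    for stream in streams:
        for y, h in zip(stream["obs"], stream["h"]):
            precision += h * h / stream["var"]
            info += h * y / stream["var"]
    post_var = 1.0 / precision
    return info * post_var, post_var


# Hypothetical prior and two synthetic data streams (made-up values).
prior_mean, prior_var = 1.0, 4.0
stream_a = {"obs": [2.1, 1.9, 2.0], "h": [1.0, 1.0, 1.0], "var": 0.25}
stream_b = {"obs": [6.3, 5.8], "h": [3.0, 3.0], "var": 1.0}

# Simultaneous: both data streams enter a single update.
sim_mean, sim_var = gaussian_update(prior_mean, prior_var, [stream_a, stream_b])

# Step-wise: assimilate stream A, then use its posterior as the prior for B.
step_mean, step_var = gaussian_update(prior_mean, prior_var, [stream_a])
step_mean, step_var = gaussian_update(step_mean, step_var, [stream_b])

print("simultaneous:", sim_mean, sim_var)
print("step-wise:   ", step_mean, step_var)
```

In this linear, Gaussian, uncorrelated case the two printed posteriors agree to within floating-point rounding. The sketch does not cover non-linear models, biased observations or correlated errors between streams, which are the situations the study is concerned with.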

Short summary
Model projections of the response of the terrestrial biosphere to anthropogenic emissions are uncertain, in part due to unknown fixed parameters in a model. Data assimilation (DA) can address this by using observations to optimise these parameter values. Using multiple types of data is beneficial for constraining different model processes, but it can also pose challenges in a DA context. This paper demonstrates and discusses the issues involved using toy models and examples from existing literature.