BUMPER v1.0: a Bayesian user-friendly model for palaeo-environmental reconstruction
- 1 Earth, Environment and Ecosystems, The Open University, Walton Hall, Milton Keynes MK7 6AA, UK
- 2 Department of Biology, University of Bergen, P.O. Box 7803, 5020 Bergen, Norway
- 3 Environmental Change Research Centre, University College London, London WC1E 6BT, UK
- 4 Department of Entomology, Natural History Museum, Cromwell Road, London SW7 5BD, UK
- 5 Department of Biological Sciences, Florida Institute of Technology, 150 West University Boulevard, Melbourne, FL 32901, USA
- 6 The Johns Hopkins University Applied Physics Laboratory, 11000 Johns Hopkins Road, Laurel, MD 20723, USA
Abstract. We describe the Bayesian user-friendly model for palaeo-environmental reconstruction (BUMPER), a Bayesian transfer function for inferring past climate and other environmental variables from microfossil assemblages. BUMPER is fully self-calibrating, straightforward to apply, and computationally fast, requiring ∼2 s to build a 100-taxon model from a 100-site training set on a standard personal computer. We apply the model's probabilistic framework to generate thousands of artificial training sets under ideal assumptions. We then use these to demonstrate the sensitivity of reconstructions to the characteristics of the training set, considering assemblage richness, taxon tolerances, and the number of training sites. We find that a useful guideline for the size of a training set is to provide, on average, at least 10 samples of each taxon. We demonstrate general applicability to real data, considering three different organism types (chironomids, diatoms, pollen) and different reconstructed variables. An identically configured model is used in each application, the only change being the input files that provide the training-set environment and taxon-count data. The performance of BUMPER is shown to be comparable with weighted average partial least squares (WAPLS) in each case. Additional artificial datasets are constructed with similar characteristics to the real data, and these are used to explore why performance differs between the training sets.
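The training-set size guideline can be checked against a taxon-count table before any model is built. The Python sketch below is illustrative only and is not part of BUMPER. It assumes that "samples of each taxon" means the number of training sites at which a taxon has a non-zero count, and that the counts are held as one row per site and one column per taxon. The function names, the `threshold` parameter, and the toy data are hypothetical.

```python
def occurrences_per_taxon(counts):
    """Number of sites at which each taxon has a non-zero count.

    counts: list of rows, one row per training site, one column per taxon.
    Assumption: a taxon is "sampled" at a site if its count there is > 0.
    """
    n_taxa = len(counts[0])
    return [sum(1 for site in counts if site[j] > 0) for j in range(n_taxa)]


def mean_occurrences(counts):
    """Average, over taxa, of the number of sites containing each taxon."""
    occ = occurrences_per_taxon(counts)
    return sum(occ) / len(occ)


def meets_guideline(counts, threshold=10):
    """True if taxa occur, on average, at `threshold` or more sites."""
    return mean_occurrences(counts) >= threshold


# Hypothetical toy training set: 4 sites (rows) x 3 taxa (columns).
toy_counts = [
    [5, 0, 2],
    [0, 0, 7],
    [3, 1, 0],
    [8, 0, 4],
]

print(occurrences_per_taxon(toy_counts))
print(mean_occurrences(toy_counts))
print(meets_guideline(toy_counts))
```

Because the guideline is stated as an average, a training set can satisfy it while still containing rare taxa that occur at only one or two sites. The per-taxon list returned by `occurrences_per_taxon` makes those taxa easy to spot.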