Articles | Volume 6, issue 6
Geosci. Model Dev., 6, 2087–2098, 2013

Development and technical paper 16 Dec 2013

Correction of approximation errors with Random Forests applied to modelling of cloud droplet formation

A. Lipponen1, V. Kolehmainen1, S. Romakkaniemi1, and H. Kokkola2
  • 1Department of Applied Physics, University of Eastern Finland, P.O. Box 1627, 70211 Kuopio, Finland
  • 2Finnish Meteorological Institute, Kuopio Unit, P.O. Box 1627, 70211 Kuopio, Finland

Abstract. In atmospheric models, limits on computation time and other resources mean that physical processes have to be simulated with reduced (i.e. simplified) models. Using a reduced model, however, introduces errors into the simulation results. These errors are referred to as approximation errors. In this paper, we propose a novel approach to correcting them. We model the approximation error as an additive noise process in the simulation model and employ the Random Forest (RF) regression algorithm to construct a computationally low-cost predictor for the approximation error. In this way, the overall simulation problem is decomposed into two separate and computationally efficient problems: solving the reduced model and predicting the realisation of the approximation error. The approach is tested on approximation errors caused by a reduced, coarse sectional representation of the aerosol size distribution in a cloud droplet formation calculation, and on compensating for the uncertainty caused by the aerosol activation parameterization itself. The results show a significant improvement in simulation accuracy compared with conventional simulation using the reduced model alone. The proposed approach is rather general, and extending it to other parameterizations or reduced process models coupled to geoscientific models is straightforward. Another major benefit of the method is that it can be applied to physical processes that depend on a large number of variables, which makes them difficult to parameterize by traditional methods.
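
The decomposition described in the abstract (reduced model plus a predicted additive error) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: `accurate_model` and `reduced_model` are hypothetical toy functions rather than the cloud droplet formation models, the predictor inputs are simply taken to be the model inputs, and scikit-learn's `RandomForestRegressor` stands in for whatever RF implementation the authors used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def accurate_model(x):
    """Hypothetical 'accurate' model (toy stand-in for an expensive simulation)."""
    return np.exp(-x[:, 0]) * np.sin(3.0 * x[:, 1]) + x[:, 0] * x[:, 1]


def reduced_model(x):
    """Hypothetical reduced model: linearised decay, coupling term dropped."""
    return (1.0 - x[:, 0]) * np.sin(3.0 * x[:, 1])


def rmse(a, b):
    """Root-mean-square difference between two arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


rng = np.random.default_rng(0)

# Training set: run both models offline and record the approximation error,
# treated as additive: accurate = reduced + error.
X_train = rng.uniform(0.0, 1.0, size=(2000, 2))
e_train = accurate_model(X_train) - reduced_model(X_train)

# Fit a Random Forest regressor that predicts the error from the inputs.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, e_train)

# At simulation time only the cheap reduced model and the forest are evaluated.
X_test = rng.uniform(0.0, 1.0, size=(500, 2))
y_true = accurate_model(X_test)  # computed here only to measure accuracy
y_reduced = reduced_model(X_test)
y_corrected = y_reduced + forest.predict(X_test)

rmse_reduced = rmse(y_reduced, y_true)
rmse_corrected = rmse(y_corrected, y_true)
print(f"RMSE, reduced model only:    {rmse_reduced:.4f}")
print(f"RMSE, reduced + RF correction: {rmse_corrected:.4f}")
```

In this toy setting the accurate model is needed only to build the training set; once the forest is trained, each prediction costs one reduced-model evaluation plus one forest evaluation.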