Submitted as: methods for assessment of models
11 Oct 2018
Status: this preprint was under review for the journal GMD but the revision was not accepted.

Common metrics of calibration for continuous Gaussian data and exceedance probabilities

Rita Glowienka-Hense, Andreas Hense, Thomas Spangehl, and Marc Schröder

Abstract. We discuss a framework of ensemble forecast verification tools founded on the concept of information entropy. All measures can be expressed on a common yardstick, namely that of "correlation". With these measures, calibration is deduced from the balance between ensemble sharpness and resolution. Because they share the same units, these features can be shown in one diagram for both continuous time series from Gaussian processes and exceedance probabilities; the latter are usually tested with the reliability term of the Brier score. The sharpness and resolution terms make it possible to use the same vocabulary of over- and underdispersion that is established for frequency histograms. The concept rests on the fact that the mutual information (MI) of two Gaussian processes is directly related to Pearson's anomaly correlation. Furthermore, MI can be written as the Kullback-Leibler divergence between the conditional probability of the observations given the model forecasts and the unconditional probability of the observations. Thus, MI is a measure of resolution. The mean of the UTILITY defined by Kleeman (2002) is the corresponding measure of sharpness. For Gaussian processes, the mean UTILITY is very close to the ratio of ensemble mean variance to mean ensemble variance (ANOVA), which is the analysis of variance factor when time is taken as the treatment. The ensemble spread score (ESS) (Palmer et al., 2006) is shown to be a measure of calibration if model and observed data are scaled with their respective means and standard deviations. For exceedance probabilities, the resolution term of the divergence score (Weijs et al., 2010) is already defined as an MI term; here it is complemented with a mean UTILITY formed in the same way as the resolution term but from the forecasts only. The entropy terms are then rescaled to the "correlation" yardstick. The concept is applied to temperature data from the German project on decadal climate prediction, Mittelfristige Klimaprognose (MiKlip).
It is shown that both over- and underdispersion can be found for the 2 m temperature forecasts. Increasing ensemble sharpness of surface ocean temperature with lead year in the Southern Ocean hints at model-data inconsistencies at some ocean locations. Finally, empirical orthogonal functions (EOFs) of Northern Hemisphere annual mean surface temperature are determined for ERA-40/ERA-Interim and for the MiKlip retrospective hindcasts. For both data sets, the respective first EOF represents the low-frequency temperature development. The time coefficients of the EOFs are used to compare resolution and sharpness of continuous data and exceedance probabilities in one diagram.
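The relation between mutual information and correlation that the abstract relies on can be illustrated with a small numerical sketch. It uses the standard identity for two jointly Gaussian variables, MI = -0.5 ln(1 - r^2), and its inverse as one possible way of mapping an entropy term back to a "correlation" yardstick. The function names, the synthetic data, and the reading of the ANOVA factor as "variance of the ensemble mean over time divided by the time-mean ensemble variance" are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def gaussian_mutual_information(r):
    """MI in nats of two jointly Gaussian variables with correlation r."""
    return -0.5 * np.log(1.0 - r**2)


def mi_to_correlation(mi):
    """Invert the Gaussian relation: map an MI value (nats) back to |r|."""
    return np.sqrt(1.0 - np.exp(-2.0 * mi))


def anova_ratio(ensemble):
    """Variance of the ensemble mean over time divided by the
    time-mean ensemble variance (assumed reading of the ANOVA factor).

    ensemble: array of shape (n_times, n_members)
    """
    var_of_mean = ensemble.mean(axis=1).var(ddof=1)
    mean_of_var = ensemble.var(axis=1, ddof=1).mean()
    return var_of_mean / mean_of_var


# Synthetic example (hypothetical data): a common signal plus member noise.
rng = np.random.default_rng(0)
n_times, n_members = 200, 10
signal = rng.normal(size=n_times)
ensemble = signal[:, None] + 0.5 * rng.normal(size=(n_times, n_members))
obs = signal + 0.5 * rng.normal(size=n_times)

# Resolution-like quantity: correlation of ensemble mean with observations.
r = np.corrcoef(ensemble.mean(axis=1), obs)[0, 1]
mi = gaussian_mutual_information(r)

# Sharpness-like quantity: ANOVA ratio of the ensemble.
ratio = anova_ratio(ensemble)
```

In this toy setup, `mi_to_correlation(mi)` recovers `abs(r)`, which is the sense in which an entropy measure and a correlation can share one scale for Gaussian data.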

Status: closed

Short summary
Ensemble forecast verification addresses forecast errors and the uncertainty estimated from ensemble spread. We suggest measures based on relative entropy. For continuous variables, correlation and the mean ratio of the ensemble spread to climate variance (analysis of variance, ANOVA) are related to these entropies. For categorical data, corresponding scores are derived that allow comparison with continuous data.
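For categorical data such as threshold exceedances, a mutual-information term can be estimated directly from relative frequencies. The sketch below computes the MI between a forecast category and a binary observed event, which is equivalent to the frequency-weighted Kullback-Leibler divergence of the conditional from the climatological event frequency. The function name, the toy arrays, and the use of the Gaussian inverse sqrt(1 - exp(-2 MI)) as the rescaling are assumptions for illustration; the rescaling used in the paper may differ.

```python
import numpy as np


def discrete_mutual_information(forecast_cat, observed):
    """MI in nats between a categorical forecast and a categorical
    observation, estimated from relative frequencies."""
    forecast_cat = np.asarray(forecast_cat)
    observed = np.asarray(observed)
    mi = 0.0
    for f in np.unique(forecast_cat):
        p_f = np.mean(forecast_cat == f)
        for o in np.unique(observed):
            p_o = np.mean(observed == o)
            p_fo = np.mean((forecast_cat == f) & (observed == o))
            if p_fo > 0:  # empty cells contribute nothing
                mi += p_fo * np.log(p_fo / (p_f * p_o))
    return mi


# Toy data (hypothetical): 1 = threshold exceeded, 0 = not exceeded.
perfect = discrete_mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
useless = discrete_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])

# Assumed rescaling to a correlation-like value via the Gaussian inverse.
perfect_as_corr = np.sqrt(1.0 - np.exp(-2.0 * perfect))
```

With these toy arrays, the perfectly matching forecast yields an MI of ln 2 and the independent one yields zero. Note that under this assumed rescaling a binary event with equal class frequencies cannot reach a value of 1, since its MI is bounded by ln 2.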