The Plant–Craig stochastic convection parameterization (version 2.0) is implemented in the Met Office Regional Ensemble Prediction System (MOGREPS-R) and is assessed in comparison with the standard convection scheme, for which stochastic variability comes only from a simple random parameter variation scheme. A set of 34 ensemble forecasts, each with 24 members, is considered, over the month of July 2009. Deterministic and probabilistic measures of the precipitation forecasts are assessed. The Plant–Craig parameterization is found to improve probabilistic forecast measures, particularly the results for lower precipitation thresholds. The impact on deterministic forecasts at the grid scale is neutral, although the Plant–Craig scheme does deliver improvements when forecasts are made over larger areas. The improvements found are greater in conditions of relatively weak synoptic forcing, for which convective precipitation is likely to be less predictable.

Quantitative precipitation forecasting is recognized as one of the most
challenging aspects of numerical weather prediction

One reason for lack of spread in an ensemble is that model variability is
constrained by the number of degrees of freedom in the model, which is
typically much less than that of the real atmosphere. The members of an
ensemble forecast may start with a good representation of the range of
possible initial conditions, but running exactly the same model for each
ensemble member means that the range of possible ways of modelling the
atmosphere – of which the model in question is one – is not fully
considered. Common ways of accounting for model error are running
different models for each ensemble member

Focusing on convective rainfall, and for model grid lengths where convective
rainfall is parameterized, another way of accounting for model error is to
introduce random variability in the convection parameterization itself

Such “stochastic” convection parameterization schemes have been developed
over the last 10 years and are just beginning to be implemented and verified
in operational forecasting set-ups, with some promise for the improvement of
probabilistic ensemble forecasts

The PC scheme has been shown to produce rainfall variability in much better
agreement with cloud-resolving model results than non-stochastic schemes

These are encouraging results, albeit from idealized modelling set-ups, and it
is important to establish whether or not they might translate into better
ensemble forecasts in a fully operational NWP set-up.

Although the version of MOGREPS used here has now been superseded, the present
study represents the first time that the scheme has been verified in an
operationally used ensemble forecasting system for an extended verification
period, and it provides the necessary motivation for more extensive tuning and
verification studies in a more current system. As well as this, the present
study aims to reveal more about the behaviour of the scheme itself, building on work referenced above, as well as on recent work by

The paper compares the performance of the PC scheme with the default MOGREPS
convection parameterization, based on

The

The scheme allows for the vertical profile from the dynamical core to be
averaged in horizontal space and/or in time before it is input. This means
that the input profile is more representative of the large-scale (assumed
quasi-equilibrium) environment and is less affected by the stochastic
perturbations locally induced by the scheme at previous time steps. It was
decided in the present study to use different spatial averaging extents over
ocean and over land, in order that orographic effects were not too heavily
smoothed. The spatial averaging strategy implemented was to use a square of

Initial tests showed that the scheme was yielding too small a proportion of
convective precipitation over the domain. Two further parameters were
adjusted from the study by

There is no correct answer for the convective fraction, which is both model-
and resolution-dependent in current operational practice. For example, the
current ECMWF model has a global average of about

The Met Office Global and Regional Ensemble Prediction System has
been developed to produce short-range probabilistic weather forecasts

An outline of the MOGREPS NAE domain, with its rotated latitude–longitude grid. The contours are for reference and are derived from the data set used in the present study to separate the domain into land and ocean areas. The grey shading shows the region for which radar-derived precipitation data were available.

Locations of the North Pole and the corners of the domain of the NAE rotated grid, in terms of real latitude and longitude.

Stochastic physics is already included in the regional MOGREPS, in the form of
a random parameters scheme, where a number of selected parameters are
stochastically perturbed during the forecast run

The forecasts using the Plant–Craig scheme were obtained by rerunning the regional version of MOGREPS, with the standard convection scheme replaced by the Plant–Craig scheme, and driven by initial and boundary conditions taken from the same archived data that were used for the operational forecasts. These are compared with the forecasts produced operationally during the corresponding period, so that the only difference between the two sets of forecasts is in the convection parameterization scheme. The study used the UM at version 7.3. The model time step was 7.5 min, within which the convection scheme was called twice, and the forecast length was 54 h.

The time period investigated was from 10 until 30 July 2009. This length of time was chosen as being sufficient to obtain statistically meaningful results, without requiring a lengthier experiment that would only be justified by a more mature system. The particular month was chosen partly for convenience and partly because, subjectively, it experienced plentiful convective rain over the UK, and therefore provides a good test of a convective parameterization scheme.

Experimental forecasts with the Plant–Craig scheme were generated twice
daily (at 06:00 and 18:00 UTC) for comparison with the operational forecast,
which was taken from the archive. On some days the archive forecast was
missing and so no experimental forecast was generated. In total 34 forecasts
were generated, with start times shown in Table

Start times of forecasts investigated in this study (all dates in July 2009).

A detailed validation was carried out against Nimrod radar rainfall data

This score (denoted FSS) was developed by

In order to determine whether or not the variability introduced by the
Plant–Craig scheme is added where it is most needed, the Brier skill score

This measure aims to assess the benefit of using an ensemble, as opposed to a
single forecast randomly selected from the ensemble. It was recently
developed and described in detail by

The ensemble added value (EAV) is based on the quantile score (QS)

The QS can, like the Brier score, be decomposed into a reliability and a
resolution component

The reference forecast is created by defining the quantile as simply a randomly selected member of the ensemble, so that the reference forecast represents the score which could have been obtained with only one forecast (a single member is randomly selected, with replacement, once for the entire period but separately for each quantile). The EAV thus measures the quality of the ensemble forecast, relative to the quality of the individual members of the ensemble.
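The construction of the reference forecast can be illustrated with a minimal Python sketch. It is not the code used in the study, and several details are assumptions made for illustration only: the function names, the choice of quantile levels m/(M+1) for an ensemble of M members, the use of the sorted members as the quantile forecasts, and the generic skill-score normalization (reference score minus ensemble score, divided by reference score), which may differ from the exact definition of the EAV.

```python
import numpy as np


def quantile_score(q, y, tau):
    """Mean quantile (pinball) score of forecasts q at level tau, given observations y."""
    diff = y - q
    return np.mean(np.where(diff >= 0, tau * diff, (tau - 1.0) * diff))


def ensemble_added_value(ens, obs, rng):
    """Illustrative EAV for ens of shape (n_cases, n_members) and obs of shape (n_cases,).

    Assumed normalization: (QS_ref - QS_ens) / QS_ref, summed over quantile levels.
    """
    n_members = ens.shape[1]
    # Assumption: the m-th sorted member is the forecast for level m / (M + 1).
    taus = np.arange(1, n_members + 1) / (n_members + 1.0)
    sorted_ens = np.sort(ens, axis=1)
    # Reference: one member drawn with replacement for each quantile level,
    # and kept fixed over the entire period.
    ref_members = rng.integers(0, n_members, size=n_members)
    qs_ens = 0.0
    qs_ref = 0.0
    for m in range(n_members):
        qs_ens += quantile_score(sorted_ens[:, m], obs, taus[m])
        qs_ref += quantile_score(ens[:, ref_members[m]], obs, taus[m])
    return (qs_ref - qs_ens) / qs_ref


rng = np.random.default_rng(0)
ens = rng.normal(size=(2000, 8))   # synthetic, exchangeable members
obs = rng.normal(size=2000)        # synthetic observations
eav = ensemble_added_value(ens, obs, rng)
```

With synthetic data of this kind the sorted ensemble scores better than a randomly chosen member, so the illustrative EAV is positive; an ensemble whose members are all identical gives a value of zero.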

As part of the present study, we extend the work of

The separation into weakly and strongly forced cases was carried out a
posteriori, based on surface analysis charts. The aim here is
not to develop an adaptive forecasting system, but rather to develop
understanding of the behaviour of the Plant–Craig scheme. Nonetheless, the
results may also be interpreted as providing evidence that such a system may
be feasible if the strength of the synoptic forcing could be predicted in
advance (using, for example, the convective adjustment timescale as
discussed by

The separation was conducted by assigning periods with discernible cyclonic
and/or frontal activity over or close to the UK as strongly forced and the
rest as weakly forced, with some additional adjustment of the preliminary
categorization based on the written reports by

Categorization of 12 h periods (centred at the time given) investigated in this study into weak and strong synoptic forcing (all dates in July 2009).

The quality of the respective deterministic forecasts (i.e. those produced by
individual ensemble members, with no supplementary indication of the forecast
uncertainty) using GR and PC is assessed
using Figs.

In general, then, the schemes perform similarly overall, and the impact of
using a stochastic scheme on the FSS is modest. Indeed, the fact that there
is no skill for the highest threshold, for either scheme, is more important.
This lack of skill could be simply due to the fact that the case study period
was too short to obtain a statistically significant sample of extreme rain
events. However, it is also true that MOGREPS significantly overforecasts
heavy rain over the UK for this period (see Fig.

Fractions skill score computed for grid-scale data for the Gregory–Rowntree scheme (top), the Plant–Craig scheme (centre), and the difference between the two schemes (Plant–Craig minus Gregory–Rowntree, bottom).

Fractions skill score for the Gregory–Rowntree scheme
(top), the Plant–Craig scheme (centre), and the difference between the two
schemes (Plant–Craig minus Gregory–Rowntree, bottom). The neighbourhood area is

Fractions skill score for the Gregory–Rowntree scheme
(top), the Plant–Craig scheme (centre), and the difference between the two
schemes (Plant–Craig minus Gregory–Rowntree, bottom). The neighbourhood area is

Figure

PC generally performs better than GR for weakly forced cases and worse for
strongly forced cases. While both schemes benefit from upscaling the score,
this benefit is greater for PC. The results agree well with those of

Moreover, it is clear that the upscaling is more beneficial to the PC scheme (relative to the GR scheme) for the weakly forced cases than for the strongly forced cases. The interpretation is that the PC scheme provides a better statistical description of small-scale, weakly forced convection than a non-stochastic scheme. This will not provide any improvement to the FSS evaluated at the grid scale, since the convection is placed randomly, but it does improve the FSS when it is evaluated over a neighbourhood of grid points, so that it becomes a more statistical evaluation of the quality of the scheme.
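The effect of evaluating the FSS over a neighbourhood can be illustrated with a minimal Python sketch. It follows the commonly used form of the score (one minus the mean squared difference of the neighbourhood fractions, divided by its largest possible value) and is not the verification code used in the study. The function names, the zero padding at the domain edge, the use of ">=" for threshold exceedance, and the reading of a neighbourhood of "n grid boxes in each direction" as a square of side 2n+1 are all assumptions.

```python
import numpy as np


def neighbourhood_fraction(binary, half_width):
    """Fraction of exceeding points in a square of side 2*half_width + 1.

    Points outside the domain are counted as non-exceeding (zero padding).
    """
    n = half_width
    if n == 0:
        return binary.astype(float)
    padded = np.pad(binary.astype(float), n, mode="constant")
    # Summed-area table with a leading row and column of zeros.
    sat = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    sat[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    k = 2 * n + 1
    window_sum = sat[k:, k:] - sat[:-k, k:] - sat[k:, :-k] + sat[:-k, :-k]
    return window_sum / k**2


def fss(forecast, observed, threshold, half_width=0):
    """Fractions skill score for one forecast field and one observed field."""
    f = neighbourhood_fraction(forecast >= threshold, half_width)
    o = neighbourhood_fraction(observed >= threshold, half_width)
    mse = np.mean((f - o) ** 2)
    mse_ref = np.mean(f ** 2) + np.mean(o ** 2)
    if mse_ref == 0:
        return np.nan  # no exceedance in either field
    return 1.0 - mse / mse_ref


# A single rain cell that is forecast one grid box away from where it occurred.
observed = np.zeros((10, 10))
observed[5, 5] = 1.0
forecast = np.zeros((10, 10))
forecast[5, 6] = 1.0

fss_grid = fss(forecast, observed, threshold=0.5, half_width=0)
fss_nbhd = fss(forecast, observed, threshold=0.5, half_width=2)
```

In this toy case the displaced cell scores zero at the grid scale but is rewarded once the fields are compared over a neighbourhood, which is the behaviour described above for randomly placed convection.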

Fractions skill score for the Plant–Craig scheme, minus that for the Gregory–Rowntree scheme, for strongly forced cases (full lines) and weakly forced cases (dashed lines), with no averaging (top), with a neighbourhood area of two grid boxes in each direction (centre), and with a neighbourhood area of four grid boxes in each direction (bottom). The score shown is the average over all lead times.

The quality of the probabilistic forecasts, with respect to forecasts using
the observed climatology, is assessed using Brier skill scores, plotted in
Fig.

Brier skill score for the Gregory–Rowntree scheme (top), the Plant–Craig scheme (centre), and the difference between the two schemes (Plant–Craig minus Gregory–Rowntree, bottom). For the difference plot, instances where both skill scores are lower than zero are not plotted.
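As an illustration of how a Brier skill score relative to the observed climatology can be computed, a minimal Python sketch is given here. It is not the verification code used in the study. The function names are hypothetical, and it is assumed that the forecast probability is the fraction of members reaching the threshold and that the climatological reference is the observed event frequency in the verification sample.

```python
import numpy as np


def brier_score(prob, event):
    """Mean squared difference between forecast probability and binary outcome."""
    return np.mean((prob - event) ** 2)


def brier_skill_score(ens, obs, threshold):
    """Brier skill score for ens of shape (n_cases, n_members) and obs of shape (n_cases,)."""
    event = (obs >= threshold).astype(float)
    # Assumption: probability = fraction of members reaching the threshold.
    prob = np.mean(ens >= threshold, axis=1)
    bs = brier_score(prob, event)
    # Reference forecast: the observed event frequency, issued for every case.
    clim = event.mean()
    bs_ref = brier_score(np.full_like(event, clim), event)
    if bs_ref == 0:
        return np.nan  # the event always or never occurs in the sample
    return 1.0 - bs / bs_ref


obs = np.array([0.0, 0.0, 2.0, 2.0])
perfect = np.tile(obs[:, None], (1, 3))           # every member matches the observation
split = np.tile(np.array([[0.0, 2.0]]), (4, 1))   # half of the members always exceed

bss_perfect = brier_skill_score(perfect, obs, threshold=1.0)
bss_split = brier_skill_score(split, obs, threshold=1.0)
```

A score of one corresponds to a perfect forecast, zero to a forecast no better than climatology, and negative values to a forecast that is worse than climatology.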

The decomposition of the Brier score into reliability (Fig.

Brier score reliability for the Gregory–Rowntree scheme (top), the Plant–Craig scheme (centre), and the difference between the two schemes (Gregory–Rowntree minus Plant–Craig, bottom).

Brier score resolution for the Gregory–Rowntree scheme (top), the Plant–Craig scheme (centre), and the difference between the two schemes (Plant–Craig minus Gregory–Rowntree, bottom).

Figure

Brier skill score for the Gregory–Rowntree scheme (green lines) and the Plant–Craig scheme (red lines), averaged over all lead times, for cases with strong forcing (full lines) and weak forcing (dashed lines), as a function of threshold. The reference for the skill score is the observed climatology. The axes have been chosen to focus on where the skill score is above zero.

The EAV is plotted in Fig.

Note that the ensemble forecasts using the GR scheme also have a positive EAV, representing the value added by the multiple initial and boundary conditions provided by the global model, and by the stochasticity coming from the random parameters scheme. Since these factors are also present in the ensemble forecasts using the PC scheme, the fractional difference between the two EAVs can be interpreted as the value added by the stochastic character of the PC scheme, expressed as a fraction of the value added by all the ensemble generation techniques in MOGREPS.

Ensemble added value (EAV) for the Gregory–Rowntree scheme (green line) and the Plant–Craig scheme (red line) as a function of forecast lead time.

Although Nimrod radar observations were only available over a restricted part
of the forecast domain, it is also of interest to compare the forecasts over
the whole domain. Figure

As discussed in Sect.

The ensemble spread is shown as a function of lead time in Fig.

Figure

Figure

Although a lead time of 30 to 36 h was chosen for Figs.

Convective fraction as a function of forecast lead time, for the Gregory–Rowntree scheme (green lines) and the Plant–Craig scheme (red lines), over land (dashed lines), over ocean (dotted lines), and in total (full lines), for the full NAE domain.

Ensemble spread as a function of forecast lead time, for the Gregory–Rowntree scheme (green lines) and the Plant–Craig scheme (red lines), over land (dashed lines), over ocean (dotted lines), and in total (full lines), for the full NAE domain.

Density plots for accumulated rainfall for the period of 30 to 36 h lead time, over the UK part of the domain, for forecasts with the Gregory–Rowntree scheme (green line), the Plant–Craig scheme (red line), and observations (black line).

A validation using the routine verification system was also performed for the
two set-ups, covering land areas over the whole forecast domain. This
calculates various forecast skill scores, by comparing against SYNOP
observations at the surface and at a height of 850 hPa, and yielded a mixed
assessment of the performance of the PC scheme against the GR scheme. For
example, the continuous ranked probability score, which assesses both the
forecast error and how well the ensemble spread predicts the error

Density plots for accumulated rainfall for the period of 30 to 36 h lead time, over the entire NAE domain, for forecasts with the Gregory–Rowntree scheme (green line) and the Plant–Craig scheme (red line) over ocean.

A physically based stochastic scheme for the parameterization of deep convection has been evaluated by comparing probabilistic rainfall forecasts produced using the scheme in an operational ensemble system with those from the same ensemble system with its standard deep convection parameterization. The impact of using a stochastic scheme on deterministic forecasts is broadly neutral, although there is some improvement when larger areas are assessed. This is relevant to applications such as hydrology, where rainfall over an area larger than a grid box can be more relevant than rainfall on the grid box scale.

The Plant–Craig scheme has been shown to have a positive impact on
probabilistic forecasts for light and medium rainfall, while neither scheme
is able to skillfully forecast heavy rainfall. The impact of the scheme is
greater for weakly forced cases, where subgrid-scale variability is more
important.

Although the Plant–Craig scheme clearly produces improved probabilistic
forecasts, it is not certain whether this is due to its stochasticity, due to
different underlying assumptions between it and the standard convection
scheme, or simply due to the decrease in convective fraction seen in this
implementation. In order to make a clean distinction, further studies could
be performed in which the performance of the Plant–Craig scheme is compared
against its own non-stochastic counterpart, which can be constructed by using
the full cloud distribution and appropriately normalizing, instead of
sampling randomly from it

The results of this study justify further work to investigate the impact of the Plant–Craig scheme on ensemble forecasts. Since the version of MOGREPS used in this study has been superseded, it is not feasible to carry out a more detailed investigation beyond the proof of concept carried out in the present study. Interestingly, the resolution used in this study is now becoming more widely used in global ensemble forecasting, and so future work could involve implementing the scheme in a global NWP system, for example the global version of MOGREPS. This would enable assessments to be made as to whether the scheme provides benefits for the representation of tropical convection, in addition to those aspects of mid-latitude convection that were demonstrated here.

The source code for the Plant–Craig parameterization, as it was used in this study, can be made available on request, by contacting r.s.plant@reading.ac.uk.

We would like to thank Neill Bowler for helping to plan and set up the numerical experiments, and Rod Smyth for helping to set up preliminary experiments on MONSOON. We thank the two anonymous reviewers for comments and suggestions, which have greatly improved and clarified the manuscript.

Edited by: H. Tost