In this study, we assess the skill of a stochastic weather generator (SWG) to forecast precipitation in several cities in western Europe. The SWG is based on a random sampling of analogs of the geopotential height at 500 hPa (

Results show significant positive skill scores (continuous ranked probability skill score and correlation) compared with persistence and climatology forecasts for lead times of 5 and 10 d for different areas in Europe. We find that the low-predictability episodes of our model are related to specific weather regimes, depending on the European region. Comparing the SWG forecasts to ECMWF forecasts, we find that the SWG performs well at a lead time of 5 d, although this performance varies from one region to another. This paper is a proof of concept for a stochastic regional ensemble precipitation forecast. Its parameters (e.g., region for analogs) must be tuned for each region in order to optimize its performance.

Ensemble weather forecasts were designed to overcome the issues of meteorological chaos, in which small uncertainties in initial conditions can lead to a wide range of possible trajectories

Forecasts issued by meteorological centers are obtained by computing several simulations with perturbed initial conditions, in order to sample uncertainties. Those experiments are rather costly in terms of computing resources and are generally limited to a few tens of members

From a mathematical point of view, computing the probability distribution of the trajectories of a (deterministic) system makes the underlying assumption that the system behaves like a stochastic process, for which statistical properties are defined naturally

The first stochastic weather generators were devised to simulate rainfall occurrence

Stochastic weather generators can be useful complements to atmospheric circulation models, in order to simulate large ensembles of local variables, as they can be calibrated for small spatial scales compared with numerical models

A successful simulation with an SWG relies on the choice of inputs. The atmospheric circulation can be chosen as a predictor for other local variables. The (loose) rationale for this choice is that the circulation is modeled by prognostic equations

Analogs of circulation were initially designed to provide “model-free” forecasts by assuming that similar situations in atmospheric circulation may lead to similar local weather conditions

Alternative systems of analogs to forecast precipitation have been proposed by

We will evaluate the seasonal dependence of the forecast skills of precipitation and the conditional dependence on weather regimes. Finally, comparisons with medium-range precipitation forecasts from the ECMWF will be performed.

The paper is divided as follows: Section

Daily precipitation data were obtained from the European Climate Assessment and Data (ECAD) project

We recovered the geopotential height at 500 hPa (

We also used the atmospheric reanalysis (version 5) of the European Centre for Medium-Range Weather Forecasts (ECMWF) (ERA5;

We considered the daily averages of

In order to assess the predictive skill of our precipitation forecast model, a comparison with another forecast was made. Many available datasets can be used for deriving this information. We considered the ECMWF ensemble forecast dataset system 5

The first step is to build a database of analogs of the atmospheric circulation. We outline the procedure of

For each day

As a refinement over the study of

We consider different geographic domains as shown in Fig.

Parameters of the analog computation.
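The analog search can be illustrated with a minimal sketch. Several choices below are assumptions made for illustration and are not taken from this text: the Euclidean distance between flattened fields, the exclusion of the target day's own year, the calendar window `window_days`, and all function and parameter names.

```python
import math

def find_analogs(fields, dates, target, k=20, window_days=30):
    """Return the indices of the k closest circulation analogs of day `target`.

    fields: list of flattened geopotential height maps, one per day.
    dates:  list of (year, day_of_year) tuples aligned with `fields`.
    Assumptions (illustrative only): Euclidean distance, candidates taken
    from other years and within `window_days` calendar days of the target.
    """
    year_t, doy_t = dates[target]
    candidates = []
    for i, (year, doy) in enumerate(dates):
        if year == year_t:
            continue  # skip the target's own year
        gap = abs(doy - doy_t)
        gap = min(gap, 365 - gap)  # circular calendar distance
        if gap > window_days:
            continue
        candidates.append((math.dist(fields[i], fields[target]), i))
    candidates.sort()  # closest fields first
    return [i for _, i in candidates[:k]]
```

The returned indices point to the days whose circulation is most similar to that of the target day. The number of analogs `k` is one of the parameters that would need tuning.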

We use a stochastic weather generator (SWG) based on a random sampling of the circulation analogs. The operation of the SWG and its design are detailed by

The procedure presented above is repeated

Then

We performed a hindcast exercise, where the forecasts of precipitation based on analogs of atmospheric circulation (

The simulations of this stochastic model will be called “SWG forecasts”, as opposed to ECMWF forecasts.
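A random sampling of analogs can be sketched as follows, under stated assumptions. At each step, one analog of the current day is drawn at random and the trajectory moves to the day that follows this analog. The uniform draw, the use of the time average along the trajectory, and all names (`analogs`, `precip`, `swg_forecast`) are hypothetical simplifications, not the exact design of the SWG used here.

```python
import random

def swg_trajectory(analogs, precip, start, lead_time, rng):
    """Simulate one trajectory and return its mean precipitation.

    analogs: dict mapping a day index to the list of its analog day indices.
    precip:  list of observed daily precipitation, indexed by day.
    Assumption: analogs are drawn uniformly, without sampling weights.
    """
    day = start
    values = []
    for _ in range(lead_time):
        picked = rng.choice(analogs[day])  # random analog of the current day
        day = picked + 1                   # move to the day after that analog
        values.append(precip[day])
    return sum(values) / len(values)

def swg_forecast(analogs, precip, start, lead_time, n_members=100, seed=0):
    """Build an ensemble by repeating the random trajectory n_members times."""
    rng = random.Random(seed)
    return [swg_trajectory(analogs, precip, start, lead_time, rng)
            for _ in range(n_members)]
```

Each ensemble member is one random path through the analog database. The spread of the members gives the forecast distribution for the chosen lead time.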

Forecast verification is the process of determining the statistical quality of forecasts. A wide variety of ensemble forecast verification procedures exists

Weather regimes over Europe from SLP fields. Upper panels

The temporal rank correlation (referred to as correlation skill) is calculated between the precipitation observations and the median of 100 simulations. This simple diagnostic is often used to assess forecast skills of indices
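As an illustration, the correlation skill can be computed as a Spearman rank correlation between the observations and the ensemble median. The sketch below uses only the standard library. The function names and the average-rank treatment of ties are assumptions for illustration.

```python
import statistics

def average_ranks(values):
    """Return 1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def correlation_skill(observations, ensembles):
    """Rank correlation between observations and the ensemble medians."""
    medians = [statistics.median(members) for members in ensembles]
    return pearson(average_ranks(observations), average_ranks(medians))
```

Here `ensembles` holds one list of simulated values per forecast date, aligned with `observations`.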

The continuous ranked probability score (CRPS) is widely used for probabilistic forecast verification

If the ensemble forecast

The CRPS is computed as

One way of circumventing this difficulty is to compare CRPS values to reference forecasts, such as persistence or climatology. The continuous ranked probability skill score (CRPSS) is a normalization of Eq. (

The CRPSS is hence computed by

The values of the CRPSS vary between
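The two scores can be sketched for a finite ensemble as follows. The CRPS below uses the standard empirical form (mean absolute error of the members minus half the mean absolute difference between members), which equals the integrated squared difference between the ensemble empirical distribution function and the step function at the observation. The skill score normalizes the time-averaged CRPS by that of a reference forecast. Function names are hypothetical, and this is not necessarily the exact estimator used in this study.

```python
def ensemble_crps(members, obs):
    """Empirical CRPS of an ensemble for one scalar observation."""
    m = len(members)
    # Mean absolute distance between the members and the observation.
    error_term = sum(abs(x - obs) for x in members) / m
    # Half the mean absolute distance between all pairs of members.
    spread_term = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return error_term - spread_term

def crpss(crps_forecast, crps_reference):
    """Skill score of a forecast relative to a reference (e.g. climatology)."""
    mean_forecast = sum(crps_forecast) / len(crps_forecast)
    mean_reference = sum(crps_reference) / len(crps_reference)
    return 1.0 - mean_forecast / mean_reference
```

With this convention, a positive CRPSS indicates that the forecast improves on the reference, and a negative value indicates that it does not.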

We use the CRPSS values to determine the maximum lead time

We investigated the role of North Atlantic weather patterns on the forecast quality by attributing CRPS values of the SWG precipitation simulations to weather regimes.
Weather regimes are defined as large-scale quasi-stationary atmospheric states. They are characterized by their recurrence, persistence, and stationarity

Time series of analog ensemble forecasts for 2002, for lead times of 5 d

The North Atlantic weather regimes were computed with the procedure of

The winter weather regimes are the negative phase of the North Atlantic oscillation (NAO

For each day (in winter and summer) between 1948 and 2019, we classified the SLP field by minimizing the root mean square distance to the four reference (1970–2010) weather regimes.
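This nearest-regime assignment can be sketched as follows. The regime labels and field values used with it are placeholders, and the fields are assumed to be flattened SLP maps on a common grid.

```python
import math

def rms_distance(field, reference):
    """Root mean square distance between two flattened fields."""
    squared = sum((a - b) ** 2 for a, b in zip(field, reference))
    return math.sqrt(squared / len(field))

def classify_regime(slp_field, reference_regimes):
    """Return the label of the reference regime closest to `slp_field`.

    reference_regimes: dict mapping a regime label to its reference field.
    """
    return min(reference_regimes,
               key=lambda name: rms_distance(slp_field, reference_regimes[name]))
```

Applying `classify_regime` to each daily field gives the time series of regime labels to which the CRPS values can then be attributed.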

Skill scores for the precipitation of Orly, Madrid, Berlin, and Toulouse for lead times of 5, 10, 20 d for January (blue) and July (red) for analogs computed from reanalyses of NCEP. Squares indicate CRPSS where the persistence is the baseline, triangles indicate CRPSS where the climatology is the reference, and boxplots indicate the probability distribution of correlation between observation and the median of 100 simulations for all days.
The boxplot upper whisker is:

For each day

We evaluated the influence of the dominating weather regimes on the SWG forecast quality by plotting the probability distribution of CRPS values

We identified two classes of predictability from CRPS values:

Low predictability is related to high values of CRPS, above the 75th percentile.

High predictability is linked to low values of CRPS, below the 25th percentile.
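The two classes can be obtained from the quartiles of the CRPS values, as in the sketch below. The `"intermediate"` label for the remaining values and the default quantile estimator of `statistics.quantiles` are assumptions made for illustration.

```python
import statistics

def predictability_classes(crps_values):
    """Label each CRPS value as 'low', 'high' or 'intermediate' predictability."""
    q25, _, q75 = statistics.quantiles(crps_values, n=4)
    labels = []
    for value in crps_values:
        if value > q75:
            labels.append("low")           # poor forecasts: high CRPS
        elif value < q25:
            labels.append("high")          # good forecasts: low CRPS
        else:
            labels.append("intermediate")  # not used in the two classes
    return labels
```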

We started by verifying the relationship between

Then we empirically adjusted the parameters of the SWG simulations to optimize the forecast scores. The first parameter is the geographic area. We computed sample trajectories of the SWG for the four domains outlined in Fig.

Correlation between observations and the median of 100 simulations for the winter (DJF) for the different studied domains represented in Fig.

The second parameter is the number

CRPSS versus persistence and climatology for SWG simulations with 5, 10, and 20 analogs for the [30

We quantified the dependence of the forecast on the time embedding for the analogs

Correlation between observations and the median of 100 simulations for the winter (DJF) based on analogs computed with an embedding of 1 and 4 d for the geographic domain with the coordinates [30

For comparison purposes, SWG simulations were obtained using analogs computed from both the NCEP and ERA5 reanalyses.
By comparing their skill scores, we found that the CRPSS and the correlations between observations and simulations are positive in both cases, and therefore show an improvement over persistence and climatology forecasts. The CRPSS and correlation for simulations with analogs of NCEP are almost identical to those with ERA5, as shown in Table

Comparison between the values of the CRPSS of the SWG computed using the NCEP and ERA5 reanalysis datasets from 1979 to 2019 for a lead time of

In summary, we made the forecast of the precipitation using

As an example, we illustrate the behavior of the trajectories in Orly for the summer and winter of 2002. Figure

Correlation between observations and the median of 100 simulations for both seasons, winter (DJF) and summer (JJA), for a lead time of 5 d.

We observe that the 5th and 95th quantiles of the simulations include the different values of observations. This heuristically confirms the good skill of the SWG to forecast precipitation from

The difference in the forecast correlation skills between the four studied locations may be related to the variation of the local climate from one region to another. The studied areas are in different climate types according to the Köppen–Geiger climate classification

In this paper, we decided (for simplicity) to use the same analogs to forecast precipitation for those four stations as discussed in Sect.

The CRPSS and correlation skill scores are computed for the four studied stations (Berlin, Madrid, Orly, and Toulouse), as shown in Fig.

Percentage of each weather regime for observation dates (Obs) and the most frequent weather regime from SWG simulations between

In this paper, we chose to present the results for summer and winter to highlight the capacity of the SWG to forecast the precipitation in extreme seasons. We focus on January and July in order to show the skill of the SWG in predicting precipitation in different conditions.

The CRPSS against the persistence and climatology references show positive values for lead times of up to 20 d (Fig.

This means that a persistence forecast for Orly is likely to be skillful, even for longer lead times, especially in the summer. Therefore, the trends in CRPSS values for different lead times are probably due to the intrinsic time persistence of local precipitation.

The CRPSS against the climatology reference (triangles in Fig.

The correlation skill is positive for both seasons but higher in winter (January) than in summer (July). For a lead time of 5 d, the correlation is equal to 0.59 for Madrid, 0.50 for Berlin, and 0.40 for Toulouse. For a lead time of 10 d, it is equal to 0.42 for Madrid, 0.30 for Berlin, and 0.41 for Toulouse.

The SWG was tested by

From a visual inspection of the CRPSS and correlations, we chose to focus on lead times of

We investigated the role of North Atlantic weather patterns defined in Sect.

We started by comparing the frequencies of the weather regimes from the observations and the most frequent weather regime found in SWG simulations for a given lead time

Then we looked at the relation between weather regimes and CRPS values by using the most frequent weather regime within

Relation between CRPS and weather regimes for Orly, for SWG forecasts with lead time

The weather regime signal for “good” forecasts depends on the season and the considered station.
When the forecast has a low CRPS value (for Orly), we find that the Scandinavian blocking regime slightly dominates (green bar in Fig.

The weather regime signal for “poor” forecasts also yields a dependence on the season and station.
Higher CRPS values are obtained with the Zonal regime in the summer for Orly (red line in Fig.

This relation between predictability (or the CRPS distribution) and weather regimes, albeit weak, is consistent with previous work of

We first compared the CRPSS of SWG forecasts for winter and summer with the CRPSS of ECMWF forecasts.

The CRPSS of the ECMWF forecast is computed for different lead times going from 1 to 10 d for precipitation

We made a quantitative comparison between the two forecasts for the different lead times. We computed the CRPS for the ECMWF forecast. Then we used the Kolmogorov–Smirnov (KS) test (

Empirical cumulative distribution function of the CRPS of ECMWF (blue) and SWG (red) forecasts for 5 d, for Orly
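The comparison of two CRPS distributions can be illustrated with the two-sample Kolmogorov–Smirnov statistic, i.e., the largest vertical gap between the two empirical cumulative distribution functions. The sketch below computes only the statistic, not the p-value. The function name is hypothetical, and a library routine such as `scipy.stats.ks_2samp` could be used instead where SciPy is available.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for x in a + b:  # the gap can only change at observed values
        cdf_a = bisect_right(a, x) / len(a)  # fraction of sample_a <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of sample_b <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap
```

A statistic close to 0 means that the two CRPS samples have similar distributions. A value close to 1 means that they barely overlap.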

CRPSS of ECMWF forecasts using as a reference the CRPS of SWG, for lead times of

We found that 80 %, 39 %, 50 %, and 40 % of the CRPS of SWG forecast are equal to zero for, respectively, Orly, Berlin, Madrid, and Toulouse, for a lead time of

This new ECMWF CRPSS evaluates the added value of the ECMWF forecast over the SWG forecast. We found that the ECMWF forecast shows no improvement over the SWG forecast for the different lead times, because the CRPSS values are negative. At

This confirms the relatively good skill of the SWG to forecast precipitation, compared with ECMWF. This could be explained by the difference in the average of the CRPS of the two forecasts. Indeed, as we mentioned before, the ECMWF forecast yields the best skill scores for small values of precipitation (

In this work, we have shown the performance of a stochastic weather generator (SWG) to simulate precipitation over different locations in western Europe and for various time scales from 5 to 20 d. The input of our model was analogs of geopotential heights at 500 hPa (

This study of precipitation forecast complements the work of

We evaluated the relation between the quality of the forecast and weather regimes over Europe. We found that low and high predictability were related to specific weather regimes. This dependence is more significant in winter than in summer. We found that good predictability is mainly related to blocking.

A comparison with the ECMWF forecast system over western Europe confirmed quantitatively and qualitatively the forecast skill of the SWG, for lead times of

This paper hence confirms the proof of concept to generate ensembles of (local) precipitation forecasts from analogs of circulation. The SWG ensemble forecast performance relies on the relation between precipitation and the synoptic atmospheric circulation, which is verified for western Europe. Transposing this SWG to other regions of the globe requires observations covering several decades. Numerical weather models are not subject to this constraint.

To avoid tedious redundancy, we deferred the figures that evaluate the forecast quality by weather regime to this appendix.

Relation between CRPS and weather regimes for Berlin

In order to justify the use of the

Maps of correlation between

We further explain the comparison between the ECMWF forecast and the SWG forecast. As mentioned, we found that the SWG shows an improvement over the ECMWF forecast. This is related to the difference in the time average of the CRPS of the two forecasts.
We computed the CRPSS as follows:

Average and median values of CRPS, average CRPSS (in bold) of the ECMWF and SWG forecasts for lead times of

Boxplots of CRPS of ECMWF and CRPS of SWG for Orly, with lead time

The code and data files are available at

MK performed the analyses. PY co-designed the analyses. CD and ST participated in the manuscript preparation.

The contact author has declared that neither they nor their co-authors have any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is part of the EU International Training Network (ITN) Climate Advanced Forecasting of subseasonal Extremes (CAFE). We thank Linus Magnusson and Florian Pappenberger for helpful discussions on the ECMWF data.

This work is part of the EU International Training Network (ITN) “Climate Advanced Forecasting of subseasonal Extremes” (CAFE). The project receives funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie Grant (agreement no. 813844).

This paper was edited by Chiel van Heerwaarden and reviewed by two anonymous referees.