Ensemble-size-dependence of deep-learning post-processing methods that minimize an (un)fair score: motivating examples and a proof-of-concept solution

Original Author: Christopher David Roberts

February 17, 2026

The article discusses the challenges of using the adjusted continuous ranked probability score (aCRPS) as a training loss for ensemble post-processing when the method introduces structural dependencies between members. It highlights two problematic approaches: linear member-by-member calibration and a deep-learning method that uses self-attention across members, both of which can improve aCRPS while producing unreliable, over-dispersive forecasts. The authors propose "trajectory transformers," an adaptation of the PoET framework that preserves conditional independence between members. Applied to weekly mean 2-meter temperature forecasts from the ECMWF subseasonal system, the method reduces systematic biases and improves reliability regardless of ensemble size (3 vs. 9 members in training; 9 vs. 100 in real time).

New Research Highlights Ensemble-Size Dependence in Deep Learning Post-Processing Methods

Recent findings show that the effectiveness of certain deep-learning post-processing methods for ensemble forecasts can depend strongly on ensemble size. The study focuses on fair scores, particularly the adjusted continuous ranked probability score (aCRPS), which is designed to evaluate ensemble forecasts without bias with respect to ensemble size. That property relies on the members behaving as conditionally independent draws from the forecast distribution.
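As a point of reference, a fair version of the CRPS for an ensemble of M members is commonly written as the mean absolute error of the members minus half the mean absolute difference between distinct member pairs. The sketch below assumes aCRPS takes this standard form; the function name `fair_crps` is illustrative and not taken from the study.

```python
def fair_crps(members, obs):
    """Fair CRPS for a single forecast case.

    members: list of float ensemble values; obs: the verifying observation.
    Assumed form: mean|x_i - y| - sum_ij|x_i - x_j| / (2 * M * (M - 1)).
    """
    m = len(members)
    if m < 2:
        raise ValueError("the fair CRPS needs at least two members")
    # Mean absolute error of the members against the observation.
    accuracy = sum(abs(x - obs) for x in members) / m
    # Mean absolute difference over distinct member pairs. Dividing by
    # m * (m - 1) instead of m * m is what removes the ensemble-size bias.
    spread = sum(abs(xi - xj) for xi in members for xj in members) / (m * (m - 1))
    return accuracy - 0.5 * spread
```

The correction term treats the members as independent samples, which is why coupling members during post-processing can undermine the score's fairness.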

The research investigates two approaches aimed at minimizing the expected aCRPS for finite ensembles:

Linear Member-by-Member Calibration: This method couples ensemble members through a shared dependency on the sample ensemble mean.
Deep Learning with Transformer Self-Attention: This technique links ensemble members using self-attention mechanisms across the ensemble dimension.

Both methods proved sensitive to ensemble size: apparent improvements in aCRPS can be misleading because they are often accompanied by systematic unreliability and over-dispersion in the forecasts.
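The coupling in the linear case can be illustrated with a minimal sketch. It assumes the common member-by-member form in which each member is shifted and rescaled around the sample ensemble mean; the coefficient names `a`, `b`, and `c` are hypothetical, and the study's exact formulation may differ.

```python
def member_by_member(members, a, b, c):
    """Return calibrated members a + b * mean + c * (x_i - mean).

    The mean is computed from the same finite sample, so every calibrated
    member depends on all of the raw members, not only on its own value.
    """
    mean = sum(members) / len(members)
    return [a + b * mean + c * (x - mean) for x in members]


raw = [1.0, 2.0, 3.0]
calibrated = member_by_member(raw, a=0.5, b=1.0, c=2.0)

# Changing only the last raw member also changes the first calibrated member,
# which shows that the outputs are no longer independent draws.
perturbed = member_by_member([1.0, 2.0, 6.0], a=0.5, b=1.0, c=2.0)
```

For independent members with variance σ², the deviation of a member from the sample mean has variance σ²(M − 1)/M, so a spread coefficient tuned on a small ensemble does not transfer unchanged to a larger one.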

Trajectory Transformers as a Solution

To address these issues, the researchers introduced trajectory transformers, a proof-of-concept adaptation of the Post-processing Ensembles with Transformers (PoET) framework. This approach applies self-attention across lead times while preserving the conditional independence between members that aCRPS requires.
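The idea of attending over lead times while leaving members uncoupled can be sketched with a toy example. This is not the PoET or trajectory-transformer architecture: it is a single-head attention over scalar values with hypothetical fixed weights `wq`, `wk`, and `wv`, intended only to show that each member's output depends on that member's own trajectory.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_lead_times(trajectory, wq=1.0, wk=1.0, wv=1.0):
    """Toy single-head self-attention along one member's lead-time axis."""
    queries = [wq * x for x in trajectory]
    keys = [wk * x for x in trajectory]
    values = [wv * x for x in trajectory]
    out = []
    for q in queries:
        # Attention weights of this lead time over all lead times.
        weights = softmax([q * k for k in keys])
        out.append(sum(w * v for w, v in zip(weights, values)))
    return out


def postprocess_ensemble(ensemble):
    """Apply the same map to each member separately (no cross-member mixing).

    ensemble: list of members, each a list of values indexed by lead time.
    """
    return [attend_over_lead_times(member) for member in ensemble]
```

Because the attention runs within each trajectory, adding, removing, or changing other members leaves a given member's output untouched, in contrast to attention across the ensemble dimension.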

Applied to weekly mean 2-meter temperature forecasts from the ECMWF subseasonal forecasting system, trajectory transformers reduced systematic model biases and improved forecast reliability, regardless of the ensemble size used in training.
