Out-of-sample

Out-of-sample evaluation measures performance on data that was not used to form or tune the forecasts. It helps detect overfitting and checks whether results generalize.

Definition

Out-of-sample evaluation means scoring forecasts on events that were not used to design, tune, or select the forecasting approach. It is the opposite of in-sample evaluation, which scores a method on the same data used to build it.

Why it matters

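A method can look good on past data simply by overfitting to it. Out-of-sample testing checks whether the method generalizes to events it has not seen. For forecasters, it also reduces the risk of hidden selection bias, such as reporting only the questions or periods where the method happened to do well.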

Practical approaches

• Holdout: keep a portion of questions unseen until scoring.

• Time split: train or calibrate on earlier periods, score on later periods.

• Rolling windows: repeatedly train or calibrate on a window of past data, score on the period that follows, then move the window forward.
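As a rough illustration of the time split and rolling-window approaches, the sketch below scores forecasts with the Brier score using only the Python standard library. The record layout (resolution date, forecast probability, 0/1 outcome), the sample values, and the function names `time_split` and `rolling_windows` are assumptions made for this example, not part of any particular tool.

```python
from datetime import date

# Hypothetical records: (resolution date, forecast probability, outcome as 0 or 1).
records = [
    (date(2023, 1, 15), 0.8, 1),
    (date(2023, 2, 10), 0.3, 0),
    (date(2023, 3, 5), 0.6, 1),
    (date(2023, 4, 20), 0.9, 0),
    (date(2023, 5, 12), 0.2, 0),
    (date(2023, 6, 30), 0.7, 1),
]


def brier(rows):
    """Mean squared difference between forecast probabilities and outcomes."""
    return sum((p - y) ** 2 for _, p, y in rows) / len(rows)


def time_split(rows, cutoff):
    """Rows resolved before the cutoff are in-sample; the rest are out-of-sample."""
    ordered = sorted(rows, key=lambda r: r[0])
    in_sample = [r for r in ordered if r[0] < cutoff]
    out_of_sample = [r for r in ordered if r[0] >= cutoff]
    return in_sample, out_of_sample


def rolling_windows(rows, train_size, test_size):
    """Yield (train, test) pairs; each test block follows its train block in time."""
    ordered = sorted(rows, key=lambda r: r[0])
    start = 0
    while start + train_size + test_size <= len(ordered):
        train = ordered[start:start + train_size]
        test = ordered[start + train_size:start + train_size + test_size]
        yield train, test
        start += test_size  # slide forward by one test block


in_sample, out_of_sample = time_split(records, date(2023, 4, 1))
print("in-sample Brier:", brier(in_sample))
print("out-of-sample Brier:", brier(out_of_sample))

for train, test in rolling_windows(records, train_size=2, test_size=2):
    print("test window starts", test[0][0], "Brier:", brier(test))
```

Any tuning or calibration would use only the in-sample or train rows; the out-of-sample rows are touched once, for scoring. A simple holdout works the same way, except the unseen portion is set aside by question rather than by date.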

Related

Out-of-sample evaluation is especially important when comparing to a benchmark or to market consensus, and when reporting a Brier skill score, because a comparison made in sample can overstate the improvement over the reference.
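For reference, the Brier skill score is commonly defined as 1 minus the ratio of the forecast's Brier score to a reference forecast's Brier score, both computed on the same questions. The sketch below assumes a simple reference: a constant base rate estimated from earlier (in-sample) outcomes and then applied to the out-of-sample questions. All data values and function names here are hypothetical.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def brier_skill_score(probs, ref_probs, outcomes):
    """1 - BS_forecast / BS_reference; positive means better than the reference."""
    bs_ref = brier_score(ref_probs, outcomes)
    if bs_ref == 0:
        raise ValueError("reference Brier score is zero; skill score is undefined")
    return 1.0 - brier_score(probs, outcomes) / bs_ref


# Hypothetical in-sample outcomes, used only to estimate the reference base rate.
train_outcomes = [1, 0, 1, 1]
base_rate = sum(train_outcomes) / len(train_outcomes)

# Hypothetical out-of-sample forecasts and their resolved outcomes.
test_probs = [0.9, 0.2, 0.7]
test_outcomes = [1, 0, 1]
ref_probs = [base_rate] * len(test_outcomes)

bss = brier_skill_score(test_probs, ref_probs, test_outcomes)
print("out-of-sample Brier skill score:", bss)
```

Estimating the base rate from the earlier outcomes keeps the reference out of sample as well, so neither side of the comparison has seen the questions being scored.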