Class Imbalance
Class imbalance means outcomes are heavily skewed toward one side (mostly yes or mostly no). It affects baselines, calibration, and score interpretation.
Definition
Class imbalance occurs when one outcome is much more common than the other. For binary events, this means most outcomes are 1 (yes) or most outcomes are 0 (no).
Why it matters
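With imbalance, naive forecasts can look strong. A forecaster who simply leans toward the dominant outcome on every question can earn a decent Brier score (lower is better) without real insight. For example, if 10% of questions resolve yes, forecasting 10% on every question yields a Brier score of 0.09, well below the 0.25 earned by always forecasting 50%.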
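The effect can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 100-question data set and the names brier_score, naive_base_rate, and coin_flip are assumptions made up for this example, not part of any particular library.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)


# Hypothetical imbalanced set: 100 binary questions, only 10 resolve yes (1).
outcomes = [1] * 10 + [0] * 90

# Naive forecaster: the same 10% forecast on every question, no question-level insight.
naive_base_rate = brier_score([0.10] * 100, outcomes)

# Uninformed forecaster: 50% on every question.
coin_flip = brier_score([0.50] * 100, outcomes)

print(f"constant 10% forecast: {naive_base_rate:.3f}")
print(f"constant 50% forecast: {coin_flip:.3f}")
```

The constant 10% forecast scores far better than the coin flip even though it carries no information about any individual question, which is why a raw Brier score on an imbalanced set says little by itself.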
How to handle it
• Compare against a base rate benchmark.
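• Use the Brier skill score, BSS = 1 - BS / BS_ref, to measure skill relative to that baseline. A value of 0 means no improvement over the baseline, and a negative value means the forecasts did worse.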
• Segment questions so you do not mix fundamentally different base rates.
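The three steps above can be sketched together as follows. This is a minimal illustration under stated assumptions: the reference forecast is taken to be the in-sample base rate of each segment (in practice the base rate may come from historical data instead), and the function names, segment labels, and records are hypothetical.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)


def brier_skill_score(forecasts, outcomes):
    """BSS = 1 - BS / BS_ref, with a constant base-rate forecast as the reference.

    Assumption: the base rate is estimated from the same outcomes being scored.
    """
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    if reference == 0:
        # Every outcome is identical, so the baseline is perfect and BSS is undefined.
        raise ValueError("Brier skill score is undefined when all outcomes are equal")
    return 1 - brier_score(forecasts, outcomes) / reference


def skill_by_segment(records):
    """Compute BSS separately per segment so different base rates are not mixed.

    records: iterable of (segment, forecast, outcome) tuples.
    """
    groups = defaultdict(lambda: ([], []))
    for segment, forecast, outcome in records:
        groups[segment][0].append(forecast)
        groups[segment][1].append(outcome)
    return {seg: brier_skill_score(fs, ys) for seg, (fs, ys) in groups.items()}


# Hypothetical records: segment "a" is forecast perfectly, segment "b" with constant 50%.
records = [
    ("a", 1.0, 1), ("a", 0.0, 0),
    ("b", 0.5, 1), ("b", 0.5, 0),
]
skill = skill_by_segment(records)  # {"a": 1.0, "b": 0.0}
```

Scoring each segment against its own base rate keeps a forecaster from being credited for skill that is really just the difference in base rates between segments.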
Related
Class imbalance is tightly linked to base rate, base rate shift, and calibration diagnostics.