April 21, 2026 · 26 pages · 511 KB
Moneyball for LLMs
Behavioral fingerprinting of frontier AI models in football talent evaluation and event prediction
Four frontier AI models tested across 12 talent dimensions and ~45,000 trials. The report documents a League Prestige Discount (Cohen’s h = 1.18–1.41) that appears unanimously across all models, along with demographic evaluation inconsistency that carries EU AI Act compliance implications.
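For readers unfamiliar with the effect-size measure quoted above, Cohen's h compares two proportions on an arcsine-transformed scale; values above roughly 0.8 are conventionally called large. The sketch below shows only the standard definition. The function name and the example proportions are illustrative assumptions and are not taken from the report.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: effect size for the difference between two proportions.

    h = 2*asin(sqrt(p1)) - 2*asin(sqrt(p2))
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie in [0, 1]")
    return 2.0 * math.asin(math.sqrt(p1)) - 2.0 * math.asin(math.sqrt(p2))


if __name__ == "__main__":
    # Hypothetical proportions, chosen only to illustrate the scale of h:
    # e.g. a favourable rating given 80% of the time in one condition
    # versus 25% of the time in another.
    print(round(cohens_h(0.80, 0.25), 3))
```

Because the measure is signed, swapping the two proportions flips the sign; reports often quote the absolute value.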
April 27, 2026 · 18 pages · 353 KB
How can you improve the predictive power of LLMs in sports?
Two mechanisms for improving LLM football match predictions
979 matches across 18 leagues. A three-bias formula predicts model accuracy with r = 0.997 before any prediction is made. Bias-derived calibration improves Brier score by 4.6–7.3% per model.
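As background for the Brier score figure above, the following is a minimal sketch of how a multiclass Brier score and a relative improvement could be computed for three-way match outcomes (home win, draw, away win). The report's exact Brier variant and calibration procedure are not stated here, so the sum-over-outcomes form, the function names, and the sample numbers are all assumptions for illustration.

```python
from typing import Sequence


def brier_score(probs: Sequence[Sequence[float]], outcomes: Sequence[int]) -> float:
    """Mean multiclass Brier score (lower is better).

    probs[i][j] is the predicted probability of outcome j for match i;
    outcomes[i] is the index of the outcome that actually occurred.
    Assumed variant: per match, sum over outcomes of (p - o)^2.
    """
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    total = 0.0
    for p, actual in zip(probs, outcomes):
        total += sum(
            (pj - (1.0 if j == actual else 0.0)) ** 2 for j, pj in enumerate(p)
        )
    return total / len(probs)


def relative_improvement(before: float, after: float) -> float:
    """Fractional reduction in Brier score, e.g. 0.05 means 5% better."""
    if before <= 0.0:
        raise ValueError("baseline score must be positive")
    return (before - after) / before


if __name__ == "__main__":
    # Hypothetical predictions for two matches: [home, draw, away].
    raw = [[0.70, 0.20, 0.10], [0.30, 0.30, 0.40]]
    adjusted = [[0.60, 0.25, 0.15], [0.25, 0.30, 0.45]]
    actual = [0, 2]  # home win, then away win
    before = brier_score(raw, actual)
    after = brier_score(adjusted, actual)
    print(before, after, relative_improvement(before, after))
```

A relative improvement of 4.6–7.3% in this sense would mean the calibrated score is that fraction lower than the uncalibrated one.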