Methodology

How the model works

The core

A time-weighted Dixon-Coles Poisson model. Every international result is down-weighted by age, and the model is then fit to give each nation an attack strength and a defence strength, plus a global home-field term. Those strengths turn any fixture into a full distribution over scorelines: not just a winner, but the probability of every 2–1, 0–0, and so on.
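As a rough illustration of how strengths become a scoreline distribution, here is a minimal sketch. The log-linear parameterisation, the decay rate xi, the low-score correlation rho and all function names are assumptions made for the example, not the values or code used on this site.

```python
import math

def time_weight(age_days, xi=0.002):
    """Exponential down-weighting of a result by age (xi is an assumed decay rate)."""
    return math.exp(-xi * age_days)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def dc_tau(x, y, lam, mu, rho):
    """Dixon-Coles correction applied to the four lowest scorelines."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0

def scoreline_matrix(attack_h, defence_h, attack_a, defence_a,
                     home_adv=0.0, rho=-0.1, max_goals=10):
    """grid[x][y] = P(home scores x, away scores y), truncated at max_goals."""
    # Assumed parameterisation: expected goals are log-linear in the strengths.
    lam = math.exp(attack_h - defence_a + home_adv)  # home expected goals
    mu = math.exp(attack_a - defence_h)              # away expected goals
    grid = [[dc_tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)
             for y in range(max_goals + 1)]
            for x in range(max_goals + 1)]
    total = sum(map(sum, grid))  # renormalise after truncating the grid
    return [[p / total for p in row] for row in grid]

def outcome_probs(grid):
    """Collapse the scoreline grid into (home win, draw, away win)."""
    home = sum(p for x, row in enumerate(grid) for y, p in enumerate(row) if x > y)
    draw = sum(grid[i][i] for i in range(len(grid)))
    return home, draw, 1.0 - home - draw
```

In a fit, each past match's log-likelihood would be multiplied by time_weight(age_days), so that older results pull less on the ratings.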

The gate decides

A covariate only enters the model after it beats climatology on held-out Ranked Probability Score, backtested on the 2018 and 2022 World Cups. Ingesting data is always fine; modelling with it is earned. Two filters decide eligibility: the feature must be point-in-time (snapshotted before each backtest tournament) and must carry signal beyond the rating. Anything the time decay has already priced in (form, rest) dies here.
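For readers unfamiliar with the metric, the sketch below computes the Ranked Probability Score for an ordered three-way forecast (home win, draw, away win) and compares a candidate against a baseline. Lower is better. The function names and the min_gain threshold are hypothetical; the actual gate may apply further conditions.

```python
def rps(probs, outcome_index):
    """Ranked Probability Score for one match.

    probs: forecast over ordered outcomes, e.g. (home, draw, away)
    outcome_index: index of the outcome that actually happened
    """
    cum_p = cum_o = total = 0.0
    for i in range(len(probs) - 1):
        cum_p += probs[i]
        cum_o += 1.0 if i == outcome_index else 0.0
        total += (cum_p - cum_o) ** 2
    return total / (len(probs) - 1)

def mean_rps(forecasts, outcomes):
    return sum(rps(p, o) for p, o in zip(forecasts, outcomes)) / len(outcomes)

def passes_gate(candidate, baseline, outcomes, min_gain=0.0):
    """True if the candidate's held-out mean RPS beats the baseline's (lower wins)."""
    return mean_rps(baseline, outcomes) - mean_rps(candidate, outcomes) > min_gain
```

A climatology baseline in this setting would be the same base-rate forecast repeated for every held-out match.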

What survived and ships live: a host edge, a squad-talent term (the “France blind spot” — a roster more talented than recent results show), and travel fatigue over the 2026 venue map. Altitude, form and rest were tried and dropped because they failed the gate.

Why not heat?

A 2026 cup across Texas, the Gulf coast and Mexico invites a heat term, so we tested one — twice. A symmetric version (hot venue suppresses both teams’ scoring) and a differential-acclimatization version (teams from hot climates suffer less) were both backtested on the two hottest cups on record, USA ’94 and Brazil ’14. Both failed the gate: the symmetric term moved the score by essentially zero and carried the wrong sign, and the differential term lost outright.

The sports-science literature explains why. Heat reliably degrades physical output — total distance, sprint count — but players pace themselves to keep the things that decide matches (passing accuracy, peak speed, goals) roughly flat. A systematic review of 21 real-match studies finds environmental factors hit physical performance far more than technical performance (Illmer & Daumann, 2022); a dedicated weather-and-technical-actions study reaches the same conclusion, noting both teams share identical conditions (Zhong et al., 2024). Heat changes how a match is played, not who wins — so it ships as a display-only context badge, not a model input.

From a match to a tournament

The scoreline engine feeds a Monte Carlo simulation of the full 48-team tournament, run 20,000 times. Group games that have already been played are pinned to their real results rather than re-simulated, so the title and advancement odds sharpen as the cup unfolds. Each number on the board is simply the share of those simulations in which a team reaches that stage.
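The pinning idea can be sketched for a single four-team group. This is a deliberately simplified example: it samples win/draw/loss rather than full scorelines, breaks ties at random instead of by goal difference, and counts only a top-two finish. All names and data structures are hypothetical.

```python
import random
from collections import Counter

def simulate_group(teams, fixtures, probs, pinned, n_sims=20000, seed=0):
    """Share of simulations in which each team finishes in the group's top two.

    fixtures: list of (home, away) pairs
    probs:    {(home, away): (p_home, p_draw, p_away)} for unplayed games
    pinned:   {(home, away): "H" | "D" | "A"} for games already played
    """
    rng = random.Random(seed)
    advanced = Counter()
    for _ in range(n_sims):
        points = {t: 0 for t in teams}
        for home, away in fixtures:
            result = pinned.get((home, away))
            if result is None:
                # Only unplayed games are sampled; played games keep their real result.
                result = rng.choices("HDA", weights=probs[(home, away)])[0]
            if result == "H":
                points[home] += 3
            elif result == "A":
                points[away] += 3
            else:
                points[home] += 1
                points[away] += 1
        # Random tie-break stands in for goal difference and other criteria.
        ranking = sorted(teams, key=lambda t: (points[t], rng.random()), reverse=True)
        for t in ranking[:2]:
            advanced[t] += 1
    return {t: advanced[t] / n_sims for t in teams}
```

As more fixtures move from probs to pinned, fewer games are left to chance and the shares move toward 0 or 1.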

The scoreline pick

The per-fixture PICK is not always the single most likely score. It maximises expected points under 4/3/2 prediction-pool scoring (4 for the exact score, 3 for the correct goal difference, 2 for the correct tendency), so it leans toward central scores that bank partial credit across tiers. When it diverges from the modal score, both are shown.
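A small sketch shows why the pick and the modal score can differ. It assumes a scoreline grid where grid[x][y] is the probability of a home x, away y result, and it counts a draw with the wrong score as a goal-difference hit; some pools score that case as tendency only. Function names and the max_pick cap are hypothetical.

```python
def pool_points(pred, actual, exact=4, goal_diff=3, tendency=2):
    """Points for one prediction under tiered pool scoring."""
    (ph, pa), (ah, aa) = pred, actual
    if (ph, pa) == (ah, aa):
        return exact
    if ph - pa == ah - aa:  # assumption: wrong-score draws land here
        return goal_diff
    if (ph > pa) - (ph < pa) == (ah > aa) - (ah < aa):
        return tendency
    return 0

def best_pick(grid, max_pick=5):
    """Scoreline with the highest expected pool points, and that expectation."""
    best, best_ev = None, -1.0
    for ph in range(max_pick + 1):
        for pa in range(max_pick + 1):
            ev = sum(p * pool_points((ph, pa), (x, y))
                     for x, row in enumerate(grid)
                     for y, p in enumerate(row))
            if ev > best_ev:
                best, best_ev = (ph, pa), ev
    return best, best_ev

def modal_score(grid):
    """Single most likely scoreline."""
    cells = ((x, y) for x in range(len(grid)) for y in range(len(grid[0])))
    return max(cells, key=lambda s: grid[s[0]][s[1]])

# Toy distribution: 0-0 is the likeliest single score, but home wins dominate overall.
toy = [[0.30, 0.00, 0.00],
       [0.26, 0.00, 0.00],
       [0.20, 0.24, 0.00]]
```

On the toy grid the modal score is 0–0, while 1–0 collects more expected points because it also earns partial credit whenever the match ends 2–1 or 2–0.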

Refresh

A daily job re-ingests the international-results feed, re-fits the ratings, pins newly played group games, and regenerates these snapshots. Real results dominate any covariate, so this loop is the single largest accuracy gain between now and the final.

Generated 2026-06-03. Features: host ×0.33 + squad talent (FC26) + 2026 travel fatigue. This is a model, not betting advice.