Eight analysts. Every match. Called before kickoff.
Three AI systems built three different ways, their two stripped-down twins, a statistician, the betting market, and one human each forecast all 104 matches of the 2026 World Cup. In public, with every miss on the record.
Live scoreboard
| Analyst | Type | Brier | ECE | Picks / skipped | Hit rate | Cost | Calls |
|---|---|---|---|---|---|---|---|
The Brier score and ECE (expected calibration error) are the two main scores. Both measure whether the confidence numbers can be trusted. Lower is better. Some analysts skip a match instead of guessing, so picks and skips are shown to keep comparisons fair. Hit rate is fun context, not the headline. Numbers appear after the first scored matches.
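For readers who want to see the arithmetic, here is a minimal sketch of a Brier score over locked picks, with skipped matches counted separately. It assumes each pick carries one confidence number for "my pick wins", which is how the page describes predictions. The tuple layout and function names are hypothetical and are not taken from the project's published rules.

```python
from typing import Optional, Sequence, Tuple

# Each forecast is (confidence, hit). A skipped match is (None, None).
# This layout is an assumption made for the example only.
Forecast = Tuple[Optional[float], Optional[bool]]


def brier_score(forecasts: Sequence[Forecast]) -> Optional[float]:
    """Mean squared gap between stated confidence and outcome (1 = pick won)."""
    scored = [(c, h) for c, h in forecasts if c is not None and h is not None]
    if not scored:
        return None  # nothing scored yet
    total = sum((c - (1.0 if h else 0.0)) ** 2 for c, h in scored)
    return total / len(scored)


def picks_and_skips(forecasts: Sequence[Forecast]) -> Tuple[int, int]:
    """Count how many matches were called and how many were skipped."""
    picks = sum(1 for c, _ in forecasts if c is not None)
    return picks, len(forecasts) - picks


# Three picks and one skip.
example = [(0.7, True), (0.7, False), (None, None), (0.9, True)]
score = brier_score(example)
counts = picks_and_skips(example)
```

A 70 percent pick that wins costs 0.09 and one that loses costs 0.49, so confident misses are punished hardest. Skips are left out of the average, which is why the scoreboard shows them next to the score.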
How it works
About an hour before kickoff, every AI analyst gets the same match dossier and locks one prediction: winner, score, and how confident it is. Predictions are never revised.
Picks lock. In that same final hour we record what the betting market believes, the hardest benchmark in sports. The human enters his pick on instinct alone, any time before kickoff.
Every analyst is scored. Not just right or wrong, but whether saying 70 percent confident really means winning 70 percent of the time. Misses are published, never buried.
- Rules locked and published before the tournament
- Scored by three independent AIs plus a human judge
- AI models frozen for the whole tournament, no mid-game upgrades
- Full data published August 1
Calibration over accuracy.
When an analyst says 70 percent, does that pick win 70 percent of the time? That is what this chart answers (statisticians call it a reliability curve). Updated after every completed match. It appears after the first scored matchday.
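As a rough illustration, the points on a reliability curve can be computed by grouping picks into confidence bins and comparing each bin's average stated confidence with its actual hit rate. ECE is then the size-weighted average of those gaps. The sketch below assumes ten equal-width bins and hypothetical function names; the binning this project actually uses is not stated here.

```python
from typing import List, Sequence, Tuple

# One curve point: (mean stated confidence, observed hit rate, number of picks).
Point = Tuple[float, float, int]


def reliability_curve(
    confidences: Sequence[float], hits: Sequence[bool], n_bins: int = 10
) -> List[Point]:
    """Group picks into equal-width confidence bins and summarize each bin."""
    if len(confidences) != len(hits):
        raise ValueError("confidences and hits must have the same length")
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for c, h in zip(confidences, hits):
        idx = min(int(c * n_bins), n_bins - 1)  # a confidence of 1.0 goes in the top bin
        bins[idx].append((c, h))
    points: List[Point] = []
    for b in bins:
        if not b:
            continue  # empty bins contribute no point
        mean_conf = sum(c for c, _ in b) / len(b)
        hit_rate = sum(1 for _, h in b if h) / len(b)
        points.append((mean_conf, hit_rate, len(b)))
    return points


def expected_calibration_error(points: Sequence[Point]) -> float:
    """Size-weighted average gap between stated confidence and hit rate."""
    total = sum(n for _, _, n in points)
    if total == 0:
        return 0.0
    return sum(n / total * abs(conf - rate) for conf, rate, n in points)


confs = [0.55, 0.55, 0.75, 0.75, 0.75, 0.75]
wins = [True, False, True, True, True, False]
curve = reliability_curve(confs, wins)
ece = expected_calibration_error(curve)
```

A perfectly calibrated analyst would put every point on the diagonal, where stated confidence equals hit rate, and would have an ECE of zero.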
Solo, Pipeline, Council. The same three AI systems we put to work inside real businesses.
When an AI says it is 70 percent sure, calibration tells you whether to trust it. That number decides what you can delegate. Football is just the test track with a scoreboard nobody can argue with.