2015 World Cup

The World Cup is a 128-player knockout tournament held in September 2015. It is part of the World Championship cycle: the top two finishers earn spots in the eight-player 2016 Candidates Tournament, whose winner will go on to face Magnus Carlsen in the 2016 World Championship match. Each round consists of two classical games. If the match is tied, the players play two rapid games; if still tied, two faster rapid games; if still tied, two blitz games; and if still tied, one final Armageddon game determines which player advances to the next round. Additional factual details on the event, structure, and field can be found in the excellent Wikipedia article on the event. At this time, my player/seed/rating list comes from that page.

The complete field of 128 qualifiers became official on August 14th, so we can begin to project the results and estimate various players’ chances of reaching the finals, or of winning the entire event. These projections are based on the players’ ratings on the August rating list, and assume that all 128 players will compete and be seeded in this order. Of course that will shift before the event actually begins: our final projections will use live ratings, the final seed order will be determined by the as-yet unpublished September rating list, and in a field of this size it’s relatively likely that a few players will be unable to make it to the event and will be replaced by alternates. Nevertheless, enough information is now available that we can run it through our model and give you some early predictions! For details on the methodology, scroll to the bottom, below all the listed odds.

9/29/2015 Update:

We have reached the finals, and all our previously published odds on who will reach this last round are obsolete. The finalists are Sergey Karjakin and Peter Svidler. Based on his 23-point edge in the live ratings, we estimate that in the 4-game classical match (a new format for the final round), Karjakin should win the tournament 59.9% of the time.

Methodology:

These odds are entirely probabilistic. For other tournaments, too many possible results exist, and we are forced to resort to Monte Carlo simulations to estimate the odds. The single elimination structure of this event instead allows us to calculate the odds directly, so they are as precise as our underlying assumptions allow. We developed a table that looks at the rating differential between the two players and estimates each player’s odds of winning the mini-match and advancing to the next round. In later rounds, for any given player, we look at their odds of reaching that round in the first place, all the possible opponents they could face if they get there, the relative odds of facing each of those opponents, and the odds they would defeat each of those opponents in a potential matchup. Ultimately, this means that if our underlying estimate of match odds were completely accurate (of course it isn’t, and can’t be), then all the other extrapolated percentages would be perfectly accurate probabilities as well. We don’t have the “not enough simulations” error source that most of our other predictions have to contend with.
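As a rough illustration of that round-by-round calculation, here is a minimal Python sketch. It is not the code or spreadsheet behind the published numbers. The function names, the bracket-order input, and especially `logistic_win_prob` (the textbook Elo curve, used only as a placeholder for our match odds table) are assumptions made for the example.

```python
from typing import Callable, List


def logistic_win_prob(rating_diff: float) -> float:
    """Placeholder match odds (an assumption): the textbook Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def round_reach_probs(ratings: List[float],
                      win_prob: Callable[[float], float]) -> List[List[float]]:
    """Return probs[r][i]: the chance that bracket slot i wins r matches.

    `ratings` must be in bracket order (slots 0 and 1 meet in round one, the
    winner meets the winner of slots 2 and 3, and so on), and its length must
    be a power of two. `win_prob(d)` is the chance that a player with rating
    edge d wins a match, with win_prob(d) + win_prob(-d) == 1.
    """
    n = len(ratings)
    probs = [[1.0] * n]  # every player has "won zero matches" with certainty
    block = 1            # size of the sub-bracket each player has come through
    while block < n:
        prev = probs[-1]
        cur = [0.0] * n
        for i in range(n):
            # Possible opponents sit in the neighbouring sub-bracket of equal size.
            start = ((i // block) ^ 1) * block
            for j in range(start, start + block):
                # P(i got here) * P(j got here) * P(i beats j)
                cur[i] += prev[i] * prev[j] * win_prob(ratings[i] - ratings[j])
        probs.append(cur)
        block *= 2
    return probs
```

With 128 ratings in bracket order, `probs[6][i]` would be slot i’s chance of reaching the final and `probs[7][i]` its chance of winning the event. No simulation is involved, so the only error comes from whatever `win_prob` gets wrong.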

What match odds do we assume? And why aren’t they “completely accurate”? Well, our odds are based exclusively on players’ standard ratings. Even though FIDE publishes rapid and blitz ratings, and those time controls will come into play frequently throughout the tournament, we ignore them. This is by design: although some players have played a lot of rated games at the faster time controls, many others have not. In the long run, across the entire field, we suspect that standard ratings (which almost always have a strong sample size for active players, and correlate well with “chess ability”, which applies at any time control) probably have more predictive value than the often higher variance rapid and blitz ratings. Rapid ratings in particular are suspect. Rated blitz tournaments (both on their own, and as preliminary events before classical tournaments) have gotten popular, and many of the blitz ratings out there do have a good sample size. However, most games at this time remain either blitz or classical, and rated rapid games remain few and far between. In the World Cup tiebreak procedure, two pairs of rapid games are played before any blitz games would occur, so blitz ratings may be accurate but would have minimal impact on predictions. Rapid ratings are more important for predictions, but less trustworthy due to the small samples.

That being said, choosing to ignore the rapid and blitz ratings does mean we’re intentionally ignoring some meaningful data. We really can say with confidence, for example, that Fabiano Caruana is a weaker blitz player than his standard rating would suggest. In a perfect world, we would perform a Bayesian analysis and estimate the rapid and blitz playing strengths of every player in the field with a weighted average of their rapid/blitz ratings and their standard ratings. The weights would depend on the sample sizes: players with lots of rated blitz games under their belt would pretty much just use their blitz rating, while players with few or none would have most or all of the weight applied to their standard rating. We don’t live in a perfect world, though, and it would be a major, time-consuming project to build a model in that structure that actually had merit. So instead we’re using standard ratings only, and basing everything off of rating differential. It’s probably close enough.
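To make the “perfect world” idea concrete, here is a minimal sketch of a sample-size-weighted blend. To be clear, this is not something our model does, and it is a simple shrinkage formula rather than a full Bayesian analysis. The function name and the `prior_games` constant are hypothetical, and a sensible value for that constant would have to be fitted from data.

```python
def blended_rating(standard: float, fast: float, fast_games: int,
                   prior_games: float = 50.0) -> float:
    """Blend a standard rating with a rapid or blitz rating.

    `prior_games` (a hypothetical tuning constant) is the number of rated fast
    games at which the fast rating and the standard rating get equal weight.
    """
    weight = fast_games / (fast_games + prior_games)
    # Few fast games: lean on the standard rating. Many: lean on the fast rating.
    return weight * fast + (1.0 - weight) * standard
```

A player with no rated blitz games keeps their standard rating unchanged, while a player with hundreds of rated blitz games ends up very close to their blitz rating.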

Other sources of error in the analysis include our estimation of draw rates. We assume the draw rate in any given game is always the same, provided the time control and rating differential are the same. While we do of course assume reduced draw rates as the rating differential increases, we don’t adjust for specific players’ tendencies or for match conditions. We also assume (for computational simplicity) that every game is fully independent. In reality, the odds in game two almost certainly vary based on the result of game one. If game one was decisive, the winner can “play for a draw” to advance while the loser must “play for a win” to stay alive, so the odds of the various results are very likely different than they would be if both players “just played normally”. Exactly how the odds shift (are draws more likely? less likely? is the player “pressing for a win” going to lose more often than normal?) is complex, and again would require a detailed study to model accurately. Instead, we’re choosing to go with the simplistic assumption, and accepting that our probabilities aren’t “perfect”. Here is a graph of the odds we’re using:

[Graph: CupMatchOdds]
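For readers who want to see how the independence assumption turns per-game odds into match odds, here is a minimal Python sketch. The per-game win and draw probabilities are inputs that you would have to supply yourself (we have not published ours), the function names are invented for the example, and ignoring which player has White in which game is an extra simplification.

```python
from typing import List, Tuple


def two_game_stage(p_win: float, p_draw: float) -> Tuple[float, float, float]:
    """Outcome of one two-game stage, treating both games as independent.

    p_win and p_draw are player A's per-game win and draw probabilities.
    Returns (P(A wins the stage), P(stage is tied), P(B wins the stage)).
    """
    p_loss = 1.0 - p_win - p_draw
    a_wins = p_win * p_win + 2.0 * p_win * p_draw     # 2-0 or 1.5-0.5
    b_wins = p_loss * p_loss + 2.0 * p_loss * p_draw  # 0-2 or 0.5-1.5
    tied = p_draw * p_draw + 2.0 * p_win * p_loss     # 1-1
    return a_wins, tied, b_wins


def advance_probability(stages: List[Tuple[float, float]],
                        p_armageddon: float) -> float:
    """Chance that player A advances from a World Cup style mini-match.

    `stages` lists (p_win, p_draw) for each two-game stage in order
    (classical, rapid, faster rapid, blitz). `p_armageddon` is A's chance
    of advancing from the final Armageddon game.
    """
    advance = 0.0
    still_tied = 1.0
    for p_win, p_draw in stages:
        a_wins, tied, _ = two_game_stage(p_win, p_draw)
        advance += still_tied * a_wins  # A clinches at this stage
        still_tied *= tied              # otherwise move on to the next stage
    return advance + still_tied * p_armageddon
```

Modeling the “must win” effect in game two would mean replacing the single `(p_win, p_draw)` pair per stage with probabilities that depend on the result of game one, which is exactly the detailed study we have not done.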

And here they are in a data table:

Elo Difference    Odds of Advancing
650 99.91%
600 99.90%
550 99.90%
500 99.86%
450 99.7%
400 99.4%
350 98.6%
300 97.4%
275 96.4%
250 95.3%
225 93.8%
200 91.8%
180 89.9%
160 87.8%
140 85.2%
120 82.1%
100 78.5%
90 76.6%
80 74.5%
70 72.2%
60 69.8%
50 67.1%
45 65.7%
40 64.2%
35 62.5%
30 60.8%
25 59.0%
20 57.2%
19 56.9%
18 56.5%
17 56.2%
16 55.8%
15 55.4%
14 55.1%
13 54.7%
12 54.4%
11 54.0%
10 53.6%
9 53.3%
8 52.9%
7 52.5%
6 52.2%
5 51.8%
4 51.5%
3 51.1%
2 50.7%
1 50.4%
0 50.0%
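If you want to use this table for a rating gap that is not listed, one simple approach is linear interpolation between the neighbouring rows. The sketch below copies the values from the table above. The interpolation itself, the function name, and the choice to cap gaps beyond 650 at the last row are assumptions for the example, not a description of how the underlying table was built.

```python
from bisect import bisect_right

# (Elo difference, higher-rated player's odds of advancing in %), from the table.
ODDS_TABLE = [
    (0, 50.0), (1, 50.4), (2, 50.7), (3, 51.1), (4, 51.5), (5, 51.8),
    (6, 52.2), (7, 52.5), (8, 52.9), (9, 53.3), (10, 53.6), (11, 54.0),
    (12, 54.4), (13, 54.7), (14, 55.1), (15, 55.4), (16, 55.8), (17, 56.2),
    (18, 56.5), (19, 56.9), (20, 57.2), (25, 59.0), (30, 60.8), (35, 62.5),
    (40, 64.2), (45, 65.7), (50, 67.1), (60, 69.8), (70, 72.2), (80, 74.5),
    (90, 76.6), (100, 78.5), (120, 82.1), (140, 85.2), (160, 87.8),
    (180, 89.9), (200, 91.8), (225, 93.8), (250, 95.3), (275, 96.4),
    (300, 97.4), (350, 98.6), (400, 99.4), (450, 99.7), (500, 99.86),
    (550, 99.90), (600, 99.90), (650, 99.91),
]
DIFFS = [d for d, _ in ODDS_TABLE]


def advance_odds(rating_diff: float) -> float:
    """Odds (0 to 1) of advancing for a player with the given rating edge."""
    if rating_diff < 0:
        # Exactly one player advances, so the underdog gets the complement.
        return 1.0 - advance_odds(-rating_diff)
    if rating_diff >= DIFFS[-1]:
        return ODDS_TABLE[-1][1] / 100.0  # cap at the last row (assumption)
    hi = bisect_right(DIFFS, rating_diff)
    d0, p0 = ODDS_TABLE[hi - 1]
    d1, p1 = ODDS_TABLE[hi]
    frac = (rating_diff - d0) / (d1 - d0)
    return (p0 + frac * (p1 - p0)) / 100.0
```

For example, `advance_odds(23)` lands between the 20 and 25 rows at about 0.583. Keep in mind that the table describes the standard two-game mini-match with tiebreaks, so it is not expected to match odds quoted for the 4-game final.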

10 thoughts on “2015 World Cup”

  1. Interesting stuff as always! Looks like Jakovenko has pretty good odds for a spot in the Candidates, given that he can add Nakamura’s and Caruana’s chances to his own. Also, if either Topalov or Giri reaches the final, that would vacate a rating spot in the Candidates for someone else (Grischuk/Kramnik/So/Aronian).

    • Yeah, after the Sinquefield Cup I’ll drill down in more detail on what these numbers really mean in the context of the people who are favorites at the World Cup but have also already qualified in other ways.

  2. Why does Kramnik have a recorded probability of 11.0% to reach finals while Ding only has 10.6%, despite Ding being higher rated?

    • It’s all Levon Aronian’s fault. He went and won the Sinquefield Cup, and gained 19 rating points, after the World Cup pairings were determined, and he’s in Ding’s half of the bracket.

      If Ding reaches the third round, there is a 59% chance he’ll face Aronian (in an 8/9 matchup by seeding), and so Ding’s odds of making it to the fourth round are lower than Kramnik’s despite the higher rating. That carries through all the way to the end.

      If Aronian were still rated 2765 (keeping everything else the same) then Ding would have an 11.5% chance to reach the finals, better than Kramnik’s 11.0%, and exactly what you expect to see for a slightly higher rated player.

      With Aronian at 2784.1 though, Ding’s odds of reaching the final drop to 10.5%.

  3. Question: how are you getting the live ratings of players not in the Top 100, such as Guseinov, Lu, and Kovalyov? Those aren’t displayed by 2700chess, after all…
