Simulating The Grand Prix – Methodology

On our main page for the 2014-15 Grand Prix, you will notice that we list each player’s odds of finishing in the top two of the final Grand Prix standings – an important mark because those top two players earn berths in the 2016 Candidates Match, with a chance at the World Championship. If you’ve seen this, perhaps you have wondered: where did those numbers come from? How are they calculated, and how accurate are they?

So here I will discuss our methodology. The overview is relatively simple, so I’ll go over that first and dive deeper into the details (which may be less interesting to some readers) afterward. We have built a spreadsheet that estimates the odds of a white win, a black win, or a draw for all 132 Grand Prix games left to play (66 in Tbilisi, 66 in Khanty-Mansiysk). Further, it can use those odds to randomly generate a result for each game, AND correctly calculate how the Grand Prix points would be awarded given those results, and who would therefore be the top two finishers.

Our simulation simply re-runs the randomizer a large number of times, recording the top two finishers each time, and spits out a result for each player: what percentage of the overall simulations had that player in the top two? Those are the odds we show you. How accurate are they? Not perfect, of course. The main source of error is that the individual game odds cannot be perfect. If you care about the details, read on. Otherwise, suffice it to say that our estimates are probably the best available, and if you’re interested in knowing who has the best chance of reaching the Candidates Match, you’re welcome to follow along with us! We will post regular updates throughout the upcoming Tbilisi event, and will show both the “current” odds and the “pre-Tbilisi” odds so that you can see exactly how much individual players have benefited or suffered from the results up to that point.
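For readers who think in code, here is a minimal Python sketch of the loop described above. It is an illustration, not the actual spreadsheet: the function name `top_two_odds`, the sample players, and the probabilities are all made up. It also makes one big simplifying assumption, ranking players by raw game points (ties broken at random) instead of applying the real Grand Prix points rules.

```python
import random
from collections import Counter


def top_two_odds(players, games, n_sims=10_000, seed=None):
    """Estimate each player's chance of finishing in the top two.

    games is a list of (white, black, p_white_win, p_draw) tuples.
    ASSUMPTION: ranking uses raw game points with random tie-breaks,
    not the Grand Prix points system used in the article.
    """
    rng = random.Random(seed)
    top_two = Counter()
    for _ in range(n_sims):
        score = {p: 0.0 for p in players}
        # Draw one random result per game from its win/draw/loss odds.
        for white, black, p_win, p_draw in games:
            r = rng.random()
            if r < p_win:
                score[white] += 1.0
            elif r < p_win + p_draw:
                score[white] += 0.5
                score[black] += 0.5
            else:
                score[black] += 1.0
        # Rank by score; the random second key breaks ties arbitrarily.
        ranked = sorted(players, key=lambda p: (score[p], rng.random()),
                        reverse=True)
        top_two.update(ranked[:2])
    # Share of simulations in which each player finished in the top two.
    return {p: top_two[p] / n_sims for p in players}


if __name__ == "__main__":
    # Hypothetical three-player example with invented probabilities.
    demo_players = ["A", "B", "C"]
    demo_games = [("A", "B", 0.25, 0.55), ("B", "C", 0.20, 0.60),
                  ("C", "A", 0.20, 0.55)]
    print(top_two_odds(demo_players, demo_games, n_sims=2000, seed=1))
```

Because every simulation puts exactly two players in the top two, the reported odds always add up to 200% across the field, which is a handy sanity check on the published table.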

That’s a lot of text, so before we continue with the nitty gritty, here are our “pre-Tbilisi” projections, in case you aren’t interested in clicking through to a different page to see them:

Player ODDS (PRE-TBILISI)
 Fabiano Caruana (ITA) 58%
 Alexander Grischuk (RUS) 40%
 Hikaru Nakamura (USA) 39%
 Maxime Vachier-Lagrave (FRA) 15%
 Anish Giri (NED) 14%
 Dmitry Andreikin (RUS) 10%
 Shakhriyar Mamedyarov (AZE) 6%
 Boris Gelfand (ISR) 6%
 Sergey Karjakin (RUS) 5%
 Peter Svidler (RUS) 5%
 Evgeny Tomashevsky (RUS) 2%
 Dmitry Jakovenko (RUS) 1%
 Baadur Jobava (GEO) 0%
 Leinier Dominguez (CUB) 0%
 Teimour Radjabov (AZE) 0%
 Rustam Kasimdzhanov (UZB) 0%

So first of all, our estimates of the win/draw/loss odds for each game are based on Elo expectation, using current live ratings from 2700chess.com (one of the greatest things on the entire internet; if you’re interested enough in chess analysis to have read this far, and for some reason you don’t already have 2700chess bookmarked, go bookmark it now. We’ll wait.) Now this isn’t perfect. It works pretty well for players who are rated accurately, but that can never be everyone. The Elo system is pretty solid overall, but it will always contain some underrated players and some overrated players at any given time. In the long run they balance out, but in the short run an underrated player will see their odds of finishing top-two in the Grand Prix badly understated by our formulas.

The other challenge in using Elo to estimate results of specific games is that Elo only gives an expected score, not an expected result. Draw rate remains an unknown. We have plans to do a detailed study of draw rates in the future, but for now we’re using a simple estimate: the base draw rate for equal players is 60%, and draws become less likely as the gap between the players’ ratings gets wider. Specifically, the draw rate is 60% − (rating difference)/1000, with the quotient read as a fraction, so every 10 Elo points of difference reduce the draw rate by 1 percentage point. A 2800 vs. a 2700 would presumably draw 50% of the time. This is awfully unscientific, but fortunately it doesn’t make a huge impact if we’re off slightly. We tried another simulation with a baseline draw rate of 70% instead of 60%, and only one player saw their ultimate odds shift by more than one percentage point. There’s a little error in our draw assumptions, but not a huge amount. Of course in reality, draw rates probably also vary with individual playing styles; we would figure Jobava to draw less than our formulas predict, for example.

Nevertheless, that’s our method. Once the draw rate is calculated, and given that we already have an expected score from Elo (1/(1+10^(rating differential/400)), where the differential is the opponent’s rating minus the player’s), we can easily determine the win and loss odds needed to achieve the expected score: the win probability is the expected score minus half the draw rate, and the loss probability is whatever remains.

One final important note is that we DO account for colors. White scores higher than 50%, of course, so in all our calculations of rating differential (both to determine draw rate and to determine expected score) we add 40 points to white’s Elo before calculating the differential. A “perfectly even” game, in our estimation, with the full 60% draw odds and equal winning chances for white and black, would be one where white is rated 2720 and black is rated 2760, for instance.

So that is how we estimate the odds of each game. From there, everything else is a relatively simple Monte Carlo simulation. We run it a large number of times, and get our results!
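The per-game odds calculation described above can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions rather than the spreadsheet’s actual formulas: the function name `game_odds` and its parameters are invented here, the rating differential is measured from white’s side (so the exponent carries a minus sign), and the final clamp that keeps probabilities non-negative for very lopsided games is an addition the article does not mention.

```python
def game_odds(white_rating, black_rating, white_bonus=40.0, base_draw=0.60):
    """Return (white_win, draw, black_win) probabilities for one game."""
    # Differential from white's point of view, after the 40-point color bonus.
    diff = (white_rating + white_bonus) - black_rating

    # Elo expected score for white (win = 1, draw = 0.5, loss = 0).
    expected = 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

    # Draw rate: 60% for equal players, minus 1 point per 10 rating points.
    draw = base_draw - abs(diff) / 1000.0

    # ASSUMPTION (not in the article): clamp the draw rate so that neither
    # the win nor the loss probability can go negative in lopsided games.
    draw = max(0.0, min(draw, 2.0 * min(expected, 1.0 - expected)))

    # expected = win + draw / 2, so solve for the win probability.
    white_win = expected - draw / 2.0
    black_win = 1.0 - draw - white_win
    return white_win, draw, black_win


if __name__ == "__main__":
    # The "perfectly even" game from the text: white 2720, black 2760.
    print(game_odds(2720, 2760))
```

For the 2720 vs. 2760 example, the adjusted differential is zero, so the sketch returns a 60% draw rate with the remaining 40% split evenly between the two players.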

One other critical factor is the pairings. These are unknown until the day before play begins. Because each tournament is a 12-player round robin, each player plays 11 games, so half the field will get the black pieces 5 times and the other half 6 times. This draw is important, given that we (correctly) factor in the white pieces as being worth an increase in expected score in a given game. When we first posted “pre-Tbilisi” odds, on February 11th, we did not yet know the pairings for Tbilisi (or for Khanty-Mansiysk of course). Foolishly, we ran our simulation with static pairings (giving the same six players the favorable treatment of getting white in 6 of their 11 games). We concluded that Grischuk (to whom we had generously given six whites in BOTH upcoming events) had a 46% chance of reaching the Candidates Match. When the Tbilisi pairings were released, and we re-ran our simulation with the proper Tbilisi pairings, Grischuk’s odds (he will have six blacks in Tbilisi) dropped several percentage points! This shift made it clear how important the pairings are, and so we immediately re-designed our spreadsheet so that the pairings for Khanty-Mansiysk are randomly generated for each simulation. The odds now posted reflect these dynamic pairings within the simulation (and of course the actual pairings for Tbilisi, now that they are known), and we believe they are much more accurate than our previous posting.
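One way to randomize colors for an event whose pairings are not yet known is sketched below in Python. This is a guess at the general idea, not the spreadsheet’s actual procedure: the function name `random_color_assignment` is invented, and the parity rule used to hand out colors is simply one convenient scheme that gives half of a 12-player field six whites and the other half five. It is not claimed to match the official pairing tables.

```python
import random


def random_color_assignment(players, rng=random):
    """Return a list of (white, black) games for a single round robin.

    ASSUMPTION: colors follow a simple parity rule on randomly drawn
    lot numbers; the real event uses its own official pairing tables.
    """
    n = len(players)
    if n % 2:
        raise ValueError("this sketch assumes an even number of players")

    # Shuffle to give each player a random lot number 1..n.
    order = list(players)
    rng.shuffle(order)

    games = []
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if j == n:
                # Against the last number, the lower half of the draw
                # gets white and the upper half gets black.
                lower_is_white = i <= n // 2
            else:
                # Otherwise the lower number has white when i + j is odd.
                lower_is_white = (i + j) % 2 == 1
            a, b = order[i - 1], order[j - 1]
            games.append((a, b) if lower_is_white else (b, a))
    return games


if __name__ == "__main__":
    field = [f"Player {k}" for k in range(1, 13)]  # hypothetical names
    for white, black in random_color_assignment(field)[:5]:
        print(white, "vs", black)
```

Calling this once per simulated tournament means no player is systematically favored with the extra white, which is the effect the re-designed spreadsheet is after.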

4 thoughts on “Simulating The Grand Prix – Methodology”

  1. Great stuff! Will follow this site closely during the last two GP tournaments. I wonder: When you run your simulations, would it be possible to also keep track of the GP score required to qualify? I imagine this would be a good guidance tool to estimate what each player’s goal should be. My guess would be that the “cut” could fluctuate wildly between 300 and 400, but more often closer to 300.

    • I was actually just in the middle of adding code to track the players’ average scores! Yes, I’ll spend some time today building some additional data tracking functions, and put up a “rest day update” tomorrow to show off the results 🙂

  2. Your formula (1/(1+10^(rating differential/400))), the logistic distribution, is probably accurate enough for your purpose. A more accurate formula is ERF(rating differential/400)*0.5 +0.5. ERF is the error function that corresponds to the Gaussian distribution. For rating differences less than 200, your formula is mostly good to the number of decimals given in the FIDE table.
