Is Tomashevsky Proving Us Wrong?

12 days ago, we published our “pre-Tbilisi” estimations of each player’s odds of finishing top-two in the final standings of the FIDE Grand Prix, and earning a berth in the 2016 Candidates Tournament. In that initial posting, we listed Evgeny Tomashevsky as having a meager 2% chance of climbing into that top-two tier. Today, nine undefeated rounds (and five wins) later, we just updated our odds to list Tomashevsky as having a 49% chance of reaching that same tier. In light of such a drastic shift, it makes sense to pause for a moment and ask ourselves: was the original prediction wrong?

The classic statistician’s defense would be to point out that we never claimed Tomashevsky had NO chance of reaching the Candidates Tournament (nor has he done so yet), we just considered it highly unlikely. 2% still means, though, that one time out of fifty the event WILL occur. If you make a lot of predictions, then you will see “longshots” like this come through from time to time. So one possibility is that our prediction was perfectly right, given what we knew at the time, about Tomashevsky’s odds, and what we’re seeing from him in reality was his absolute best case scenario – a highly unlikely event coming to pass.
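The "longshots come through from time to time" point can be made concrete with a minimal Python sketch. It assumes the predictions are independent and each carries the same 2% probability, which is a simplification introduced here for illustration (the function name is hypothetical too).

```python
def prob_at_least_one_hit(p: float, n: int) -> float:
    """Chance that at least one of n independent events, each with probability p, occurs."""
    return 1.0 - (1.0 - p) ** n

# How often does at least one 2% longshot come in, as the number of predictions grows?
for n in (1, 10, 50):
    print(n, round(prob_at_least_one_hit(0.02, n), 3))
```

Under those assumptions, someone who makes fifty separate 2% predictions should expect to see at least one of them come true more often than not.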

However it would be lazy of us to simply assume that was the case. It’s also possible that we made a mistake in building our model, and incorrectly underestimated his chances originally. Maybe what he’s done so far was more likely than we originally thought. So before we write this off to simple “positive variance” for Tomashevsky, let’s dig into the model a little deeper and probe for potential errors.

The most obvious consideration is the accuracy of the ELO system. Our entire model is built on the basis of using players’ ELO ratings to predict the results of each game. Tomashevsky’s odds before the tournament began were based on his rating of 2716, third lowest in the Tbilisi field. Many studies have shown that the ELO system as a whole is quite effective at predicting results in the broad sense, and we don’t feel inclined to consider any possibility that using ELO as our basis is methodologically unsound in general, but that doesn’t necessarily mean ELO is accurate for every player at all times. In fact it’s a given that at any time there must always be some players who are overrated and some others that are underrated. An ELO-based model will, of course, miscalculate the odds involving players whose rating is not accurate to that player’s “true playing strength”. Perhaps Tomashevsky was underrated before the event began, and his expected results were artificially deflated in our model, as a result?

Here is a graph of Tomashevsky’s rating, by age, over the course of his life. The blue line shows every official published rating he has ever had, while the orange spike at the end shows his live ratings after each round at Tbilisi:


What we can see here is that a little over five years ago, at the age of 22.35 years old, Tomashevsky had a published rating of 2708 – his first rating above the 2700 barrier. His 47 published ratings since then, up to and including his February 2015 rating of 2716 that we mentioned earlier, have ranged from as low as 2695 to as high as 2740, but averaged 2714. We can drill in on just those last 5 1/2 years of his ratings history, and have Excel plot a trend line. This allows us to see whether he has been showing any signs of improvement (or decline) over the time span:


Our trend line is almost perfectly flat, at our average rating of 2714 (very slightly trending downward, actually.) In other words, we have over five years of professional chess results from Tomashevsky showing that while he has of course fluctuated above and below his average at times, he has on the whole maintained a very steady rating in the range of approximately 2715. This is quite a strong argument in favor of our model’s usage of 2716 as his baseline rating in the original predictions. Additionally, at almost 28 years old it seems unlikely that there was any particular reason to suspect that now might be a likely time for Tomashevsky to suddenly make a major improvement in his game, and begin playing at a higher level.

We plan to eventually do a detailed study of “ratings plateaus” – long periods in a chess player’s career where he or she maintains a relatively consistent rating. Many players have notably had a prolonged plateau, and then suddenly seen their rating spike. Maxime Vachier-Lagrave is one such example. We plan to examine factors such as age and the length of the plateau, and evaluate whether there might be any way to predict when a player is likely to make a “breakthrough” soon. Since that analysis lies in the future, not the present, we can’t prove our position mathematically at this time, but we suspect that any plateau model we might develop will probably NOT identify a 28 year old, whose plateau has lasted five or more years, as a “likely breakthrough candidate”.

If all our data suggests that 2716 was a very good baseline for Tomashevsky’s rating, in our initial simulations, then it suggests that our original idea might be right. The pre-Tbilisi odds were correct representations of the odds at that time, and what has happened so far truly has just been an extraordinary event. We can call it “positive variance” if we want to be clinical, or “good luck”, or “good fortune”, if we want to couch it in more mystical terms, or perhaps we could even say Evgeny has been “clutch” if we want to borrow a term from our favorite sportscasters. Regardless of the terminology, this raises a new intriguing consideration:

Might our CURRENT predictions be wrong?

Our odds of Tomashevsky reaching the Candidates Tournament depend heavily on what we think he will do in the final leg of the Grand Prix at Khanty-Mansiysk. Right now, we are predicting his results in that event using his current live rating of 2744.9. This is the highest rating he has ever had in his life (or will be, once it is officially published by FIDE), and it comes courtesy of his remarkable results so far at Tbilisi. However, if we have established that 2715 (or so) is a good baseline for him, based on a large sample size of previous results, then isn’t he perhaps overrated now? If we’re deeming his results at Tbilisi to be above his reasonable expectation, and rejecting the theory that he has truly broken through his plateau and achieved objective 2750+ strength, then we also have to consider that by using his current live rating in our model we may be overestimating his chances at Khanty-Mansiysk. Maybe we should be expecting more regression to the mean, and lowering our expectations for him at that event, which would leave his odds of finishing in the final Grand Prix top-two lower than our currently projected 49%.

However, we have one final idea to consider. Let us return, now, to a word we used earlier: “clutch”. The term is borrowed from the world of sports, where we often talk about athletes who have an uncanny ability to perform their best on the biggest stage. The basketball player who takes over a playoff game in the fourth quarter, for instance. Can chess players be “clutch”, and if so, is Tomashevsky?

If we look carefully at his six-year ratings graph, we will notice three major upward spikes. On the November 2011 ratings list, he gained 30 rating points, jumping from 2710 to 2740 (his highest published rating so far in his career). That rating then slowly sank downward to a nadir of 2703, before spiking back up to 2720 in October 2013. Then his rating sank steadily once more, until it sat at 2701, and then in the November 2014 rating list it spiked again, to 2714.

What events did Tomashevsky have such great success at, to cause these three spikes? The first spike included his rating gain from the 2011 World Cup. Although he was eliminated in the third round that year, he managed a performance rating for the event of 2800. The second spike is entirely the result of the 2013 World Cup, when despite being seeded 32nd, he managed to reach the semifinals, coming just one round from earning a spot in the 2014 Candidates Tournament, and posting a performance rating of 2813. And the final spike came from his results at Baku last fall, the first leg of the current Grand Prix that all these odds we keep mentioning refer to, where his performance rating was 2792. And of course we will see a fourth spike show up on the March 2015 rating list, when his absurdly great results at Tbilisi are factored in. His performance rating through round nine is an out-of-this-world 2969!

So in other words, Tomashevsky has spent the last five plus years demonstrating consistently that he is roughly a 2700-level player when the World Championship is not in play, but in the four events he has played during that span that serve as qualifiers for Candidates Tournaments (and potentially as steps towards the World Championship) his combined performance rating has been well above 2800.

Perhaps the answer is that Tomashevsky is simply an incredibly clutch chess player, who saves all his best efforts for events that might get him closer to an eventual World Championship. If that is the case, then thanks to his performance in Tbilisi, the 2016 Candidates Tournament is now well in his sights – and woe be unto his unfortunate opponents at Khanty-Mansiysk who must play the juggernaut that is Toma With Purpose, rather than the 2700ish Tomashevsky we see the rest of the time. If this idea has genuine merit, then perhaps we need to consider that we might still be underestimating his chances.

Ultimately we have no intention of changing our model at this time, but if “Clutch Tomashevsky” is a real thing then we may well have erred when we gave him only a 2% chance before Tbilisi began. Certainly if we had instructed the model to treat him as a 2800+ player, which so far he has consistently proven to be in World Championship qualifying events, then the model would have given him much higher odds.


Tbilisi Grand Prix – Second Rest Day Update

While our top story from the first rest day (the rise of Tomashevsky) has continued unabated, there is now a second key story line at play in this event: the fall of Grischuk. While Tomashevsky scored 2.5/4 and maintained his full point lead over the field, Grischuk dropped two games, scored only 1/4, and fell firmly out of contention. In this post we will be examining the deeper ramifications (both obvious and subtle) of these two players’ results, as everything else depends on them.

First the obvious: Tomashevsky’s position has improved greatly. While his lead hasn’t actually grown, it should be obvious that a full point lead over just one player, with only three games left, is a far stronger advantage than a full point lead over three different players, with seven games left. Evgeny didn’t particularly need to extend his lead, merely hold serve as others fell off the pace. In this case it was Grischuk and Giri who dropped further behind, while no one else closed the gap. How much better is Tomashevsky looking at this point than before? In the last four rounds his odds of winning the event outright have improved from 31% to a whopping 79%! His expected score (in Grand Prix Points) is now 164 – keeping in mind that you score 170 for a clear first, and just 140 for second. Even the one time in five that he doesn’t win outright, he will almost always share first, usually with only one other player, and still bring home 155 points. His odds in the overall Grand Prix standings, meanwhile, are now up to a 42% chance of finishing top two (and qualifying for the Candidates Tournament). This is a humongous gain over his round four position, when despite his great start he had just a 17% chance, and especially over his pre-tournament odds, when we evaluated him as having only a 2% chance of being a Candidate for the World Championship next year.
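The way shared places turn into Grand Prix points (170 for clear first, 140 for second, 155 each for a two-way tie for first) can be sketched in Python as below. The point values for third place and lower are an assumption that is not stated in this post, and the function name is hypothetical; the sketch only illustrates the "tied players split the points for the places they occupy" rule implied by the 155 figure.

```python
# Points for 1st..12th place. 170 and 140 come from the text above;
# the values from third place down are an ASSUMPTION for illustration.
GP_POINTS = [170, 140, 110, 90, 80, 70, 60, 50, 40, 30, 20, 10]

def grand_prix_points(scores: dict[str, float]) -> dict[str, float]:
    """Award points by final standing; tied players split the points of the places they occupy."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    awarded = {}
    i = 0
    while i < len(ranked):
        # Find the end of the group of players tied with ranked[i].
        j = i
        while j < len(ranked) and scores[ranked[j]] == scores[ranked[i]]:
            j += 1
        share = sum(GP_POINTS[i:j]) / (j - i)
        for name in ranked[i:j]:
            awarded[name] = share
        i = j
    return awarded

# Hypothetical final scores: A and B share first, C is clear third.
print(grand_prix_points({"A": 7.5, "B": 7.5, "C": 6.0}))
```

Under this assumed schedule a five-way tie for third would pay 82 points each, which would be consistent with the 82 that several players carry over from Baku.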

Here are average expected points scored at Tbilisi, and odds of winning first place outright, for all players. You may notice that it isn’t much of a race at this point.

Player                        Score (out of 8)  Tbilisi EV  Odds of clear 1st
Evgeny Tomashevsky (RUS)      6                 164         79%
Dmitry Jakovenko (RUS)        5                 124         5%
Rustam Kasimdzhanov (UZB)     4.5               89          1%
Teimour Radjabov (AZE)        4.5               87          1%
Anish Giri (NED)              4                 77          0%
Shakhriyar Mamedyarov (AZE)   4                 72          0%
Baadur Jobava (GEO)           4                 65          0%
Leinier Dominguez (CUB)       4                 57          0%
Alexander Grischuk (RUS)      3.5               59          0%
Maxime Vachier-Lagrave (FRA)  3                 32          0%
Peter Svidler (RUS)           3                 26          0%
Dmitry Andreikin (RUS)        2.5               20          0%

While it has little bearing on the overall Grand Prix standings, or even on the results at Tbilisi, we would like to highlight how impressively Jobava has bounced back from an atrocious start. After managing just one draw and three losses in the first four rounds, he has won three and drawn one in his last four rounds, returning to an even 50% score for the event, and bringing his live rating back above 2700.

Grischuk, on the other hand, has fared most poorly over the past four days. As the highest rated player in the field, he was expected to score well more often than not, so after round four we saw him as the most likely contender to potentially chase and catch Tomashevsky. Instead he tumbled. Where before we expected him to score an average of 113 Grand Prix Points, now we see him picking up only 59, which would not keep him alive in the overall Grand Prix standings. By virtue of his rating, he remains one of the favorites in the final event at Khanty-Mansiysk, but at this point his overall odds of reaching the Candidates Tournament via the Grand Prix are just 12%, whereas four games ago we had him at 41%. If he is going to get there, it will have to start over the final three rounds of Tbilisi. He has no realistic hope of winning this event, but if he can win a game or two and rise in the standings, it would gain him critical extra Grand Prix points that would keep him at least somewhat in contention overall, and keep his results relevant at Khanty-Mansiysk.

Here is where all 16 players in the Grand Prix stand, for average final score expectations, as well as odds of actually reaching the Candidates Tournament:

Player                        Baku    Tashkent  Tbilisi EV  Khanty-Mansiysk EV  Total EV  Top-two odds
Fabiano Caruana (ITA)         155     75        -           100.9               330.9     68%
Evgeny Tomashevsky (RUS)      82      -         164.4       60.7                307.1     42%
Hikaru Nakamura (USA)         82      125       -           97.6                304.6     49%
Alexander Grischuk (RUS)      82      -         59.0        97.3                238.3     12%
Boris Gelfand (ISR)           155     15        -           64.1                234.1     8%
Shakhriyar Mamedyarov (AZE)   35      125       72.0        -                   232.0     1%
Sergey Karjakin (RUS)         82      75        -           70.6                227.6     7%
Dmitry Jakovenko (RUS)        -       30        123.7       62.0                215.7     5%
Anish Giri (NED)              -       40        76.7        93.8                210.5     5%
Dmitry Andreikin (RUS)        20      170       19.9        -                   209.9     0%
Teimour Radjabov (AZE)        50      50        86.8        -                   186.8     0.05%
Baadur Jobava (GEO)           -       75        64.7        41.2                180.9     1%
Maxime Vachier-Lagrave (FRA)  -       75        31.7        74.0                180.7     1%
Peter Svidler (RUS)           82      -         25.7        54.0                161.6     0.4%
Rustam Kasimdzhanov (UZB)     35      15        88.8        -                   138.8     0%
Leinier Dominguez (CUB)       10      -         56.8        53.8                120.5     0.03%
(A dash means the player is not in that event’s field.)

You may notice, from the numbers above, two other major beneficiaries of Grischuk’s collapse: Hikaru Nakamura and Fabiano Caruana. While sitting on the sidelines, these two have seen their chances improve significantly over the past four rounds, improving eight percentage points each over where they stood at the first rest day. Why? As the leaders in the overall Grand Prix standings prior to this event, they were in the best position to benefit from the rise of a relatively weak favorite at Tbilisi. While we don’t wish to take anything away from how impressive Tomashevsky has been at this event, he is still rated 50-60 ELO lower than the top rated players in the Grand Prix, and so our model projects him as less likely to repeat his success at Khanty-Mansiysk than, say, Grischuk or Giri would have been. So having more of Tbilisi’s Grand Prix points likely to be awarded to lower rated players puts Caruana and Nakamura in much better positions to maintain their leads through the final leg of the Grand Prix.

Also working in the leaders’ favor have been the disappointing results of two other higher rated players: Giri and Vachier-Lagrave. While they never had as much hope originally as Grischuk did, making their drops less dramatic, they nevertheless have seen their odds of qualifying for the Candidates Tournament drop precipitously over the course of this event. Just like with Grischuk, these other drops from high rated players have also benefited Nakamura and Caruana… as well as helping Tomashevsky greatly, of course. Here is how each of the 16 players in the Grand Prix field has trended from before Tbilisi started, to the first rest day, to the second:

Player                        Pre-Tbilisi  First rest day  Second rest day
Fabiano Caruana (ITA)         58%          59%             68%
Hikaru Nakamura (USA)         39%          40%             49%
Evgeny Tomashevsky (RUS)      2%           17%             42%
Alexander Grischuk (RUS)      40%          41%             12%
Boris Gelfand (ISR)           6%           6%              8%
Sergey Karjakin (RUS)         5%           5%              7%
Dmitry Jakovenko (RUS)        1%           1%              5%
Anish Giri (NED)              14%          17%             5%
Baadur Jobava (GEO)           0%           0%              1%
Shakhriyar Mamedyarov (AZE)   6%           3%              1%
Maxime Vachier-Lagrave (FRA)  15%          6%              1%
Peter Svidler (RUS)           5%           2%              0%
Teimour Radjabov (AZE)        0%           0%              0%
Leinier Dominguez (CUB)       0%           0%              0%
Dmitry Andreikin (RUS)        10%          3%              0%
Rustam Kasimdzhanov (UZB)     0%           0%              0%

There is another, somewhat more subtle, ramification of Grischuk’s fall in the projected standings. It has become slightly less likely that the cutoff score needed to qualify for the Candidates Tournament will be extremely high. At the first rest day, when Grischuk (and his gaudy rating of 2810 at the time) was just one point off the lead, and still in legitimate contention, our model saw it as far more likely that he might post huge scores in both of the final two events, driving up the standards of qualification. Now of course Tomashevsky is still in a position to do just that, but with his lower rating he is projected as much less likely to follow up a great Tbilisi result with an equally great Khanty-Mansiysk result. Therefore, our projected cutoff has dropped 10 points. Our model now has the median points required to qualify for the Candidates as just 317, and now sees a (very remote) chance of someone qualifying with as little as 242 points, lower than our previously reported minimum. Of course there is still an extreme range of possible cutoffs, and no one should feel secure with a final score of, say, 320. In some of our simulations it still takes 392 points just to achieve second place! Here is a frequency graph of all the possible qualifying scores:


One final consideration that we mentioned in our last post is the rating mark of 2800. Both Grischuk and Giri have fallen below it in the current live ratings, but they still have three rounds to rebound. Both are now underdogs to get their end-of-tournament rating (presumed to become official on the next rating list) back above the 2800 line, but neither is completely eliminated. Giri would need to score 2.5/3 in the final rounds, with the black pieces in two of those three games, and our model predicts him to do just that 14% of the time. Grischuk also needs to score 2.5/3, but has the benefit of two games with white, and so has a slightly better chance at 21%. Still, it now looks likely that the official March rating list will have only two 2800+ players on it – despite there having been five at the same time rated over 2800 in the live ratings earlier this month.

All told, the results over rounds five through eight have removed much of the drama from Tbilisi. Tomashevsky is now an overwhelming favorite to win the event, and has turned himself into a legitimate contender for one of the eventual Candidates Tournament spots. Of course he still has to navigate the last three rounds: he has roughly a 90% chance of at least sharing first place, but 10% is still 10%, and he has a job to do to close out the win. The other most tangible question is whether Grischuk can bounce back, manage at least a plus score in the last three rounds, move up the standings a bit, and enter Khanty-Mansiysk as a contender. We have about 20 hours until round 9 begins. In the meantime, hopefully this update gives you something to chew on during this rest day!

Tbilisi Grand Prix – First Rest Day Update

We probably could have titled this post “The rise of Tomashevsky”. Certainly the biggest story is Evgeny Tomashevsky’s blistering 3.5/4 start, and full point lead after four rounds (out of 11). You may already know that we are tracking the Grand Prix in detail, and running simulations to determine each player’s odds of finishing in the top two of the final standings and earning a berth in the 2016 Candidates Tournament. Well four days ago, before this event began, we were awfully pessimistic about Evgeny’s chances, giving him a mere 2% chance of doing well enough both here in Tbilisi and also in Khanty-Mansiysk to finish in the top two. Now? We have his odds up to 17%! He still has an uphill climb before he convinces our ratings-based simulation model that he’s an actual “favorite”, but he has nevertheless made tremendous progress.

In honor of the first rest day (2/19/2015), we thought we’d dig a little deeper into the results of our simulations. First of all, we’ve talked a lot about the overall Grand Prix standings, but haven’t actually said a word about the Tbilisi tournament specifically. How valuable is Tomashevsky’s full-point lead, in terms of actually winning this event? Is it worth more than the higher ratings of Grischuk and Giri, each sitting on 2.5/4? Our numbers say yes! Here are each player’s odds of winning THIS tournament, along with their average Grand Prix Points earned at Tbilisi (“EV” stands for “expected value”):

Player                        Score (out of 4)  Tbilisi EV  Odds of clear 1st
Evgeny Tomashevsky (RUS)      3.5               132         31%
Anish Giri (NED)              2.5               113         16%
Alexander Grischuk (RUS)      2.5               113         15%
Dmitry Jakovenko (RUS)        2.5               82          4%
Shakhriyar Mamedyarov (AZE)   2                 71          2%
Maxime Vachier-Lagrave (FRA)  1.5               65          1%
Rustam Kasimdzhanov (UZB)     2                 61          1%
Leinier Dominguez (CUB)       2                 59          1%
Teimour Radjabov (AZE)        2                 56          1%
Dmitry Andreikin (RUS)        1.5               55          0%
Peter Svidler (RUS)           1.5               48          0%
Baadur Jobava (GEO)           0.5               16          0%

Poor Jobava. His slow start has been just as bad as Tomashevsky’s has been good. At this point, pretty much his best realistic hope is just to share last place with someone. You can see, though, that if there is a clear winner it will be Tomashevsky almost half the time! His lead really is quite commanding, even with seven rounds left, but certainly nothing has been clinched yet.
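The "almost half the time" figure is a conditional probability: Tomashevsky’s odds of a clear first divided by the odds that anyone takes a clear first. A small Python check, using the rounded percentages from the table above (so the result is only approximate):

```python
# Odds of clear first place after round four, in percent, as listed above (rounded values).
clear_first_odds = {
    "Tomashevsky": 31, "Giri": 16, "Grischuk": 15, "Jakovenko": 4,
    "Mamedyarov": 2, "Vachier-Lagrave": 1, "Kasimdzhanov": 1,
    "Dominguez": 1, "Radjabov": 1, "Andreikin": 0, "Svidler": 0, "Jobava": 0,
}

# Chance that the event has a clear winner at all (the remainder is a shared first place).
p_any_clear_winner = sum(clear_first_odds.values())

# Tomashevsky's share of the outcomes that do have a clear winner.
tomashevsky_share = clear_first_odds["Tomashevsky"] / p_any_clear_winner

print(p_any_clear_winner, round(tomashevsky_share, 2))
```

The listed odds sum to roughly 72%, so first place is shared in the remaining 28% or so of simulations, and Tomashevsky accounts for a bit over 40% of the clear-winner outcomes.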

So if Tomashevsky does take a clear first place and earn 170 Grand Prix points, what does that mean for his overall chances of qualifying? It would give him 252 points through two events, since he earned 82 at Baku, but how many points will second place ultimately require? Well, we began tracking this in our simulations, and have determined that the average score needed for second place in the final standings is 327. So if Tomashevsky wins, he’ll still need 75 more points at Khanty-Mansiysk to reach that target. Of course this is an oversimplification, and 327 does not magically guarantee a Candidates berth, as there is a very wide range of possibilities. Across 20,000 simulations, we saw qualifying targets as low as 255, or as high as 392! The latter value can only happen in one precise way: Grischuk must win Tbilisi outright for 170 points, then finish exactly second at Khanty-Mansiysk for 140 more, giving him 392 total, AND Caruana must win Khanty-Mansiysk outright, bringing his own total to 400 points even and first place overall. Despite the fact that this is an incredibly specific series of events, our simulation shows it happening once every 200 times (0.5%). This is surprisingly often for one exact outcome, and shows how much our model respects Caruana and Grischuk’s high ratings in its simulations.

Here is the full range of possibilities:

Cutoff Graph

So if we know the expected average result for each player at Tbilisi, do we have it for Khanty-Mansiysk as well? Of course! Here we have every player’s actual scores for Baku and Tashkent, along with their expected scores for Tbilisi and Khanty-Mansiysk, and their average overall scores. We also included their current odds of finishing in the top two in the final standings, in order to highlight that it doesn’t necessarily track perfectly with expected scores. That is because expected scores are an average of all possible results, while top-two odds are skewed heavily towards the odds of particularly good results. Some players have a higher risk/reward factor in their remaining slate, that allows them to have higher odds of reaching the Candidates despite not having a higher expected score.

Player                        Baku    Tashkent  Tbilisi EV  Khanty-Mansiysk EV  Total EV  Top-two odds
Fabiano Caruana (ITA)         155     75        -           99.2                329.2     59%
Hikaru Nakamura (USA)         82      125       -           96.0                303.0     40%
Alexander Grischuk (RUS)      82      -         112.6       104.5               299.1     41%
Evgeny Tomashevsky (RUS)      82      -         131.5       56.8                270.3     17%
Anish Giri (NED)              -       40        112.9       98.6                251.5     17%
Dmitry Andreikin (RUS)        20      170       54.9        -                   244.9     3%
Boris Gelfand (ISR)           155     15        -           64.7                234.7     6%
Shakhriyar Mamedyarov (AZE)   35      125       71.0        -                   231.0     3%
Sergey Karjakin (RUS)         82      75        -           70.0                227.0     5%
Maxime Vachier-Lagrave (FRA)  -       75        65.4        77.6                218.0     6%
Peter Svidler (RUS)           82      -         47.6        57.1                186.8     2%
Dmitry Jakovenko (RUS)        -       30        81.6        57.6                169.2     1%
Teimour Radjabov (AZE)        50      50        56.1        -                   156.1     0%
Baadur Jobava (GEO)           -       75        16.4        34.2                125.5     0%
Leinier Dominguez (CUB)       10      -         58.6        53.8                122.3     0%
Rustam Kasimdzhanov (UZB)     35      15        61.5        -                   111.5     0%
(A dash means the player is not in that event’s field.)

And finally, we looked at one more consideration. The “2800 club”! While live ratings are something we love to follow along with, there is a certain gravitas that comes along with actual published ratings. So the question is: when the March rating list comes out, how many 2800+ players will we see? Nakamura flirted with the mark at Zurich, but fell short in the end. Caruana struggled at Zurich, but managed to stay above the 2800 plateau. Carlsen will be there of course, so that’s two. And then there are Grischuk and Giri, both of whom are playing at Tbilisi. What are their chances?

Well Grischuk is in pretty good shape. Although he loses a little bit of rating with every draw, as the highest rated player in the field, his 9.7 point cushion (current live rating of 2809.7) is enough that as long as he scores 50% the rest of the way he’ll be fine. However if he drops to a negative score, at only 3/7 or worse over the remaining rounds, his rating would fall below the magic 2800 mark. According to our simulation, his chance of scoring at minimum the needed 50% in his remaining games is 82% (as he’s a favorite in all but one of them, the lone exception being when he has the black pieces against Giri in round 8).

Giri is in a slightly tougher spot, as he has no real cushion at all, currently sitting at 2800.4 in the live ratings. Since he also loses rating points with draws most of the way, he needs at minimum a score of +1, or 4/7 the rest of the way, to keep his rating afloat above the 2800 mark. Actually, 4/7 would drop his live rating to 2799.8, but fortunately that’s good enough as FIDE would round up. Scoring +1 is a tougher task than just maintaining an even score, but Giri is favored in all 7 of his remaining games, including the one against Grischuk (thanks to having the white pieces, which we rate as being worth 40 rating points). As such, Giri is a favorite to score at least the 4/7 he needs: we rate his chances of a published March rating of 2800+ to be 63%.

We hope you enjoyed this interlude as a palatable replacement for actual chess on this rest day. Perhaps turn your attention to the rapid games at Zurich, to keep yourself entertained. We will post another update at the second rest day, so please let us know if there are any other stats you’d like us to take a look at!

Simulating The Grand Prix – Methodology

On our main page for the 2014-15 Grand Prix, you will notice that we list each player’s odds of finishing in the top two of the final Grand Prix standings – an important mark because those top two players earn berths in the 2016 Candidates Tournament, with a chance at the World Championship. If you’ve seen this, perhaps you have wondered: where did those numbers come from? How are they calculated, and how accurate are they?

So here I will discuss our methodology. First of all, the overview is relatively simple. I’ll go over that portion first, and dive deeper into the details (that may be less interesting to some readers) afterward. We have built a spreadsheet that estimates the odds of a white win, a black win, or a draw, for all 132 Grand Prix games left to play (66 in Tbilisi, 66 in Khanty-Mansiysk). Further, it can use those odds to randomly generate a result in each game, AND correctly calculate how the Grand Prix points would be awarded given those results, and who would therefore be the top two finishers.

Our simulation simply re-runs the randomizer a large number of times, recording the top two finishers each time, and spits out a result for each player: what percentage of the overall simulations had that player in the top two? Those are the odds we show you. How accurate are they? Not perfect, of course. The main source of error is that the individual game odds cannot be perfect. If you care about the details, read on. Otherwise, suffice it to say that our estimates are probably the best available, and if you’re interested in knowing who has the best chance of reaching the Candidates Tournament, you’re welcome to follow along with us! We will post regular updates throughout the upcoming Tbilisi event, and will show both the “current” odds and the “pre-Tbilisi” odds so that you can see exactly how much individual players have benefited or suffered from the results up until that point.
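The re-run-and-tally loop described above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the spreadsheet itself: the function names, the three-player field, and the game odds are all hypothetical, and the sketch ranks players on raw totals with random tie-breaks instead of converting each event’s standings into Grand Prix points.

```python
import random
from collections import Counter

def play_game(p_white_win, p_draw, rng):
    """Randomly pick one result; returns (white_score, black_score)."""
    roll = rng.random()
    if roll < p_white_win:
        return 1.0, 0.0
    if roll < p_white_win + p_draw:
        return 0.5, 0.5
    return 0.0, 1.0

def top_two_odds(games, starting_totals, n_sims=20000, seed=0):
    """games: list of (white, black, p_white_win, p_draw) tuples.
    starting_totals: points each player already holds.
    Returns the fraction of simulations in which each player finishes in the top two."""
    rng = random.Random(seed)
    finishes = Counter()
    for _ in range(n_sims):
        totals = dict(starting_totals)
        for white, black, p_white_win, p_draw in games:
            w, b = play_game(p_white_win, p_draw, rng)
            totals[white] += w
            totals[black] += b
        # ASSUMPTION: rank on raw totals and break ties at random.
        ranked = sorted(totals, key=lambda name: (totals[name], rng.random()), reverse=True)
        for name in ranked[:2]:
            finishes[name] += 1
    return {name: finishes[name] / n_sims for name in starting_totals}

# Hypothetical three-player example with made-up game odds.
games = [("A", "B", 0.30, 0.50), ("B", "C", 0.25, 0.55), ("C", "A", 0.20, 0.60)]
odds = top_two_odds(games, {"A": 0.0, "B": 0.0, "C": 0.0}, n_sims=5000)
print(odds)
```

Because exactly two players finish in the top two of every run, the reported odds always sum to 200% across the field, which is a handy sanity check on any table of top-two odds.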

That’s a lot of text, so before we continue with the nitty gritty, here’s our “pre-Tbilisi” projections, in case you aren’t interested in clicking through to a different page to see them:

 Fabiano Caruana (ITA) 58%
 Alexander Grischuk (RUS) 40%
 Hikaru Nakamura (USA) 39%
 Maxime Vachier-Lagrave (FRA) 15%
 Anish Giri (NED) 14%
 Dmitry Andreikin (RUS) 10%
 Shakhriyar Mamedyarov (AZE) 6%
 Boris Gelfand (ISR) 6%
 Sergey Karjakin (RUS) 5%
 Peter Svidler (RUS) 5%
 Evgeny Tomashevsky (RUS) 2%
 Dmitry Jakovenko (RUS) 1%
 Baadur Jobava (GEO) 0%
 Leinier Dominguez (CUB) 0%
 Teimour Radjabov (AZE) 0%
 Rustam Kasimdzhanov (UZB) 0%

So first of all, our estimates of the win/draw/loss odds for each game are calculated based on ELO expectation, using current live ratings from 2700chess (one of the greatest things on the entire internet; if you’re interested enough in chess analysis to have read this far, and for some reason you don’t already have 2700chess bookmarked, go bookmark it now. We’ll wait.) Now this isn’t perfect: it works pretty well for players who are rated accurately, but that can never be everyone. The ELO system is pretty solid overall, but it will always contain some underrated players and some overrated players at any given time. In the long run they balance out, but in the short run an underrated player will see their odds of finishing top-two in the Grand Prix badly understated by our formulas.

The other challenge in using ELO to estimate results of specific games is that ELO only actually gives an expected score, not an expected result. Draw rate remains an unknown. We have plans to do a detailed study of draw rates in the future, but for now we’re using a simple estimation that the base draw rate for equal players is 60%, and that draws become less likely as the gap between the players’ ratings gets wider. Specifically, the draw rate is 0.60 - (rating difference)/1000, so every 10 ELO points of difference reduce the draw rate by 1 percentage point. A 2800 vs. a 2700 would presumably draw 50% of the time. This is awfully unscientific, but fortunately it doesn’t make a huge impact if we’re off slightly. We tried another simulation with a baseline draw rate of 70% instead of 60%, and only one player saw their ultimate odds shift by more than one percentage point. There’s a little error in our draw assumptions, but not a huge amount.

Of course in reality, draw rates probably also vary by the individual playing styles; we would figure Jobava to draw less than our formulas predict, for example. Nevertheless, that’s our method. And once draw rate is calculated, and given that we already have an expected score from ELO (1/(1+10^(rating differential/400)), where the differential is the opponent’s rating minus the player’s), we can easily determine the needed win and loss odds to achieve the expected score. One final important note is that we DO account for colors. White scores higher than 50%, of course, so in all our calculations of rating differential (both to determine draw rate and to determine estimated score) we add 40 points to white’s ELO before calculating the differential. A “perfectly even” match, in our estimation, with full 60% draw odds, and equal winning chances for white and black, would be a game where white is rated 2720 and black is rated 2760, for instance. So that is how we estimate the odds of each game. From there, everything else is a relatively simple Monte Carlo simulation. We run it a large number of times, and get our results!
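
For readers who want to see the per-game arithmetic spelled out, here is a minimal Python sketch of the calculation as described in this post. The function name `game_odds` and its parameter names are our own inventions for illustration, and the clamping step is an added safeguard for very lopsided pairings that the method above does not mention; this is not the actual spreadsheet.

```python
def game_odds(white_elo, black_elo, white_bonus=40, base_draw=0.60):
    """Return (white_win, draw, black_win) probabilities for one game.

    Hypothetical helper: the 40-point White bonus and the 60% base draw
    rate come from the post; everything else is an illustrative assumption.
    """
    # White's effective rating edge, after adding the color bonus.
    diff = (white_elo + white_bonus) - black_elo

    # Standard ELO expected score for White.
    expected_white = 1 / (1 + 10 ** (-diff / 400))

    # Draw rate drops 1 percentage point per 10 points of rating gap.
    draw = base_draw - abs(diff) / 1000

    # Assumption (not in the post): clamp the draw rate so that neither
    # the win nor the loss probability can go negative for huge gaps.
    draw = max(0.0, min(draw, 2 * min(expected_white, 1 - expected_white)))

    # expected score = win + draw / 2, so solve for win, then loss.
    white_win = expected_white - draw / 2
    black_win = 1 - draw - white_win
    return white_win, draw, black_win


# The "perfectly even" game from the post: White 2720 vs. Black 2760.
print(game_odds(2720, 2760))
```

Passing `white_bonus=0` reproduces the color-blind 2800 vs. 2700 example, where the draw rate comes out to 50%.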

One other critical factor is the pairings. These are unknown until the day before play begins. Because each tournament is a 12-player Round Robin, meaning each player plays 11 games, half the field will get the black pieces 5 times, and half the field will get 6 blacks. This draw is important, given that we (correctly) factor in the white pieces as being worth an increase in expected score in a given game. When we first posted “pre-Tbilisi” odds, on February 11th, we did not yet know the pairings for Tbilisi (or for Khanty-Mansiysk of course). Foolishly, we ran our simulation with static pairings (giving the same six players the favorable treatment of getting white in 6 of their 11 games). We concluded that Grischuk (to whom we had generously given six whites in BOTH upcoming events) had a 46% chance of reaching the Candidates Tournament. When the Tbilisi pairings were released, and we re-ran our simulation with proper Tbilisi pairings, Grischuk’s odds (he will have six blacks in Tbilisi) dropped several percentage points! This shift made it clear how important the pairings are, and so we immediately re-designed our spreadsheet so that the pairings for Khanty-Mansiysk are randomly generated for each simulation. The odds now posted reflect these dynamic pairings within the simulation (and of course the actual pairings for Tbilisi, now that they are known), and we believe they are much more accurate than our previous posting.
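
To illustrate what “randomly generated pairings” could look like, here is a small Python sketch. It is not the spreadsheet logic and not the official pairing table: it simply shuffles the players into seats 0 through 11 and applies one simple color rule (our own assumption) that happens to give exactly six players six whites and the other six players five. The names `white_seat` and `random_white_counts` are hypothetical.

```python
import random


def white_seat(i, j):
    """Return which of two seat numbers gets White.

    Assumed rule for illustration: if the seat numbers sum to an odd
    number, the lower seat has White; otherwise the higher seat does.
    """
    lo, hi = min(i, j), max(i, j)
    return lo if (lo + hi) % 2 == 1 else hi


def random_white_counts(players, rng=random):
    """Shuffle players into seats and count each player's games as White."""
    seats = list(players)
    rng.shuffle(seats)  # a fresh random draw for each simulated event
    whites = {p: 0 for p in seats}
    for i in range(len(seats)):
        for j in range(i + 1, len(seats)):
            whites[seats[white_seat(i, j)]] += 1
    return whites


# Twelve placeholder names; in a real run these would be the actual field.
field = [f"Player {n}" for n in range(1, 13)]
print(random_white_counts(field))
```

Calling `random_white_counts` once per simulated tournament means no player is systematically handed the extra white, which is the point of the redesign described above.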

Prodigy Profile: Alireza Firouzja


When we think about countries with strong chess traditions, Iran does not spring quickly to mind. More likely we think of Russia first, then other former Soviet nations, along with India and China, and maybe the United States. As for Iran? Well, they were seeded 39th at the 2014 Olympiad, and finished in the massive tie for 35th through 60th places. With just 9 GMs in the country, and only two players currently rated 2500+, Iran is not exactly a chess powerhouse. Right now.

The next decade could well see that change, however. Iran boasts 5 of the 65 highest rated players in the world born this millennium, along with a handful of other talented players sitting just outside the top-100 of the U15 rankings. And the star of this group of promising youngsters is Alireza Firouzja, currently the #1 ranked U12 player in the world!

Firouzja’s peak rating of 2332, achieved in the November, 2014 rating list, is the 11th highest rating ever achieved by a player under 12 (higher than Wei Yi or Magnus Carlsen’s ratings at the same age). While the rating dipped in January, even Firouzja’s current 2305 mark is good enough to rate as the 11th highest at or before his current age of 11.63 years.

At the time of this post, Firouzja is currently competing in the top group of the Khazar International Open, where he will have the opportunity to compete against foreign GMs and perhaps achieve a new higher peak rating with a strong result. And perhaps in another decade Firouzja might find himself headlining an Iranian team that is placing significantly higher than 35th in a Chess Olympiad!

Note: This youngster’s name is correctly spelled “Alireza Firouzjah”, per Iranian chess news site Achmaz, but for this article we will use the name as it appears in FIDE’s registry.

Prodigy Profile: Sam Sevian


GM Sevian is the youngest grandmaster in USA history, achieving the title at the age of 13 years, 10 months, and 27 days. In November of 2012, his rating was so high for his age group that he managed to win the U12 World Championship, scoring 9 out of a possible 11 points, and he still LOST rating points. His prodigy status peaked with his rating of 2385 on the January 2013 rating list, published when Sevian was 12.02 years old. At that point he was tied with Peter Leko as the fifth highest rated player ever, at or before his age.

Between Sevian’s 12th and 13th birthdays, his stock as a potential record setter slipped slightly, as he picked up only 16 rating points in the 12 month span, but over the last year he has gone on a tear. Since the beginning of 2014, Sevian has picked up his IM and GM titles, and gained 130 rating points in the last 13 months (as of the February 2015 rating list). His rating is just two points shy of being among the 10 best ever achieved at his current age, and only 50 points behind Magnus Carlsen’s record for highest rating EVER by 14 years and 1 month old. If he keeps racking up rating points at this average of 10 per month, he may yet have hope of setting a world rating record. 110 points in the next 11 months, for instance, would put him at 2641, which would break Wei Yi’s record of the highest rating ever achieved at the age of 15.

Most recently, Sevian tied for 5th place (out of 14 players) in the Challengers section of the prestigious Tata Steel tournament at Wijk aan Zee, despite being only the #12 seed by rating. What does the rest of 2015 (and beyond) hold for the young American superstar? We can only wait and see.

Playing With Date Arithmetic (And The 10 Youngest 2600+ Players Ever)

In my future posts, you’ll probably hear a lot about players’ ages at the time they achieve various milestones. I will present those ages as numbers with (usually two) decimal places, rather than the common Years-Months-Days format. The latter is more easily understandable: we know exactly what it means when we say that, for instance, Wei Yi is the youngest player to achieve a rating of 2700+, at the age of 15 years, 8 months, and 29 days*. Saying that he achieved that milestone at the age of 15.75 years old actually makes this a poor example, because that’s pretty clear, but if the number were 15.63 instead it would be harder for our brains to immediately process it. We’re not used to thinking of ages in terms of fractions of years (except the “big” fractions like 1/2, 1/4, 3/4). We break years down into months, not increments of 0.01 year, and we break those months down into days.

Unfortunately, the Years-Months-Days format is also less accurate. To see why, let’s consider another example. Here are the 10 youngest players to achieve a rating of 2600 or higher:

Youngest Published 2600+ Rating
Player Name Age
Wei Yi 14.42
Wesley So 14.98
Teimour Radjabov 15.06
Magnus Carlsen 15.09
Sergey Karjakin 15.22
Ruslan Ponomariov 15.23
Illya Nyzhnyk 15.59
Fabiano Caruana 15.67
Anish Giri 15.67
Peter Leko 15.81

Note that there is an apparent tie for 8th place between Caruana and Giri. Were they actually the exact same age? Well, let’s look at Year-Month-Day format first. Caruana was born 7/30/1992, and achieved this milestone on the 4/1/2008 rating list. From July 30 1992 to March 30 2008 is 15 years and 8 months. March 30 to April 1 is two days, so he was 15 years, 8 months, and 2 days old. How about Giri? Born 6/28/1994, he broke the 2600 barrier on 3/1/2010. From 6/28/1994 to 2/28/2010 is 15 years, 8 months, and we add 1 day from 2/28 to 3/1. So it would appear that Giri achieved this milestone 1 day earlier than Caruana did! The chart above placed them in the wrong order, right?

Well, no. Let’s break Caruana’s age down further. The “15 years” component from 7/30/1992 to 7/30/2007 is 15 * 365 = 5475 days, right? Nope. Leap Year exists! That particular span includes three extra days: 2/29/1996, 2/29/2000 and 2/29/2004. So “15 years” in this case means 5478 days. What about the “8 months” portion? Well the months whose last day was included in that span are July through February, meaning we get 5*31 + 2*30 + 1*29 = 244 days out of the span (remember that 2008 was also a Leap Year). Finally we add in the “2 days” portion, for which no breakdown is needed, and we see that Caruana achieved his first 2600+ rating when he was 5478 + 244 + 2 = 5724 days old.

Giri’s “15 years” include four Leap Years, not three, so they span 5479 days, and his “8 months” (June through January) do not include a February, so they span 245 days rather than 244. Together these differences add two days to his total. So “15 years, 8 months, and 1 day” is, in his particular case, 5479 + 245 + 1 = 5725 days. Despite appearing to have been one day younger, using the more common format, it turns out that Giri was actually one day OLDER than Caruana. My chart above is correct after all.

Now of course it doesn’t matter at all which of two amazing players, currently ranked #3 and #4 in the World, got to 2600 one day faster than the other. However this example serves perfectly to demonstrate why I will not use the Years-Months-Days format to express players’ ages. In fact, behind the scenes, all my ages are simply number of days, but I don’t imagine anyone wants to know that Wei Yi was 5751 days old when he broke the 2700 barrier, so I divide ages by 365.25 (to account for Leap Year) and present them as just “years old”, rounded to the appropriate number of decimal places for the particular purpose in play.
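
For anyone who wants to check the day counts, here is a minimal Python sketch of the same arithmetic using the standard `datetime` module. The birth dates and rating list dates are the ones quoted above; the helper names `age_in_days` and `age_in_years` are just illustrative choices, not part of any tool I actually use.

```python
from datetime import date


def age_in_days(born, milestone):
    """Exact number of days between a birth date and a milestone date."""
    return (milestone - born).days


def age_in_years(days, places=2):
    """Convert a day count to years, dividing by 365.25 for Leap Years."""
    return round(days / 365.25, places)


# Dates as quoted in this post (written there in month/day/year order).
caruana_days = age_in_days(date(1992, 7, 30), date(2008, 4, 1))
giri_days = age_in_days(date(1994, 6, 28), date(2010, 3, 1))

print(caruana_days, giri_days)  # 5724 5725
print(age_in_years(caruana_days), age_in_years(giri_days))  # 15.67 15.67
```

Subtracting two `date` objects counts the calendar days directly, so Leap Years and uneven month lengths are handled without any manual bookkeeping.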

I hope you appreciate my precision.

*This isn’t technically true yet, but it appears that it will become true on March 1st, when the next FIDE rating list is published.