Chess.com ratings are so inaccurate

David
ShaunOGX wrote:

you're on to something here mate - the ratings are not secure and I and other players I know go through periods where all the players are way stronger than their rating. Lichess is more realistic if you ask me and once my account with chess.com expires that's where I stay.

People have good days and bad days and some people can have flashes of inspiration - Emory Tate was capable of playing brilliantly but also of underperforming, so even though he would sometimes beat GMs, he never became one himself. You’ve not let your Chess.com account expire - I guess Lichess wasn’t all that different… 😎

theNobody14161

so I did run some numbers on my game stats, win% for the following categories are as follows for me:
1000-1100: 71%
1100-1200: 61%
1200-1300: 53%
1300-1400: 41%
1400-1500: 38%
There does not exist a rating which should have this kind of probability distribution, much less any rating within 200 points of 1015 (my rating at the time of compiling these numbers)

David
theNobody14161 wrote:

so I did run some numbers on my game stats, win% for the following categories are as follows for me:
1000-1100: 71%
1100-1200: 61%
1200-1300: 53%
1300-1400: 41%
1400-1500: 38%
There does not exist a rating which should have this kind of probability distribution, much less any rating within 200 points of 1015 (my rating at the time of compiling these numbers)

You lose more games against higher rated opponents - that % win graph makes sense to me.

theNobody14161
David wrote:
theNobody14161 wrote:

so I did run some numbers on my game stats, win% for the following categories are as follows for me:
1000-1100: 71%
1100-1200: 61%
1200-1300: 53%
1300-1400: 41%
1400-1500: 38%
There does not exist a rating which should have this kind of probability distribution, much less any rating within 200 points of 1015 (my rating at the time of compiling these numbers)

You lose more games against higher rated opponents - that % win graph makes sense to me.

It shouldn't. Consider this: at those percentages, if I play only 1400-1500 opponents, my rating would level off somewhere between 1340 and 1410. If I play only 1300-1400 opponents, my rating would level off around 1320; against 1200-1300 opponents, around 1260; against 1100-1200 opponents, 1220; against 1000-1100, 1140. That is a range of nearly 300 points, merely from changing the rating of who I'm playing against. There is an Elo calculator here: Elo Win Probability Calculator (wismuth.com)
There is also this: The Elo rating system – correcting the expectancy tables | ChessBase 
It gives the lower-rated player 37% odds of victory at a 90 point difference. So based on those probabilities I am both 90 rating points less than 1450 and 90 points more than 1050. There is no rating that matches anything close to those probabilities, and my rating of 1015 at the time of compiling them is not even close to making sense with a single one of them. I will also add that I have played at least 50 games in each of the listed categories, so it is not a sample size issue. (I did not include my win % for any 100-point ranges >1500 or <1000 due to low sample size in those ranges. If all such ranges are grouped into >1500 and <1000, my win % would be 57% and 82%, respectively, but low sample sizes make me very hesitant to draw any conclusions from those percentages.)
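A rough sketch of the "level off" reasoning above, assuming the standard Elo logistic curve, treating win % as the expected score (draws ignored), and using the midpoint of each 100-point band as the opponent rating. The function name `implied_rating` is made up for illustration, and the exact figures depend on which expectancy table is used, so they need not match the ones quoted in the post.

```python
import math

def implied_rating(opponent_rating, score):
    """Rating at which the Elo logistic curve predicts `score` against `opponent_rating`.

    Inverts E = 1 / (1 + 10 ** ((opponent - rating) / 400)).
    """
    return opponent_rating - 400.0 * math.log10(1.0 / score - 1.0)

# Win rates from the post, keyed by band midpoint (the midpoint is an assumption).
bands = {1050: 0.71, 1150: 0.61, 1250: 0.53, 1350: 0.41, 1450: 0.38}

# Rating each band's results would settle at if only that band were played.
implied = {mid: implied_rating(mid, p) for mid, p in bands.items()}
for mid, rating in implied.items():
    print(f"vs ~{mid}: results consistent with a rating near {rating:.0f}")

# If one rating explained every band, this spread would be close to zero.
spread = max(implied.values()) - min(implied.values())
print(f"spread across bands: {spread:.0f} points")
```

The settling point rises with the opponent band, which is the pattern the post describes; how large the spread is depends on the assumptions listed above.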

David

All that means is that you've had a bad run of results lately and your rating has dipped below where it has historically been. You can see this in your rating graph. If you play more games, your rating should return to its usual level

magipi

Not only did "theNobody14161" have a massive losing streak lately, but also it was deliberate. Resign after 1 move, resign after 2 moves, game after game after game.

All this seems to be some bizarrely complicated trolling attempt.

theNobody14161
magipi wrote:

Not only did "theNobody14161" have a massive losing streak lately, but also it was deliberate. Resign after 1 move, resign after 2 moves, game after game after game.

All this seems to be some bizarrely complicated trolling attempt.

It is true that the last week or so has a lot of resignations from me after starting games, due to life interruptions (I recently had a child and am still adapting to frequent interruptions). However, nearly all of those resignations are against players outside the range of the probabilities above, which start at 1000: I set a different/lower opponent rating range when I fear I may be interrupted unpredictably and may be more likely to resign. You can call this trolling if you want; I call it "if I'm distracted it can be nice to have an easier opponent, and I'd rather give rating to players with a lower rating if I need to go". I can run the numbers specifically on those games that were insta-resigned, but I'm pretty certain almost all of them were against opponents <1000 in rating and should not affect the probabilities above. That is a factor for my current rating, but it has little to nothing to do with the probabilities in the information I provided (except the probability of winning against <1000, which for me is 82%, not exactly trolling levels of low), and it does not solve the issue of a rating's settling point being too dependent on opponent ratings, because the win percentages aren't anywhere remotely close to what they should be for *any* rating to be accurate for my play. What rating should correspond to those probabilities? 1260, for about the 50% win point? But then a 1260 should only be beating 1450 players about 1/4 of the time; my % is 1.5x that. My loss rate to 1050s would also be about 1.4x what it should be for a 1260, unless A LOT of 1050 players don't have accurate ratings, or the rating system itself means hardly anything in terms of win probabilities. Regardless, there should not be an "optimal opponent range" for improving ratings without improving play.
Obviously opponent rating matters a bit: it is hard to tell Fabiano and Carlsen apart if they are only playing 300 rated players. But a 400 pt opponent rating range should not move a rating's settling point by 300 points. At that point, the only way to settle on a reasonable rating (within 40 pts or so of true) is to play within 10 pts of one's actual rating, which defeats the purpose of ratings as an evaluation metric.

theNobody14161
David wrote:

All that means is that you've had a bad run of results lately and your rating has dipped below where it has historically been. You can see this in your rating graph. If you play more games, your rating should return to its usual level

We will know in about two weeks, provided I have time to play

magipi

Okay, so let me clarify what I meant by "trolling".

You complained that your results are not compatible with your rating of 1000. And they are not: the percentages shown are normal if you are 1200-1300 rated. But you were 1200-1300 rated almost all the time, except the last few days. How and why could you be silent about this? When we add this fact to the mix, suddenly there's no problem with anything at all, everything is completely normal. No rating is "inaccurate", in fact everything is almost textbook-perfect.

theNobody14161
magipi wrote:

Okay, so let me clarify what I meant by "trolling".

You complained that your results are not compatible with your rating of 1000. And they are not: the percentages shown are normal if you are 1200-1300 rated. But you were 1200-1300 rated almost all the time, except the last few days. How and why could you be silent about this? When we add this fact to the mix, suddenly there's no problem with anything at all, everything is completely normal. No rating is "inaccurate", in fact everything is almost textbook-perfect.

I will clarify my point in commenting:
This is a thread about rating inaccuracies. My complaint is about my game results not being compatible with ANY rating. My individual rating is not representative for a variety of reasons (listed in my comment before this one, and my later comment), and it really isn't the issue I am trying to get at here. (I did give my rating at the time of compiling; maybe a more appropriate statistic would have been my average rating over the entire period, which in my laziness I did not calculate.) I even said directly in my comment that the best possible fit for these probabilities is a rating of about 1260. The problem is that even that is not a good fit; in fact it is a poor enough fit that my rating climbs considerably when I play higher rated players, because I do not lose enough against them.
Textbook perfect? Here is what those probabilities SHOULD look like at about the best possible fit, a rating of 1260:
1000-1100: 77%; mine is 71%, which means if I keep playing opponents in this range my rating would converge lower than 1260.
1100-1200: 65%; mine is 61%, same conclusion as above.
1200-1300: 51% (hand-picked to be a close match to my win % in this range).
1300-1400: 38%; mine is 41%, which means if I keep playing players in this range, my rating will converge HIGHER than 1260, because I am winning more than I should against them.
1400-1500: 25%; mine is 38%, same conclusion as above, albeit more extreme: 50% more wins than would maintain a 1260 rating.
I obviously don't have the time to analyze this for every chess.com player, but doing so might yield some insight as to why ratings can be really inaccurate, on even a global, multiplayer scale, which is the topic of this thread. If this is commonplace across players, it could even motivate players to abort games against low-rated opponents.
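The 1260 comparison table above can be reproduced approximately with the standard Elo logistic formula. This is only a sketch: it assumes each band is represented by its midpoint and that win % stands in for the expected score (draws ignored), so individual rows may differ by a point from the figures quoted. The names `expected_score` and `FIT` are illustrative.

```python
def expected_score(rating, opponent):
    """Elo logistic expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))

# Observed win rates from the post, keyed by band midpoint (an assumption).
observed = {1050: 0.71, 1150: 0.61, 1250: 0.53, 1350: 0.41, 1450: 0.38}
FIT = 1260  # the best-fit rating proposed in the post

rows = []
for midpoint, win_rate in observed.items():
    predicted = expected_score(FIT, midpoint)
    # Positive residual: more wins than a 1260 rating predicts against this band.
    rows.append((midpoint, predicted, win_rate - predicted))
    print(f"vs ~{midpoint}: predicted {predicted:.0%}, "
          f"observed {win_rate:.0%}, residual {win_rate - predicted:+.0%}")
```

The residuals run from negative against the lower bands to positive against the higher ones, which is the mismatch the post is pointing at.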

David

I think what's being demonstrated here is your rating is not measuring your actual chess ability, but the results of your games against other people in that particular playing pool. The "inaccuracy" is not because of the rating but because of the way you've changed your behaviour by resigning games with little effort: the fact that you're doing this against lower rated players exacerbates the drop in your rating.

Some people do this sort of thing to qualify for a tournament with a rating ceiling below their normal rating level to maximise their chances of winning that tournament because they will then play to the best of their ability, which is higher than people who have a lower rating and have been trying their best all along. That's called "sandbagging" and is actually a violation of the fair play rules here on chess.com. It's also a fairly obvious violation, so it's relatively easy for the Fair Play team here to ban such people, but reporting people who do this does help: because it's easier for a human to identify, there's less chance of a false report muddying the waters.

theNobody14161
David wrote:

I think what's being demonstrated here is your rating is not measuring your actual chess ability, but the results of your games against other people in that particular playing pool. The "inaccuracy" is not because of the rating but because of the way you've changed your behaviour by resigning games with little effort: the fact that you're doing this against lower rated players exacerbates the drop in your rating.

Some people do this sort of thing to qualify for a tournament with a rating ceiling below their normal rating level to maximise their chances of winning that tournament because they will then play to the best of their ability, which is higher than people who have a lower rating and have been trying their best all along. That's called "sandbagging" and is actually a violation of the fair play rules here on chess.com. It's also a fairly obvious violation, so it's relatively easy for the Fair Play team here to ban such people, but reporting people who do this does help: because it's easier for a human to identify, there's less chance of a false report muddying the waters.

I don't play in tournaments, I have never really looked at it from that standpoint, can't say I have any interest in winning tournaments, especially by that means.
I am saying no rating is a good fit for my win% table, regardless of what rating my profile has. By shifting the rating range of my opponents by only 400 points, I can make my rating converge to values 300 points apart. This should not be the case if ratings are ever to be accurate. Then again, gotta laugh at the 3rd lowest (rank 6 of 8) rated player winning the FIDE candidates tournament, so maybe the joke is on me for thinking ratings have any sense of accuracy within a 600 point range in the first place.

chesswhizz9

Really? I'm of the opinion Lichess.org ratings are inflated, I find my rating is where it should be.

Chapanejo07

Yes, but in the end the rating does not define your level with such precision, since it is affected by very personal aspects.

theNobody14161
theNobody14161 wrote:
David wrote:

All that means is that you've had a bad run of results lately and your rating has dipped below where it has historically been. You can see this in your rating graph. If you play more games, your rating should return to its usual level

We will know in about two weeks, provided I have time to play

So, finally got time to play, and yeah, raising the ratings of my opponents seems to help my rating climb considerably. While it is perhaps more evidence of a rating settling point issue inherent to rating calculations on chess.com and pertinent to this topic, I also wonder if there has ever been any attempt to standardize ratings. Is a 1200 FIDE today the same as a 1200 FIDE 20 years ago if there is nothing to stop rating deflation? Others seem to comment that 1200 on chess.com isn't quite the same as 1200 on lichess, just because the player pools are different.

wiredtearow
theNobody14161 wrote:
theNobody14161 wrote:
David wrote:

All that means is that you've had a bad run of results lately and your rating has dipped below where it has historically been. You can see this in your rating graph. If you play more games, your rating should return to its usual level

We will know in about two weeks, provided I have time to play

So, finally got time to play, and yeah, raising the ratings of my opponents seems to help my rating climb considerably. While it is perhaps more evidence of a rating settling point issue inherent to rating calculations on chess.com and pertinent to this topic, I also wonder if there has ever been any attempt to standardize ratings. Is a 1200 FIDE today the same as a 1200 FIDE 20 years ago if there is nothing to stop rating deflation? Others seem to comment that 1200 on chess.com isn't quite the same as 1200 on lichess, just because the player pools are different.

I'd say that it's slightly inaccurate until you reach a rating where your odds of winning are 50-50. If you're at a level where the odds are imbalanced, you're either too strong or too weak for that rating. The beauty of it is that over time, the more games you play, the more likely it is that you'll reach a more accurate rating. It's a self-correcting system that only needs to focus on keeping the games fair, and that's exactly what they're trying to do.
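A minimal sketch of that self-correcting idea, under strong assumptions: a plain Elo update (not necessarily what chess.com runs), a fixed hypothetical K-factor of 20, opponents always paired at the player's current rating, and results replaced by their long-run average as given by a fixed "true strength". It only shows what the model predicts when those assumptions hold; it says nothing about whether real results follow the model.

```python
def expected_score(rating, opponent):
    """Elo logistic expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))

def settle(start, true_strength, k=20.0, games=400):
    """Apply the average Elo update game after game and return the final rating."""
    rating = start
    for _ in range(games):
        opponent = rating  # assumption: pairing always finds an equally rated opponent
        actual = expected_score(true_strength, opponent)  # long-run average result
        predicted = expected_score(rating, opponent)      # 0.5 by construction
        rating += k * (actual - predicted)
    return rating

print(round(settle(1000, 1260)))  # starting too low
print(round(settle(1500, 1260)))  # starting too high
```

Under these assumptions the rating drifts toward the point where the expected score is 50%, from either side.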

When it comes to standardizing ratings, FIDE doesn't necessarily need to align/standardize the rating system for itself, chess.com, and lichess. It has no real responsibility or need to do that. It could stay the way it is where people's ratings in FIDE, chess.com, and lichess are segregated. It's more straightforward that way to me. These are completely different chess environments offering very different experiences so it makes sense that the ratings from these stay individual from each other.

basketstorm
theNobody14161 wrote:
theNobody14161 wrote:
David wrote:

All that means is that you've had a bad run of results lately and your rating has dipped below where it has historically been. You can see this in your rating graph. If you play more games, your rating should return to its usual level

We will know in about two weeks, provided I have time to play

So, finally got time to play, and yeah, raising the ratings of my opponents seems to help my rating climb considerably. While it is perhaps more evidence of a rating settling point issue inherent to rating calculations on chess.com and pertinent to this topic, I also wonder if there has ever been any attempt to standardize ratings. Is a 1200 FIDE today the same as a 1200 FIDE 20 years ago if there is nothing to stop rating deflation? Others seem to comment that 1200 on chess.com isn't quite the same as 1200 on lichess, just because the player pools are different.

FIDE ratings at the top are inflated a little, by something like 50 points. At the bottom they were deflated a lot, BUT they've fixed that this year. So in general FIDE is roughly the same as it was 20 years ago.

The problem with chess.com ratings is strong pool isolation and an incorrect initial rating (you don't start as unrated; you pick your initial rating). And chess.com ratings are not "self-correcting" as many fantasize. You will never reach any "accurate" level, because there are too many inaccuracies here.

HangingPiecesChomper

chess.com ratings are rigged