Talk:Elo rating system/Archive 2

This is an archive of past discussions about Elo rating system. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

College ranking by Elo points

Basically, we assign wins and losses based on the matriculation decisions of admitted students, and we then rank the colleges based on their Elo points. The 2009 ranking list is here: http://college.mychances.net/college/college-rankings.php . My obvious COI prevents me from adding this to the article, but I thought I'd put it here in the discussion section in case there is interest. (This approach to college rankings has been undertaken once before in an academic paper by Avery, Glickman, Hoxby, and Metrick; theirs was a one-time study with 1999 admissions data.) 155.52.208.80 (talk) 23:29, 18 October 2009 (UTC)

I couldn't access the website. The ELO system is designed for when there is a head-to-head competition between two people or whatever. How does this fit in? Do you go by the number of students that choose college A over B or something? Bubba73 (talk), 23:35, 18 October 2009 (UTC)

Hmm, sorry about the link; I haven't set up a cache for the page yet, so the build time can be slow (2+ seconds). I'll post a partial list w/ Elo points below. The tournament structure is this: each college plays a head-to-head competition with every other college that shares a co-admitted student. In such a tournament, the colleges are the players, and each co-admitted student becomes a pairwise "match". The decision of the co-admitted student creates a win for one school and a loss for the other. Iterate through all pairs of colleges that are 'joined' by a co-admitted student, and this is how our rankings are determined. By using this framework, the system is directly analogous to the chess tournament system in which Elo points are classically used.

One interesting aspect of this approach is that, given knowledge of the Elo points of two different schools, one can perform a simple transformation to extract the relative matriculation likelihood. In other words, you can see the relative yield for students admitted to both schools.

In case there are still pageload issues, here are the top 10 schools using 2009 data (i.e., the data from students who just started attending college this fall):

Rank	College	Elo points
1	Harvard	2009
2	Yale	1852
3	Stanford	1834
4	Princeton	1788
5	Dartmouth	1767
6	U Penn	1757
7	Notre Dame	1750
8	Columbia	1742
9	Georgetown	1715
10	UC Berkeley	1711

132.183.98.85 (talk) 21:37, 19 October 2009 (UTC)
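The transformation mentioned above, from two schools' Elo points to a relative matriculation likelihood, can be sketched as follows. This is only an illustration: it assumes the rankings use the standard logistic Elo expectation with a 400-point scale (the thread does not state the exact formula), and `expected_score` is a hypothetical helper name. The ratings are the 2009 figures from the list above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the logistic Elo model (assumed here)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# Ratings taken from the 2009 top-10 list in this thread.
harvard, yale = 2009, 1852

# Estimated share of co-admitted students who pick Harvard over Yale.
p_harvard = expected_score(harvard, yale)
print(f"Harvard over Yale: {p_harvard:.2f}")
```

Under these assumptions a 157-point gap corresponds to roughly a 70/30 split among co-admitted students.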

Interesting, and that seems to be a valid use of the rating system. Bubba73 (talk), 00:12, 20 October 2009 (UTC)

Hmm, intuitively you would expect Harvard and Yale to be grandmasters, not a low expert and low A player, respectively. But of course, largely for reasons of economics, fancy schools often "lose" to lower-rated but much less expensive state schools. I expect that is why the rating field is as "flat" as it is. Krakatoa (talk) 16:14, 22 October 2009 (UTC)

Very interesting observation. A few more details: 1500 is our initialization score, and K is fixed at 24 (no introductory period currently and no variable K). It is possible that not enough tournaments have been played for the schools to depart as significantly from 1500 as they merit. If the upper echelon is actually more preferred than we're making it look, it should come out when we have more data (next year's results will come back sometime between May-June 2010). 132.183.98.85 (talk) 19:03, 22 October 2009 (UTC)
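A minimal sketch of a single "match" under the parameters given above (initial score 1500, fixed K = 24). The function and variable names are hypothetical, and the logistic expectation formula is an assumption, since the comment only states the initialization score and K.

```python
INITIAL_RATING = 1500.0  # initialization score stated in the comment above
K = 24.0                 # fixed K-factor stated in the comment above

def expected_score(r_a: float, r_b: float) -> float:
    """Logistic Elo expectation (assumed; not confirmed by the thread)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def record_choice(ratings: dict, chosen: str, declined: str) -> None:
    """One co-admitted student picks `chosen` over `declined`."""
    r_c = ratings.setdefault(chosen, INITIAL_RATING)
    r_d = ratings.setdefault(declined, INITIAL_RATING)
    delta = K * (1.0 - expected_score(r_c, r_d))
    ratings[chosen] = r_c + delta    # win: actual score 1
    ratings[declined] = r_d - delta  # loss: actual score 0, same K, so zero-sum

ratings = {}
record_choice(ratings, "College A", "College B")
print(ratings)
```

With K this small, each decision moves a school by at most 24 points, which is consistent with the remark that many matches are needed before ratings move far from 1500.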

I think the figures are about right. You would only expect a grandmaster rating for about 1 in 1,000,000+, and because there aren't that many schools, the result is similar to what you would expect. SunCreator (talk) 19:45, 22 October 2009 (UTC)

Yes, I was going to point out some factors, one was the average rating and K factor chosen, as you said. Other factors: the larger the number of players the greater chance that some will be very far above average; and students make choices based on criteria other than the quality of the school (money, distance, the city, friends and family, etc). Bubba73 (the argument clinic), 19:46, 22 October 2009 (UTC)

Color

The article should at least mention that most rating systems ignore color, even though it is well known that having the first move is a measurable advantage. IIRC, in Elo's book for his original system he calculates that playing white yields an advantage of almost exactly 50 rating points. The Chessmetrics system does deal with the problem. --71.174.162.36 (talk) 09:26, 22 October 2009 (UTC)

It affects individual games but it evens out over the long run. Bubba73 (the argument clinic), 14:11, 22 October 2009 (UTC)

This is not an article about chess ratings so I don't see the colour is relevant. SunCreator (talk) 15:30, 22 October 2009 (UTC)

Contradictory Example

In the subsection about the USCF ratings system, the following two lines are written:

Class B and higher is generally considered extremely competitive and the USCF establishes a rating floor of each player's peak rating minus 200 rating points. For instance, once someone has reached a rating of 1400, they can never fall below 1200 for rating and competition purposes though there isn't a rating floor for those 1200 and below.

However, class B does not start until a rating of 1600 points. Either this statement is supposed to read "Class C and higher" or the points used in the example are off by 200 and it should read "reached a rating of 1600, they can never fall below 1400." I do not know what the case is, because I do not know too much about the USCF ratings system, but this should be fixed at some point. Schmittz (talk) 18:55, 5 July 2010 (UTC)

Never mind, I just found a page on the personal site of Mark Glickman (current chair of the USCF's ratings committee) explaining the USCF's procedure for flooring players' scores. I will edit the relevant section accordingly now. Schmittz (talk) 19:01, 5 July 2010 (UTC)

League of Legends

I think League of Legends could also get a mention in the last part, as it is another major online game which uses the ELO rating system. I'll do this myself when time allows, though. Icerazor (talk) 17:51, 15 July 2010 (UTC)

Confusion About The Confusion

Is there really any likelihood that the "ELO rating system" will be confused with the acronym for the 70's band "Electric Light Orchestra"? --BadSanta

It seems to me that the disambiguation at ELO should be enough, since to even get to this page you have to say something about rating systems. Does the Electric Light Orchestra page need a link to this one for people who want to know about chess ratings?

ICC

I think that ICC is mentioned in the article more times than it would appear necessary (compare how many times its closer competitors, Yahoo Chess, Playchess and FICS are mentioned). This also gives the impression that ICC is some sort of reference while in fact there are other chess servers with more users (Yahoo and Playchess) or equally respected (FICS, Playchess). A commercial site like this shouldn't have so much prominence in the article without a real justification.

Focus of the article is wrong

This article is about the Elo rating system, not the chess rating system; it's OK to mention its origin, but that's all. —Preceding unsigned comment added by 24.203.166.47 (talk) 16:19, 30 July 2010 (UTC)

Pronunciation

Why isn’t a pronunciation given? How is Elo pronounced? /ɛloʊ/? /iːloʊ/? /eɪloʊ/? —Frungi (talk) 03:08, 22 July 2010 (UTC)

I don't know. I've always heard Americans say "Eee Low", but we don't know much about pronouncing foreign names. His name is Élő (Hungarian), but I don't know how to pronounce that properly. Bubba73 (You talkin' to me?), 03:13, 22 July 2010 (UTC)

The counterexample of "Performance can't be measured absolutely"

User:75.4.243.4 has with this edit removed the sentence

Performance can't be measured absolutely; it can only be inferred from wins, losses, and draws. Ratings therefore have meaning only relative to other ratings.

with the edit summary

the logic is invalid; an easy counterexample is win percentage, which is an absolute rating that has meaning outside of comparison with other ratings.

Although the deleted sentence was unsourced, I think it was supported by the Oxford Companion (which I don't have on hand at the moment but will check as soon as I do) under the entries of "grading" or "rating" (can't remember, sorry). However, I feel the "easy counterexample" is also flawed, because plain win percentage does not take into account the strength of the opposition. Beating up novice players and scoring 90% against them is a lesser achievement than being invited to Linares and scoring 10% against the elite grandmasters.

I suggest restoring the sentence, but adding a footnote citation on it. It would be appreciated if someone can source it faster than I can. We could also replace "Performance" with "A player's strength". Sjakkalle (Check!) 08:19, 22 July 2010 (UTC)

Performance can't be measured absolutely; it can only be inferred from wins, losses, and draws.

Yes, this is true. It is impossible to know exactly how powerful someone is in chess unless they have solved it. This is not relevant to the claims following it though. Perhaps it could be in a paragraph by itself.

Ratings therefore have meaning only relative to other ratings.

If this is true, then if you see only a single rating in your entire lifetime, you should have no idea what it means. However, I can take the ELO ratings of everyone and give everyone a percentile. Then I can give you the percentile of Carlsen, and you will know he is a good player.

It is arguable that because I told you I am using percentiles, you already know the bound on the ratings, thus automatically granting you information on other ratings. Is this a legitimate argument? 75.4.243.4 (talk) 08:57, 22 July 2010 (UTC)

Thanks, I think we're getting somewhere. :-) Trying to remember the content of the Oxford Companion, I think you are right that content supporting "Ratings therefore have meaning only relative to other ratings" is not in there. I believe that what it means is that you could for example add 5000 points to everybody's rating, without changing anything, but that idea is already covered in the next sentence, that the range and average are arbitrary. The (minor) quibble I have is that the rating range is, theoretically, $[-\infty ,+\infty ]$, as there is no theoretically highest or lowest rating (although most systems have implemented rating floors of some sort). But your point is well taken, if you know that the de facto rating range is 100-2900, then you can infer from Carlsen's rating of 2826 that he is a very strong player without looking at anybody else's rating. I wonder if a paragraph like this addresses the problems:

Performance can't be measured absolutely; it can only be inferred from results against other players. Both the average and the spread of ratings can be arbitrarily chosen. Elo suggested scaling ratings so that a difference of 200 rating points in chess would mean that the stronger player has an expected score (which basically is an expected average score) of approximately 0.75, and the USCF initially aimed for an average club player to have a rating of 1500.

The main difference to the current text is removing the offending sentence ("only meaning relative to other ratings"). It also replaces "wins, losses, and draws" with "results against other players", since who you meet is almost as important as the result you score against them. Sjakkalle (Check!) 09:44, 22 July 2010 (UTC)

I think your changes are good. There is a sudden subject shift from the first to the second sentence, but readers can cope.

If you want to have a strictly correct sentence, you can say "A player's strength can't be measured absolutely in chess; it can only be inferred from results against other players". In games such as 100 meter dash, people's performances are largely independent of opponents. 75.4.232.20 (talk) 19:27, 24 July 2010 (UTC)

The Oxford Companion does say that ratings are relative and not absolute. Bubba73 (You talkin' to me?), 03:21, 26 July 2010 (UTC)

The Oxford Companion agrees that players can only be measured against each other. I have added a couple of sentences which follow the Oxford entry on "rating" very closely: "A player's rating depends on the ratings of his or her opponents, and the results scored against them. The relative difference in rating between two players determines an estimate for the expected score between them." The sentence removed earlier went one step further: "Ratings therefore have meaning only relative to other ratings." The problem is that a rating on its own does have some meaning, for instance the rating criterion for being awarded an FM title is placed "absolutely" at 2300. Sjakkalle (Check!) 08:02, 27 July 2010 (UTC)

How many games before an Elo rating

In chess, how many games are required? Four? Regards, SunCreator (talk) 13:35, 6 August 2010 (UTC)

In the USCF you have to have four games to get a rating. Then it is provisional until you have 25 rated games. I don't know about FIDE. Bubba73 (You talkin' to me?), 14:36, 6 August 2010 (UTC)

There is a blog here which describes the rather complex FIDE rules. Apparently nine games against FIDE rated opponents are needed. For a tournament result to count for a new rating, at least three games against FIDE rated opponents are required, and at least half a point must be scored against them (one point is required in the first tournament to "get started"). I know that in Norway you need at least 10 games to obtain an Elo rating, and at least 30 before the rating is established and calculated according to the regular Elo formula. Sjakkalle (Check!) 17:31, 6 August 2010 (UTC)

Why isn't Bradley-Terry mentioned?

Isn't the Bradley–Terry model the same as the model that Elo adopted? Why is this not mentioned anywhere?

Paulipu (talk) 04:36, 11 February 2011 (UTC)

I've never heard of it. Do you have a reference? Bubba73 (You talkin' to me?) 05:12, 11 February 2011 (UTC)

There is a Wikipedia article on pairwise comparisons that talks about it in the section about probabilistic models. A reference to the paper by Bradley and Terry is at the end of the article. It's a famous model, and it seems to predate Elo's algorithm (1952). Paulipu (talk) 05:39, 12 February 2011 (UTC)

I haven't looked at it closely enough to tell if it is the same method. Bradley & Terry aren't in Elo's book (at least not in the index). Bubba73 (You talkin' to me?) 01:39, 14 February 2011 (UTC)

The Social Network

Perhaps mention The Social Network and Facemash? Also perhaps some discussion on the equation as written in the movie being a bit incorrect. Aepryus (talk) 17:06, 22 May 2011 (UTC)

The math behind the 400 rule

It seems like it makes winning against much stronger players (>400 of difference) exponentially less and less significant. Wouldn't that discourage players from competing against much stronger players? In the sense that a reasonable player would never compete against a player with more than 400 of difference, if he has a choice of competing against another player with a smaller difference. I'm just wondering if there is a logic behind that. —Preceding unsigned comment added by 41.239.69.171 (talk) 21:23, 4 November 2010 (UTC)

I read that the idea behind it is if the ratings are close to accurate, beating someone more than 400 higher is a fluke, and shouldn't be weighted too heavily. Also the lower-rated player has practically nothing to lose and a lot to gain, rating wise. And you learn by playing a higher-rated player. Bubba73 (You talkin' to me?) 22:34, 4 November 2010 (UTC)

To call it "just a fluke" is to hugely misunderstand the statistical ideas of Elo and of chess. Carlsen should beat anyone that's 1200 or lower 9,999/10,000 times, provided he was taking each game seriously - I am not coming up with this 10,000 number as a random guess, every 400 points in difference leads to a factor of 10 in expectancy of scoring against someone.

1/10,000 times he would lose due to an absurd blunder on his part (known to happen to the strongest), and the other player playing the game of his life. Alternatively, he could draw twice in 10,000 games, which might in fact be more likely. This should be reflected in the rating system, i.e. the weaker player gets a large number of points if he wins, and Carlsen gets about 1/10,000 of that for every win he gets. And remember that 10,000 games is one HELL of a lot of games; people don't play that many FIDE games in their entire career, and even 100 games is more than you might think, so yes, it could happen. Anonywiki (talk) 21:20, 18 November 2011 (UTC)
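The "factor of 10 per 400 points" can be checked numerically. The sketch below assumes the logistic expected-score formula used elsewhere on this page; the function names are hypothetical. Note that the expected score lumps wins and draws together, so it says nothing by itself about how often an outright loss happens.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score of a player rated `rating_diff` points above the opponent
    (negative when the player is the lower-rated one), logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))

def odds_against_underdog(gap: float) -> float:
    """Ratio of the favourite's expected score to the underdog's."""
    e_underdog = expected_score(-gap)
    return (1.0 - e_underdog) / e_underdog

for gap in (400, 800, 1200, 1600):
    print(gap, round(odds_against_underdog(gap)), expected_score(-gap))
```

Under this formula a 1600-point gap gives odds of about 10,000 to 1, i.e. an expected score of roughly 1/10,001 for the weaker player.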

ELO inflation

(Sorry for my poor English.)

Hello,

I had heard ELO inflation described as an inaccuracy of the system, but after reading about it, it is not what I expected. The first sites that appear in Google (http://members.shaw.ca/redwards1/ and http://www.chessbase.com/newsdetail.asp?newsid=5608) talk about the "inflation" of only the best players' ELOs! I understood inflation to mean growth of the mean, not growth of the top ELOs. The growth of the top ELOs is, in my view, a logical consequence of two things: 1. There are more players now than before, so it is more likely that talented people who would not have played chess under the conditions of years ago are playing now. 2. The actual "real" level of today's best players is higher than that of the best players of the past, because of progress in how players prepare for competition.

In any case, there is a simple method to determine whether this inflation exists as an inaccuracy of the system, or whether the level of today's best players really is higher than before. By construction, the ELO system says only that a gap of 200 ELO points means the higher-rated player takes 75% of the points against the lower-rated player (see this same page). So if the inflation is an inaccuracy of the system and the range of ratings has been wrongly stretched, a 200-point gap now means a smaller gap in "level" than before. To refute or prove this, you just need to compare older and newer results between players 200 ELO points apart: 1. If current results between players 200 ELO points apart are closer than in the past, then the inflation exists. 2. If current results between players 200 ELO points apart are the same as in the past, then the inflation is a myth.

Regards,

Luis from Paraná, Argentina. — Preceding unsigned comment added by Luis Babboni (talk • contribs) 12:23, 28 July 2011 (UTC)
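The comparison proposed above can be sketched as follows. The game records are made up purely for illustration (they are not real results), and the function name, the tuple layout, and the ±25-point tolerance are all assumptions.

```python
from statistics import mean

def favourite_score_at_gap(games, gap=200, tolerance=25):
    """Mean score of the higher-rated player over games whose rating gap is
    within `tolerance` of `gap`. Each game is a tuple
    (higher_rating, lower_rating, score_of_higher) with score 0, 0.5 or 1.
    Returns None when no game qualifies."""
    scores = [s for hi, lo, s in games if abs((hi - lo) - gap) <= tolerance]
    return mean(scores) if scores else None

# Made-up illustrative data, NOT real results.
old_era = [(2400, 2200, 1), (2350, 2160, 0.5), (2500, 2300, 1), (2300, 2100, 0.5)]
new_era = [(2700, 2500, 1), (2650, 2445, 0.5), (2600, 2400, 0.5), (2550, 2350, 0.5)]

print("old era:", favourite_score_at_gap(old_era))
print("new era:", favourite_score_at_gap(new_era))
```

If the favourite's mean score at a fixed gap were clearly lower in the newer sample, that would point toward a stretched scale in the sense described above; a real test would need far larger samples than this toy data.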

Deflation explained

The article talks a lot about deflation, but not about the causes. The cause(s) of deflation are fundamental to the discussion and should be included, even if to say nothing else than the causes are not known (if that is the case). - 98.225.38.209 (talk) 00:50, 12 September 2011 (UTC)ATBS

League of Legends Elo usage.

The online competitive game League of Legends also uses the Elo system for their ranked games, although modified to fit their needs: http://na.leagueoflegends.com/learn/gameplay/matchmaking I would just add this but I've never done formatting/editing on wikipedia before. I'll have to look up the wiki page for that soon... 72.226.12.214 (talk) 04:56, 19 December 2011 (UTC)Mnenomenon

Starcraft 2's Use of Elo

From this [1] I can't tell if SC2 uses Elo, but it very clearly doesn't matter whether Elo or TrueSkill is used to calculate ratings, because there are many other things going on. It seems to me that SC2 uses a modified TrueSkill system that only accounts for limited game histories, but then has other mechanics going on as well, such as a second "hidden" rating (with a much larger K value) and bonus pool points (described in the blog) that help to obfuscate the actual workings. Given all of this, it may be preferable to simply not list Starcraft 2 on this page, because until Blizzard makes transparent how their system works, claiming that it is based on Elo rather than TrueSkill is clearly speculation. 128.101.35.157 (talk) 18:48, 12 December 2011 (UTC)

EVERY MATCHMAKING SYSTEM IN GAMING USES ELO! — Preceding unsigned comment added by 195.178.239.244 (talk) 14:04, 8 May 2012 (UTC)

--> No, you moron. — Preceding unsigned comment added by 187.57.183.170 (talk) 15:13, 19 June 2012 (UTC)

Lead-in

The lead-in of this article is not particularly useful. I came looking for a brief overview of what the Elo system is and how it works, instead I get some biographical information, talk about how different organisations use different types of calculations and something about the maths behind it - no actual definition. Find it problematic, but can't correct it myself since I have no knowledge of how Elo works - which is why I came here in the first place! --109.148.43.57 (talk) 21:11, 28 December 2011 (UTC)

I'll try to add a paragraph to the lead giving the general idea. The details are in the body of the article. Bubba73 (You talkin' to me?) 21:46, 28 December 2011 (UTC)

Done Bubba73 (You talkin' to me?) 00:13, 29 December 2011 (UTC)

Pronunciation...

Am I the only one who's pronounced it "E.L.O. rating" for a long time? I always thought it was an initialism till reading this article. Searching google, I see about every 3rd to 5th result has it in all caps as if it's an initialism. I swear I've heard people in chess pronouncing it that way too. Weird. Dancindazed (talk) 05:38, 27 April 2012 (UTC)

It is a person's name, but for many years it was always in all caps (ELO), which makes it look like an acronym, but it isn't. Bubba73 (You talkin' to me?) 05:46, 27 April 2012 (UTC)

I know it's someone's name now. That's why I said "till reading this article." I'm just wondering how common of a mistake this is. (Is it common enough to make a point of contradicting it in the article?) Also, why do organizations write it in all caps since it's not an initialism? (Note that an acronym is an initialism which is also pronounced as a word. I think those who think it stands for something are probably saying E-L-O as three separate letters, not treating it like an acronym.) Dancindazed (talk) 22:58, 1 May 2012 (UTC)

I don't know how common it is or know of any references for that. I think Elo himself wanted the system to be ELO (caps). It was commonly written that way until relatively recently. Bubba73 (You talkin' to me?) 00:20, 2 May 2012 (UTC)

Chess photo

I don't like the photo. I would like it to show a real chess position, with the pieces exactly in the centre of their squares. It is not the chess pieces that make the game, but the combination of pieces and rules. This design, where the pieces are arranged like a 'still life' by some artist, gives the wrong impression of what the game of chess actually is.
For comparison, I adore the 'Go' photo. I don't know how to play Go, but I see the photo and I know what it is about. The chess photo is misleading. — Preceding unsigned comment added by 46.10.229.1 (talk) 16:16, 19 June 2012 (UTC)

What effect does the normal/logistic distribution have?

There is only one formula for the expected score, namely $E_{A}={\frac {1}{1+10^{(R_{B}-R_{A})/400}}}.$ If the normal distribution and the logistic distribution use the same formula, what's the difference between them? I came here for more mathematical details. --Waterture (talk) 09:50, 10 July 2012 (UTC)
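For comparison, the two models can be evaluated side by side. The formula quoted above is the logistic one; a normal-distribution model would instead use the normal cumulative distribution function of the rating difference. The sketch below is only illustrative: the function names are hypothetical, and the standard deviation of 2000/7 for the normal model is an assumption taken from the "Calculating the Rating Difference table" section further down this page.

```python
import math

def logistic_expected(diff: float) -> float:
    """Expected score from the logistic formula quoted above (diff = R_A - R_B)."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

def normal_expected(diff: float, sigma: float = 2000.0 / 7.0) -> float:
    """Expected score from a normal model: Phi(diff / sigma).
    sigma = 2000/7 is an assumption, not a value stated in this section."""
    return 0.5 * (1.0 + math.erf(diff / (sigma * math.sqrt(2.0))))

for diff in (0, 100, 200, 400, 600):
    print(diff, round(logistic_expected(diff), 4), round(normal_expected(diff), 4))
```

Under these assumptions the two curves agree closely for small differences (both give about 0.76 at 200 points) and drift apart in the tails, where the logistic curve gives the lower-rated player somewhat better chances.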

Jeff Sonas

Regarding "The chess statistician Jeff Sonas believes that the original K=10 value (for players rated above 2400) is inaccurate in Elo's work" -- is Sonas' opinion well-supported by independent sources? Shawnc (talk) 09:28, 21 July 2012 (UTC)

I hope someone else gives you a more definitive answer, but I think the answer is yes. Jeff Sonas is considered an expert on chess rating, and I believe that FIDE has consulted him on this subject. I also think that FIDE is currently either experimenting with or perhaps has made a change in the K-value used in chess ratings. I'm sure there are people here more geeked up on the details of the rating system who can provide details. If sources are available, some of this should go in the article. Quale (talk) 19:34, 21 July 2012 (UTC)

References

We should discuss this rather than revert each other. Toccata, in this case I have to agree with BarrelProof. Template:Refimprove at the top of the page is not helpful for this article as it has 46 inline cites. Use inline Template:cn where needed to point out specific issues. Quale (talk) 02:37, 4 June 2013 (UTC)

"Ratings inflation"

I have just noticed that this article uses "ratings inflation" instead of "rating inflation". Google tells me that the former is used more than the latter, and I have seen both spellings used in respectable sources. Are both grammatically correct? Toccata quarta (talk) 22:00, 30 April 2013 (UTC)

It's very late in coming, but I thought your question deserves some response. As far as I know, both expressions are grammatical in English. Quale (talk) 15:45, 25 June 2013 (UTC)

Tweaks to improve readability of math formulae section.

I tweaked some formatting (layout) in the Performance_rating section with mathematical formulae because I had trouble reading the section because of tight spacing between lines. My first attempt yesterday was okay but unorthodox. Today I found an MOS:MATH section--Using Latex Markup--with a better solution, and I made changes consistent with that MOS section. - Great article btw - I'm a Class D player trying to learn more. ;o) Mark D Worthen PsyD 04:21, 30 April 2014 (UTC)

Effect of number of games played on strict zero sum

Because the number of games played by each player affects the gain and loss of each, points are not strictly transferred from one player to another. A new number is created for each player, though it has the mystical quality of being approximately zero sum. 71.198.169.64 (talk) 08:18, 5 August 2014 (UTC)
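A small illustration of the point above, assuming the dependence on games played enters through a per-player K-factor (as in systems that give newer players a larger K). The function name and the K values of 32 and 16 are assumptions chosen for the example.

```python
def update_pair(r_a, r_b, score_a, k_a, k_b):
    """Return the new ratings of A and B after one game.
    score_a is 1, 0.5 or 0 from A's point of view; each player has their own K."""
    e_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))
    new_a = r_a + k_a * (score_a - e_a)
    new_b = r_b + k_b * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b

# Equal K: points are transferred exactly, so the total is unchanged.
print(update_pair(1500, 1500, 1, 32, 32))   # total stays 3000

# Unequal K (e.g. a newer player with K=32 beats an established one with K=16):
print(update_pair(1500, 1500, 1, 32, 16))   # total rises to 3008
```

With unequal K-factors the winner's gain and the loser's loss no longer cancel, so the pool of points drifts over time.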

Condorcet method

I know it's not "voting", but could this be considered a Condorcet method, since if Alice always beats everyone else, she is placed at the top of the ranking list? 71.167.71.87 (talk) 13:55, 9 October 2014 (UTC)

Condorcet methods are a category of voting systems while the Elo rating system is not a voting method or even a contest. A statement like "The Elo rating system is a Condorcet method" would, in my view, qualify as not even wrong since the term "Condorcet method" simply doesn't apply to rating systems. Also I should mention that even if Alice beats everyone else in the system there is no theoretical guarantee that she would top the rating list. If Bob loses only one game, to Alice, but remains a much more active player who plays lots more games and wins against everyone except Alice, his rating will creep up for every win and eventually surpass Alice's rating. Sjakkalle (Check!) 19:57, 11 October 2014 (UTC)
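The Alice and Bob scenario can be illustrated with a toy calculation. Everything here is an assumption made for the sketch: K = 32, a starting rating of 1500, and a simplification in which every other opponent stays fixed at 1500. The names are hypothetical.

```python
K = 32.0        # assumed K-factor
FIELD = 1500.0  # assumed fixed rating of every other opponent (a simplification)

def expected_score(r_a: float, r_b: float) -> float:
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def after_win(rating: float, opponent: float) -> float:
    """Winner's new rating after beating `opponent`."""
    return rating + K * (1.0 - expected_score(rating, opponent))

alice = bob = 1500.0

# Alice beats Bob once; with equal K, Bob loses exactly what Alice gains.
gain = after_win(alice, bob) - alice
alice, bob = alice + gain, bob - gain

for _ in range(9):      # Alice wins her few remaining games
    alice = after_win(alice, FIELD)
for _ in range(200):    # Bob is far more active and wins every other game
    bob = after_win(bob, FIELD)

print(f"Alice: {alice:.0f}, Bob: {bob:.0f}")
```

In this toy setup Bob ends up rated well above Alice even though she beat him and never lost, because each additional win still adds a small positive amount.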

How is voting not a rating system? Are they not both in the category of Pairwise comparison? 71.167.73.5 (talk) 15:53, 19 October 2014 (UTC)

Elections are contests that determine a winner based on the popularity of the candidates. The rating system is not a contest; its purpose is to estimate the strength of players based on their results in tournament games. Sjakkalle (Check!) 07:27, 27 October 2014 (UTC)

ELO or Elo?

For quite a few years, the system was known as the ELO system, even though it was named after Professor A. Elo. Bubba73 (You talkin' to me?) 08:56, 25 November 2014 (UTC)

Calculating the Rating Difference table

The cumulative standard normal distribution, with mean μ = 0 and standard deviation σ = 1, equals 98.500% at x = 2.17. This is the first value in the 99% category. The corresponding rating difference in FIDE table 8.1b [1] is 620. Therefore the standard deviation employed in the table equals 620 / 2.17 = 2000 / 7. The Elo table contains a few irregularities. A more accurate expectation for 620 is 98.4997%, which falls into the 98% range. The table assigns 344 to an expectation of 88%. However, the expectation under the normal distribution is 88.5705%, which is clearly in the 89% range. References to the construction of this table are needed in order to explain the underlying calculation. (Clpippel (talk) 14:00, 13 January 2015 (UTC))
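The figures in the comment above can be reproduced with the standard library. The sketch assumes, as the comment does, a normal model with standard deviation 2000/7; the function names are hypothetical.

```python
import math

SIGMA = 2000.0 / 7.0  # standard deviation inferred in the comment above

def normal_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def table_expectation(rating_diff: float) -> float:
    """Expected score for a given rating difference under the assumed normal model."""
    return normal_cdf(rating_diff / SIGMA)

for diff in (620, 344):
    print(diff, f"{100 * table_expectation(diff):.4f}%")
```

Running this gives values that round to 98.50% for a difference of 620 and 88.57% for 344, matching the numbers quoted above.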

Discrepancy in description of K-factor

The article states that "[t]he maximum possible adjustment per game, called the K-factor, was set at K = 16 for masters and K = 32 for weaker players."

However, the formula for ratings adjustment is:

$R_{A}^{\prime }=R_{A}+K(S_{A}-E_{A}).$

According the formula, the possible adjustment is unbounded. Which is correct? — Preceding unsigned comment added by 67.85.33.223 (talk • contribs)

Everything is bounded. The maximum number of points you can score ($S_{A}$), and be expected to score ($E_{A}$), in a game is 1. The minimum possible score in a single game is 0. Hence, $(S_{A}-E_{A})$ is limited to be between -1 and 1, so the adjustment $K(S_{A}-E_{A})$ is limited to be between $-K$ and $K$. Sjakkalle (Check!) 05:09, 23 April 2015 (UTC)
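A quick numerical check of the bound, using the update formula quoted above with the logistic expected score. The function name and the choice K = 32 are assumptions made for the example.

```python
def rating_change(r_a: float, r_b: float, score_a: float, k: float) -> float:
    """K * (S_A - E_A): the change in A's rating after one game."""
    e_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))
    return k * (score_a - e_a)

K = 32.0
# Extreme upset: a 1000-rated player beats a 3000-rated player.
print(rating_change(1000, 3000, 1, K))   # just under +32
# The mirror image: the 3000-rated player loses to the 1000-rated one.
print(rating_change(3000, 1000, 0, K))   # just above -32
# Expected result: the favourite wins and gains almost nothing.
print(rating_change(3000, 1000, 1, K))
```

Even in the most lopsided upset the change approaches but never exceeds K, which is why K can be described as the maximum adjustment per game.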

Discrepancies in the Mathematical Details section

I see that there are (at least) two fundamentally different implementations of the Elo rating system: one based on normal distribution and one based on the logistic distribution. The math details section does not describe which one it is using (the logistic distribution) clearly nor remain cognizant of the fact that there are two.

For that matter it doesn't state such simple things as "what are the parameters (variance in the case of the normal distribution, scale in the case of the logistic) used for distribution?" — Preceding unsigned comment added by 173.11.44.141 (talk) 20:25, 10 May 2015 (UTC)

Criticism

Didn't this page previously have a section on criticism of the Elo rating system? And if not, shouldn't it have one? I seem to remember reading a whole paragraph discussing the problems with Elo's assumptions and how none of them were accurate, throwing the whole system into question. 71.48.250.231 (talk) 18:25, 2 July 2015 (UTC)

Criticism sections, while not absolutely forbidden, are generally avoided if we can avoid it since they can make it more difficult to maintain a neutral point of view. In general, it is considered better to note the critical opinions that have been put forward in other, more neutrally worded sections. There is a general discussion on the topic at Wikipedia:Criticism. If you look in this article on "practical issues" you will find sentences that deal with problems with the rating system, such as inflation and deflation. Sjakkalle (Check!) 18:42, 3 July 2015 (UTC)

As someone who reads the one-star reviews first to get the real story on a movie, book, product, or place, I can't say I agree with Wikipedia's view on the value of criticism if they would rather hide it in the content of the article. Nor do I agree that a section on criticism cannot be neutrally presented. (So-and-so says this is a load of hogwash, but So-and-so lost a great deal of prestige in the community as a result of these studies, so he may be biased.) But it's not worth the effort to try to convince Wikipedia's editors that mine is the right opinion. 71.48.250.231 (talk) 09:46, 5 July 2015 (UTC)

"Hide it"? That is a misunderstanding of what a neutral point of view entails. The article should neither criticize nor advocate; rather, it should summarize what has been said about the subject. Naturally that includes the spectrum of extant material, whether complimentary or critical. --Ring Cinema (talk) 22:01, 5 July 2015 (UTC)

ELO scrabble ratings

Does the ELO online Scrabble rating hinge on how much one wins or loses by, or just on whether it's a win or a loss? 68.200.81.176 (talk) 15:58, 16 July 2015 (UTC)

External links modified

Hello fellow Wikipedians,

I have just added archive links to one external link on Elo rating system. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

Added archive https://web.archive.org/20131205025157/http://www.sjakk.no:80/NSF/elosystem_index.html to http://www.sjakk.no/nsf/elosystem_index.html

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).

If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers. cyberbot II (Talk to my owner: Online) 22:39, 28 August 2015 (UTC)

Dubious

Pacerier (talk) 01:32, 30 June 2015 (UTC): ❝

The page states "Three votes are cast for each photograph and an Elo score is determined for all photographs". Does this even make sense? There are only so many variations you can have with three votes and two options.

Let's take an example:

Photo 1: Keep, Throw, Throw

Photo 2: Keep, Keep, Throw

Photo 3: Keep, Throw, Keep,

Photo 4: Keep, Keep, Keep

Photo 5: Throw, Throw, Throw

Photo 6: Keep, Throw, Throw

Photo 7: Keep, Keep, Throw

Photo 8: Keep, Throw, Keep,

Photo 9: Keep, Keep, Keep

What would the "Elo score" for the photographs be?

❞

What is intended is the following: any two photos A and B are compared three times. For example, photograph A gets Keep, Throw, Throw, and photograph B receives Throw, Keep, Keep. Assuming Keep = 1 and Throw = 0, photo B beats photo A by 2 to 1. After a "sufficient" number of comparisons, the total score and the Elo score can be determined in a meaningful way. Clpippel (talk) 13:24, 1 January 2016 (UTC)
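As a rough illustration of the scheme described above, the three votes on a pair of photos can be treated as a three-game mini-match and fed into the usual Elo update. This is only a sketch: the starting rating of 1500, the K-factor of 32, and the function names are assumptions chosen for the example, not anything taken from the article or its source.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Logistic Elo expectation for A against B, per game."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def update_pair(r_a, r_b, votes_a, votes_b, k=32):
    """Rate one A-vs-B comparison; each pair of votes counts as one game.

    votes_a / votes_b are lists of "Keep" / "Throw" (Keep = 1, Throw = 0).
    k=32 is an assumed K-factor for this sketch.
    """
    score_a = sum(v == "Keep" for v in votes_a)
    score_b = sum(v == "Keep" for v in votes_b)
    games = len(votes_a)
    exp_a = games * expected_score(r_a, r_b)
    exp_b = games * expected_score(r_b, r_a)
    return r_a + k * (score_a - exp_a), r_b + k * (score_b - exp_b)


# Photo A: Keep, Throw, Throw; photo B: Throw, Keep, Keep (B wins 2 to 1).
new_a, new_b = update_pair(1500, 1500,
                           ["Keep", "Throw", "Throw"],
                           ["Throw", "Keep", "Keep"])
print(new_a, new_b)
```

With equal starting ratings, the photo that wins the mini-match gains exactly what the other loses, so the pool average stays put.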

External links modified

Hello fellow Wikipedians,

I have just added archive links to one external link on Elo rating system. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

Added archive https://web.archive.org/20130308183043/http://www.psy.lu.se/o.o.i.s/15545 to http://www.psy.lu.se/o.o.i.s/15545

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

Cheers. cyberbot II (Talk to my owner: Online) 11:18, 2 January 2016 (UTC)

Significance of range of ratings for a game?

Is there anything useful to be said (in this article) about the range of ratings, in particular the highest rating achieved, for a given game? I see that the record chess rating achieved is 2882 by Magnus Carlsen in 2014 (List_of_chess_players_by_peak_FIDE_rating), while I see in the paper on AlphaGo that top-ranked (9 dan professional) Go players are rated around 3500. Does this permit or suggest any conclusions about the difference in the games, or in the level to which they are played? PJTraill (talk) 23:03, 31 January 2016 (UTC)

No, it does not. The KNDB rating of the strongest draughts player is 1579. These absolute numbers are meaningless. What matters is the rating difference at the same point in time and in the same rating pool. From the record chess rating list, we can derive that Carlsen is 31 rating points stronger than Caruana, which gives him, according to the FIDE table, an expected score advantage of 53% / 47%.

Is Carlsen stronger than Fischer? To make an educated guess, we have to consider the rating differences between Fischer and his competitors, and compare them with the rating differences between Carlsen and his competitors. IJmuiden, Clpippel (talk) 18:28, 2 February 2016 (UTC)

non commutative - is that correct?

The article claims Elo ratings are non-commutative. The only source for this is a single line in a slide presentation (and apparently a graduate student talk, so not a WP:RS). I would have thought that this was incorrect, because Elo ratings are updated in a block, month by month. Isn't it true that start-of-the-month ratings (rather than "live" ratings) are used in each month's calculation? Adpete (talk) 23:07, 29 March 2016 (UTC)

FIDE Rating Regulations effective from 1 July 2014, section 8.55(c), says, "(c) ΣΔR x K = the Rating Change for a given tournament, or Rating period." That says to me that the ratings are updated at the end of a ratings period (i.e. monthly), and therefore they are commutative within a given ratings period. Adpete (talk) 23:18, 29 March 2016 (UTC)

(24 hours later) In the absence of any discussion, I will delete the entire section shortly. The reason being: I see no evidence of commutativity being discussed in any WP:Reliable Source, so this article should not discuss it either. Adpete (talk) 23:37, 30 March 2016 (UTC)

Hi Adpete, I'm sorry that the section has to go. Though the phenomenon seems obvious to me, I can't find any reliable source (http://blog.daave.com/2011/06/adventures-with-google-app-engine.html being a blog). Just a comment about the FIDE regulation: even if the rating update is commutative within a period, it would still not be commutative between two different periods. Also, 24 hours is not much time, especially as several Wikimania deadlines fell yesterday. Anyway, remove it if you must. Cheers, cmɢʟee⎆τaʟκ 19:38, 31 March 2016 (UTC)

The "non-commutative" part is correct on online web servers such as Internet Chess Club where ratings are updated immediately after each game. However, for ratings that matter this is generally a non-issue. For the non-commutative part to hold in the FIDE system the games between the players would have to be in different months. Also, since tournaments are not rated before they are completed the games would also have to be in separate tournaments. On the whole I support Adpete's decision since the effect of the non-commutativeness is of very minor consequence in the few cases where it is seen. Sjakkalle (Check!) 16:18, 1 April 2016 (UTC)

To me, the issue is whether it's covered in WP:Reliable Sources. I'm more than happy to have a little bit about it if RSs cover it. If no RSs cover, that means it's a mathematical curiosity which is true (for ratings which are continuously updated), but no one cares about. Adpete (talk) 11:31, 2 April 2016 (UTC)
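The distinction drawn in this thread, between ratings updated after every game and ratings updated once per period, can be sketched numerically. The K-factor of 32, the ratings, and the function names below are assumptions for illustration only, and opponents' ratings are held fixed to keep the example short.

```python
import math

K = 32  # assumed K-factor for this sketch


def expected_score(r_a: float, r_b: float) -> float:
    """Logistic Elo expectation for A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def live_update(rating, games):
    """Update after every game, as on a server that rates immediately."""
    for opponent, score in games:
        rating += K * (score - expected_score(rating, opponent))
    return rating


def period_update(rating, games):
    """Sum all changes against the start-of-period rating, then apply once."""
    return rating + K * sum(score - expected_score(rating, opponent)
                            for opponent, score in games)


# A 1500 player beats a 1600 player and loses to a 1400 player.
games = [(1600, 1.0), (1400, 0.0)]
live_ab, live_ba = live_update(1500, games), live_update(1500, games[::-1])
period_ab, period_ba = period_update(1500, games), period_update(1500, games[::-1])

print(f"live:   {live_ab:.2f} vs {live_ba:.2f}")
print(f"period: {period_ab:.2f} vs {period_ba:.2f}")
```

In the live variant the second game is rated against an already changed rating, so the order of the games matters; in the per-period variant every game is rated against the same starting rating, so it does not.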

External links modified

Hello fellow Wikipedians,

I have just modified one external link on Elo rating system. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

Added {{dead link}} tag to http://www.uschess.org/ratings/RedmanremembersRichard.pdf
Added archive https://web.archive.org/web/20131021132001/http://www.sjakk.no/nsf/elosystem_main.html to http://www.sjakk.no/nsf/elosystem_main.html

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

Cheers.—InternetArchiveBot (Report bug) 05:10, 23 December 2016 (UTC)

Mathematical Issues from a team of psychologists

The following line was added to the Mathematical issues section on January 19, 2013 by an anonymous user:

"A 2012 study led by a team of Swedish psychologists at Lund University has spawned arguments for the inclusion of a fourth concern, albeit theoretical: intangibles."

The same IP had made some edits the previous day some of which were reverted as clear vandalism. The citation given points to an archived web page that talks about research into social cognition with no apparent tie to ELO or rating systems. A search for the title given of "Theoretical intangibles of the ELO system" doesn't turn up anything either. So far as I can tell this is simply vandalism that got missed initially. Unless someone comes up with some sort of evidence otherwise I'll go ahead and remove it. Janzert (talk) 08:22, 30 January 2017 (UTC)

Why the Logistic Function?

The page assumes that the use of the logistic function in Elo's formula is obvious, but it really isn't. Indeed, it's straightforward to show that Elo's assumption that the performance is normally distributed means that ratings should follow a cumulative normal distribution. The logistic function is a decent estimate of that, of course, but there is no mention of Elo's justification. — Preceding unsigned comment added by 96.242.81.223 (talk) 15:00, 19 June 2016 (UTC)

Explain exactly how Elo's original formula assumes a normal distribution. Anonywiki (talk) 15:43, 16 September 2016 (UTC)

Elo's theory is based on pairwise comparisons. A first attempt (subsequently forgotten) to develop a theory of chess ratings was made by Ernst Zermelo, using the New York 1924 chess tournament as an example. See Die Berechnung der Turnier-Ergebnisse als ein Maximumproblem der Wahrscheinlichkeitsrechnung, Mathematische Zeitschrift 29, 1929, S. 436–460.

Elo chose the Gaussian distribution "after extensive investigation". Elo did consider other distributions: Verhulst, Perks, rectangular and linear, binomial, and Maxwell-Boltzmann. As the Elo system is self-correcting, the actual distribution is not that important, as long as the distribution is monotonic and continuous.

Clpippel (talk) 15:07, 8 November 2016 (UTC)

It is incorrect to write that FIDE still uses the Gaussian distribution. FIDE regulations [2] show that they adopted the logistic at least as early as 2014. — Preceding unsigned comment added by Fletch1729 (talk • contribs) 20:10, 24 April 2017 (UTC)
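To give a feel for how close the two curves discussed in this thread are, here is a minimal comparison. It assumes a normal curve with standard deviation 2000/7 (roughly 200√2) and the base-10 logistic with divisor 400; both parameter choices and the function names are assumptions of this sketch rather than a statement of what any federation uses.

```python
import math

SIGMA = 2000 / 7  # assumed standard deviation of the rating difference


def logistic_expectancy(d: float) -> float:
    """Expected score for a rating advantage d under the logistic curve."""
    return 1.0 / (1.0 + 10 ** (-d / 400))


def normal_expectancy(d: float, sigma: float = SIGMA) -> float:
    """Expected score for a rating advantage d under the cumulative normal."""
    return 0.5 * (1.0 + math.erf(d / (sigma * math.sqrt(2))))


# Largest gap between the two curves for advantages of 0 to 800 points.
max_gap = max(abs(logistic_expectancy(d) - normal_expectancy(d))
              for d in range(0, 801))

for d in (0, 100, 200, 400, 600):
    print(d, round(logistic_expectancy(d), 4), round(normal_expectancy(d), 4))
print("largest gap:", round(max_gap, 4))
```

The curves agree at d = 0 and drift apart mainly in the tails, where the logistic gives the weaker player somewhat better chances.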

External links modified

Hello fellow Wikipedians,

I have just modified 3 external links on Elo rating system. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

Corrected formatting/usage for http://www.sjakk.no/nsf/elosystem_main.html
Corrected formatting/usage for http://www.sjakk.no/nsf/elosystem_index.html
Added archive https://web.archive.org/web/20080603001814/http://chess.liverating.org/ to http://chess.liverating.org/

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.—InternetArchiveBot (Report bug) 06:08, 27 July 2017 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified 2 external links on Elo rating system. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.—InternetArchiveBot (Report bug) 18:35, 7 September 2017 (UTC)

About the origin.

Why is there no mention that this ranking system came from the Japanese Dan-I or Dankyūisei from the game Go? — Preceding unsigned comment added by TheBr0s (talk • contribs) 05:25, 17 May 2017 (UTC)

If you have a source for that you can add it yourself. Personally I am highly skeptical of your claim. MathHisSci (talk) 09:13, 9 September 2017 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified 2 external links on Elo rating system. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.—InternetArchiveBot (Report bug) 23:22, 19 September 2017 (UTC)

Elo rating system in Counter strike

The Elo rating system is not and never was used in Counter-Strike: Global Offensive. Its rating system is based on a modified Glicko-2 rating system. Source: official statement from Vitaliy of Valve: "The CS:GO competitive ranking system started with ideas based on Glicko-2 rating model and improved over time to better fit the CS:GO player base."

https://www.reddit.com/r/GlobalOffensive/comments/2g3r4c/the_ultimate_guide_to_csgo_ranking/ckfhfir/

I have removed that part from the article.

62.157.193.130 (talk) 12:47, 7 March 2018 (UTC)

Explaining a bit more the formula with 'Q' quantities

From the article.

$E_{A}={\frac {Q_{A}}{Q_{A}+Q_{B}}}$

It takes a bit of work to get from the first formula in the article to this second one (which is the handier of the two), because at first it is not obvious how to simplify the denominator. To save others the effort, I leave the steps here: multiply the numerator and the denominator by $1/10^{R_{A}/400}$.

$E_{A}={\frac {Q_{A}}{Q_{A}+Q_{B}}}={\frac {10^{R_{A}/400}}{10^{R_{A}/400}+10^{R_{B}/400}}}={\frac {{\frac {1}{10^{R_{A}/400}}}\cdot 10^{R_{A}/400}}{{\frac {1}{10^{R_{A}/400}}}\cdot \left(10^{R_{A}/400}+10^{R_{B}/400}\right)}}={\frac {1}{1+10^{(R_{B}-R_{A})/400}}}$

Pier4r (talk) 10:55, 5 May 2018 (UTC)
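For anyone who prefers a numerical sanity check to the algebra, the two forms can be compared directly. This is a small sketch; the function names and the sample ratings are made up for the example.

```python
import math


def expected_from_q(r_a: float, r_b: float) -> float:
    """E_A written with the Q quantities, Q = 10^(R/400)."""
    q_a = 10 ** (r_a / 400)
    q_b = 10 ** (r_b / 400)
    return q_a / (q_a + q_b)


def expected_from_difference(r_a: float, r_b: float) -> float:
    """E_A written with the rating difference only."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


for r_a, r_b in [(1500, 1500), (1613, 1720), (2800, 2400)]:
    print(r_a, r_b, expected_from_q(r_a, r_b), expected_from_difference(r_a, r_b))
```

Both functions should agree up to floating-point rounding for any pair of ratings, which is what the derivation above shows symbolically.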

Glickman page as EL

User:Quale would you please justify your addition of the EL here per WP:EL? As far as I can see there is no justification in that guideline, for including a link to Glickman's page listing his research papers. Jytdog (talk) 13:48, 17 July 2018 (UTC)

Glickman's publications page includes neutral and accurate material that is relevant to an encyclopedic understanding of the subject. Is there some reason that you think that it violates EL:NO? Quale (talk) 02:06, 18 July 2018 (UTC)

Elo Rating System in Rainbow Six: Siege

The article lists a bunch of games that use the Elo rating system, but it does not mention Rainbow Six: Siege, which is more popular than all of the other games listed, with CS:GO and Overwatch being exceptions. The two following links explain Elo in R6. I gave the information, now we just need someone to add it to the article (I suck at editing articles directly). https://support.ubi.com/en-us/Faqs/000024743/How-Does-Rank-Work-in-R6-Siege https://rainbow6.ubisoft.com/siege/en-us/news/152-277344-16/matchmaking-rating-and-ranks Adil3214 (talk) 16:26, 4 July 2019 (UTC)

History section needs more work

When exactly did FIDE adopt the Elo system? And were there unofficial lists before that? The reason I ask is that this page [2] gives "FIDE Rating List" back to 1967, but I think it is more correct to say it was an Elo list. There are then lists in 1968, 1969, 1970, two in 1971, then one each in 1972-74. I suspect the July 1971 one is the first official FIDE list. Whatever the case is, the article should make this clear. Adpete (talk) 04:30, 28 August 2019 (UTC)

And perhaps even some pre-Elo history, like here: http://www.chesshistory.com/winter/extra/ratingstitles.html Adpete (talk) 05:22, 28 August 2019 (UTC)

Also see Chess rating system. Bruce leverett (talk) 02:22, 10 February 2020 (UTC)

ELO inflation has been stopped

Is there some explanation for this?

Electric Light Orchestra ratings have stabilized because the rate of increase in human playing strength has slowed down in the last 10 years or so. Ken Regan's 2011 study concluded that rating inflation is not a thing; the increase in the number of grandmasters and the number of >2700 players reflects actual improvements in the standard of play. MaxBrowne2 (talk) 00:30, 1 May 2021 (UTC)

Is this a quotation? If so, where is it from?

The Regan/Haworth study more or less punctures the balloon of rating inflation, and so most of the section, "Rating inflation and deflation", must be at least rewritten, if not discarded altogether. Here is a reference for that study: Regan, Kenneth; Haworth, Guy (2011-08-04). "Intrinsic Chess Ratings". Proceedings of the AAAI Conference on Artificial Intelligence. 25 (1). ISSN 2374-3468. Bruce leverett (talk) 15:44, 26 June 2021 (UTC)

Misleading games in opening paragraph

The opening paragraph lists examples of sports that use the Elo system, but many of the games listed don't actually use Elo; they use a modified or derivative system instead. Should this language perhaps be changed to "Elo and derivatives of Elo are also used in..."? Tomelena (talk) 20:09, 9 August 2021 (UTC)

It would make sense to change that wording; thanks. Bruce leverett (talk) 02:37, 10 August 2021 (UTC)

Footnote

@Bruce leverett: Even if the mention of the name not being an acronym was warranted, "should not be written as 'ELO' nor pronounced as /ˌiːɛlˈoʊ/" is completely redundant because, if something is not an initialism, it goes without saying that it shouldn't be written or pronounced like one, especially when the article already uses the word and describes how it's pronounced. And yes, "should" is unencyclopedic instructional language discouraged in MOS:INSTRUCT, as it addresses the reader and presupposes that writing or pronouncing it like an initialism is something they might plausibly do.

But I don't think "The name is not an initialism" is necessary either. No one would think it is after reading "In English, this is usually pronounced as /ˈiːloʊ/ or /ˈɛloʊ/", "The original name Élő is pronounced [ˈeːløː] in Hungarian", or "It is named after its creator Arpad Elo, a Hungarian-American physics professor." As the MoS says, "Do not tell readers that something is ironic, surprising, unexpected, amusing, coincidental, etc. Simply state the sourced facts and allow readers to draw their own conclusions." Nardog (talk) 02:06, 30 September 2021 (UTC)

If "should not" is unsatisfactory, then would "is not" be an acceptable substitute?

I agree that words along the lines of ironic, surprising, etc., would be inappropriate.

It is indeed redundant to say that the word is not an initialism, while also saying that the rating system is named for its creator, Arpad Elo. But it is (I think) OK to say one thing in the article, and a redundant thing in the note.

It is a widespread error, both in chess literature and non-chess literature, to use "ELO" instead of "Elo" to refer to these rating systems. It is natural for someone to look in Wikipedia to check whether or not it's an "initialism" (I would normally call them acronyms, but whatever). The fully literate reader can deduce the right answer from the fact that it's a person's name, but the additional footnote is an attempt to, so to speak, underline the point. Bruce leverett (talk) 02:25, 30 September 2021 (UTC)

It doesn't take a "fully literate" reader to understand that it's pronounced /ˈiːloʊ/ or /ˈɛloʊ/ and therefore it's not pronounced /ˌiːɛlˈoʊ/, or that it's a personal name of Hungarian origin and therefore it's not an acronym. It is not our job to correct errors; we simply state verifiable facts (WP:RGW). That said, if you could find a reliable source that says this is indeed a common occurrence and it should be considered an error, then that would be worth including. Otherwise, while "is not" is indeed at least better than "should not", I find the whole sentence redundant and unsuitable for this encyclopedia. Nardog (talk) 02:35, 30 September 2021 (UTC)

You quite often see threads like this where people are mocked for talking about the "Electric Light Orchestra" rating system. MaxBrowne2 (talk) 03:26, 30 September 2021 (UTC)

FIDE Categories

FIDE title regulations don't mention tournament categories any more. Already in the 2010 regulations, it was noted that categories had been deprecated: "Category was used for title results earlier. Now it is used only to describe the overall strength of a round-robin tournament." In the 2014, 2017, and 2022 regulations, they aren't mentioned at all. I am not sure we should even mention them in this article (will think about that). If we do, we should mention that they are no longer described in the FIDE regulations and are not used. Bruce leverett (talk) 16:41, 13 January 2022 (UTC)

Inflation in FIDE ratings

There is no relationship between strength and rating increase. Elo rating is differential. So if FIDE decides to lower the Elo rating of all its members in the pool by 100, the scoring probabilities between players remain the same. As more rating points are added, the average rating per player in the pool will increase.
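The point that only differences matter can be checked with a few lines of code. This is a sketch under the logistic formula with divisor 400; the three-player pool and its ratings are hypothetical.

```python
import math


def expected_score(r_a: float, r_b: float) -> float:
    """Logistic Elo expectation for A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


# Hypothetical pool; lowering every rating by 100 keeps all differences intact.
pool = {"A": 2800.0, "B": 2700.0, "C": 2600.0}
shifted = {name: rating - 100 for name, rating in pool.items()}

before = expected_score(pool["A"], pool["B"])
after = expected_score(shifted["A"], shifted["B"])
print(before, after)
```

The expected score depends on R_A - R_B alone, so a uniform shift of the whole pool changes the average rating without changing any scoring probability.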

Rating development chess world champions — Chess Fide Ratings

The graph shows the expectation of the champion's rating from the FIDE rating list, compared to the ratings of the following 19 players. Robert Fischer's superiority is undeniable. The line represents the average rating development. The rating inflation since 1985 is clear.

Magnus Carlsen's rating as of February 2022 is 2865, and the average rating difference with the next 19 players is 102 (64%). Rating inflation has stabilized.

KP (talk) 11:11, 29 March 2022 (UTC)

At the risk of telling you something you already know, Wikipedia has a firm policy against publication of original research: WP:No original research. The chart you have presented above, and your discussion of it, cannot be included in this article, for that reason. Bruce leverett (talk) 15:45, 29 March 2022 (UTC)

I accidentally uploaded the file as my own work. I corrected that.

Much of the inflation/deflation discussion misses the point: that the average rating level is the result of management (or lack thereof) by the rating authority. It would be very helpful if FIDE published statistics on the rating pool.

See (Elo, 1978), 3.7 Deflation Control Processes, 3.8, Monitoring the Rating Pool.

2001:1C04:5032:CC00:CD9B:1DD8:8CBF:92EB (talk) 18:53, 29 March 2022 (UTC)

Base 10 or base ${\sqrt {10}}$

In The Rating of Chessplayers, 1978, page 141 we read at 8.43:

When the logarithms in equation (38) are taken to the base ${\sqrt {10}}$ ,
then the Verhulst and the logistic take the following forms:
${P(D)}={\frac {1}{1+10^{-D/2C}}}$ (46)

And subsequently in chapter 8.46 Percentage Expectancy Table, page 143 the table does have the header:

Logistic Probabilities to base ${\sqrt {10}}$

Based on the above sources, ${\sqrt {10}}$ seems to be more correct than 10.

Consider:

${P(D)}={\frac {1}{1+e^{-D/C}}}$

Elo (1978, page 141) observed:

In a fortuitous numerical relation, the base ${\sqrt {10}}=3.1623$ is very close to 3.1701, the odds for a rating difference of one class interval in a system based on the normal distribution.

When base e is replaced by ${\sqrt {10}}$ (= $10^{1/2}$ ) then we arrive at equation (46).

KP (talk) 10:59, 1 April 2022 (UTC)
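The substitution described above can be written out step by step. This is only a restatement of the change of base, with the class interval C = 200 taken as an assumption in the last step:

```latex
\begin{aligned}
P(D) &= \frac{1}{1 + \left(\sqrt{10}\right)^{-D/C}}
      = \frac{1}{1 + \left(10^{1/2}\right)^{-D/C}}
      = \frac{1}{1 + 10^{-D/2C}} \\
     &= \frac{1}{1 + 10^{-D/400}} \qquad \text{for } C = 200 .
\end{aligned}
```

Read this way, "base $\sqrt{10}$ with exponent $D/C$" and "base 10 with exponent $D/2C$" describe the same curve; which base is named depends on whether the exponent is scaled by 200 or by 400.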

Thanks. There is a link underneath "base 10" to Common logarithm. If it is appropriate to change "base 10" to "base √10", then that link should not be used. Bruce leverett (talk) 16:24, 1 April 2022 (UTC)

The meaning of the K factor

The K-factor determines how we weigh past and present. A low K gives more weight to past performances. (Elo, 1978, p. 25). In the following example we show the effect of the K-factor on the rating development in more detail. The K factor in this example is set to 32.

The new rating (Rn) is approximately equal to the rating performance (Rp) calculated over the games played in the rating period (N = 5) supplemented by (800/K - N) = 20 fictitious draws against own rating.

Let N = 5, Ne = 20, N + Ne = 25, K = 4C / 25, R0 = 1613.

The class interval C = 200 is rooted in tradition (Elo, 1978, p. 19). In this example, the past weighs 4 times as much as the present. If K is set to 10 then the ratio between present and past becomes 1 to 15.

K-factor example: Rn ≈≈ Rp
      R0 = 1613                         Elo     Elo      800      Lin        A400
Pl    Rc        N       W       D       P(D)    We       P(D)     We         Rp
S0    1613      20      10      0       50,0%   10,00    50,0%    10,00      0
S1    1609      1       ½       4       50,6%   0,51     50,5%    0,51       0
S2    1477      1       ½       136     68,3%   0,68     67,0%    0,67       0
S3    1388      1       1       225     78,5%   0,78     78,1%    0,78       400
S4    1586      1       1       27      53,8%   0,54     53,4%    0,53       400
S5    1720      1       0       -107    35,4%   0,35     36,6%    0,37       -400
      1601,600  25      13,0                    12,865            12,856     400/25
                P =     52,00%                                    P =        52,00%
                D(P) =  14,330                                    D800(P) =  16

We note the following:

Rn_Elo = 1617,33 = 1613 + 32(13,000 – 12,8647)
Rn_P800 = 1617,60 = 1613 + 32(13,000 – 12,8563)
Rp_D800 = 1617,60 = 1601,600 + 16
Rp_Elo = 1615,93 = 1601,600 + 14,330

As the expectation curve between -300 and +300 is almost linear with slope 1/(4C), we have:

Rn_Elo ≈≈ Rn_P800 == Rp_D800 ≈≈ Rp_Elo, calculated over the above 25 games.

Percentage Expectancy Curve

P(D) = norm.dist(D, 0, 2000 / 7, cumulative), (Elo, 1978, p.28)

Linear Approximation Formulae ("algorithm of 400")

P800(D) = D / 4C + 50%, slope = 1 / 4C, intercept = 50%.

D800(P) = (P - 50%)4C.

KP (talk) 12:32, 8 April 2022 (UTC)
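The four quantities in the example above can be recomputed with a short script. It is a sketch that assumes the normal expectancy curve with standard deviation 2000/7 and the linear approximation P800(D) = 50% + D/4C as quoted above; the variable and function names are invented for the illustration.

```python
import math
from statistics import NormalDist

C = 200                                    # class interval
K = 32                                     # K-factor used in the example
R0 = 1613                                  # rating before the period
CURVE = NormalDist(mu=0, sigma=2000 / 7)   # assumed expectancy curve

# (opponent rating, score) for the five real games S1..S5
GAMES = [(1609, 0.5), (1477, 0.5), (1388, 1.0), (1586, 1.0), (1720, 0.0)]
N_FICT = 4 * C // K - len(GAMES)           # 20 fictitious draws against R0


def p_linear(d):
    """Linear approximation of the expectancy curve."""
    return 0.5 + d / (4 * C)


def new_rating(p):
    """R0 + K * (W - We); the fictitious draws cancel out of W - We."""
    score = sum(s for _, s in GAMES)
    expected = sum(p(R0 - opp) for opp, _ in GAMES)
    return R0 + K * (score - expected)


def performance(d_of_p):
    """Average opponent rating over all 25 games plus D(P)."""
    n = len(GAMES) + N_FICT
    avg_opp = (N_FICT * R0 + sum(opp for opp, _ in GAMES)) / n
    pct = (N_FICT * 0.5 + sum(s for _, s in GAMES)) / n
    return avg_opp + d_of_p(pct)


rn_elo = new_rating(CURVE.cdf)
rn_p800 = new_rating(p_linear)
rp_d800 = performance(lambda p: (p - 0.5) * 4 * C)
rp_elo = performance(CURVE.inv_cdf)

for name, value in [("Rn_Elo", rn_elo), ("Rn_P800", rn_p800),
                    ("Rp_D800", rp_d800), ("Rp_Elo", rp_elo)]:
    print(f"{name:8s} {value:.2f}")
```

Under the linear approximation the new rating and the performance rating over the 25 games coincide exactly, which is the sense in which a rating update with K = 32 behaves like a performance rating padded with 20 draws against one's own rating.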

Roger Cook

No mention of Roger Cook? Eroica (talk) 16:14, 30 May 2022 (UTC)

Here is something I found that explains his significance. It would be better to cite something that looks more like a reliable source. Since this quotes the introduction to Elo's report to FIDE, it would be preferable to cite that report, if one can dig it up; or do you have another source in mind? Bruce leverett (talk) 20:57, 30 May 2022 (UTC)

The obvious place to look is in The Rating of Chess Players Past & Present (Elo 1978). I didn't remember it, but checking the index I see the book does mention Roger Cook on page 14.

"It was with this thought in mind that the writer in 1959 undertook to examine the logic and rationale of the rating systems then in use and to develop a system based on statistical and probability theory. Quite independently and almost at the same time, György Karoly and Roger Cook developed a system based on the same principles for the New South Wales Chess Association (Cook and Hooper 1969)."

This doesn't agree precisely with chess-poster, although they don't necessarily conflict directly. The Elo bibliography indicates that Cook and Hooper 1969 is Cook, R. and Hooper, F., N.S.W.C.A. Grading System, privately printed pamphlet, 1969. I didn't have any luck finding any mention of it online, but this isn't surprising. Also there is this: https://www.reddit.com/r/chess/comments/v1dn9k/roger_cook_passed_away_actual_inventor_of_the_elo/. Quale (talk) 04:29, 1 June 2022 (UTC)

Thanks for fixing this. There was a problem with the reference name "AEE1986" appearing twice with different contents, so I removed the name from the new citation.

I notice that all the other citations of Elo in this article specify the 1986 edition. I have found a .pdf of that online at [3]. I see that the credit to Karolyi and Cook is now on page 4. I think I will convert the citation to use 1986 instead of 1978, and 4 instead of 16, if you don't mind. Bruce leverett (talk) 05:26, 1 June 2022 (UTC)

At first I was going to suggest that I could change the citations to the original 1978 edition, but looking at Elo's preface to the second edition I think your idea is better. Thanks. Quale (talk) 05:52, 1 June 2022 (UTC)

Who invented the K-factor?

The development of the Elo system can be followed here: Digitaal archief Chess Life.

Chess Life, March 5 & April 5, 1960
Chess Life, June 1961. pp. 160-1
A. E. Elo; Analytical Reports, 1961, 1963 & 1965, (Privately Printed)
A. E. Elo; Historical Rating List, Chess Life April 1964
Chess Life, July & August 1967

It would be interesting to uncover the reports from 1961, 1963 and 1965. KP (talk) 12:51, 26 July 2022 (UTC)

Figure 2

Shouldn't the solid and the dotted black lines be the other way around in figure 2? CwelTHC (talk) 17:11, 1 May 2023 (UTC)

Or rather, shouldn't the rating change values be flipped around to account for greater changes in unexpected outcomes? CwelTHC (talk) 17:14, 1 May 2023 (UTC)

by "figure 2" i guess you mean the one with caption beginning "Graphs of probabilities and Elo rating changes…"? —Tamfang (talk) 20:58, 21 May 2023 (UTC)

The development of the Percentage Expectancy Table

   D       P     |    D       P     |    D        P
Rtg Dif   H   L  | Rtg Dif   H   L  | Rtg Dif    H   L
0-3      .50 .50 | 122-129  .67 .33 | 279-290   .84 .16
4-10     .51 .49 | 130-137  .68 .32 | 291-302   .85 .15
11-17    .52 .48 | 138-145  .69 .31 | 303-315   .86 .14
18-25    .53 .47 | 146-153  .70 .30 | 316-328   .87 .13
26-32    .54 .46 | 154-162  .71 .29 | 329-344   .88 .12
33-39    .55 .45 | 163-170  .72 .28 | 345-357   .89 .11
40-46    .56 .44 | 171-179  .73 .27 | 358-374   .90 .10
47-53    .57 .43 | 180-188  .74 .26 | 375-391   .91 .09
54-61    .58 .42 | 189-197  .75 .25 | 392-411   .92 .08
62-68    .59 .41 | 198-206  .76 .24 | 412-432   .93 .07
69-76    .60 .40 | 207-215  .77 .23 | 433-456   .94 .06
77-83    .61 .39 | 216-225  .78 .22 | 457-484   .95 .05
84-91    .62 .38 | 226-235  .79 .21 | 485-517   .96 .04
92-98    .63 .37 | 236-245  .80 .20 | 518-559   .97 .03
99-106   .64 .36 | 246-256  .81 .19 | 560-619   .98 .02
107-113  .65 .35 | 257-267  .82 .18 | 620-735   .99 .01
114-121  .66 .34 | 268-278  .83 .17 | over 735 1.00 .00

It is easy to verify that the table was actually built with a standard deviation of 2000/7 (about 285.7), used as an approximation for 200√2 (about 282.8).
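For anyone who wants to repeat the check, here is a minimal Python sketch. It assumes that the row for P ends at the largest integer D for which the normal CDF is still below P + 0.005. That rounding rule and the function name `row_upper_bound` are assumptions of this sketch, not anything taken from Elo's own description.

```python
from math import floor, sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def row_upper_bound(p_percent: int, sigma: float) -> int:
    """Largest integer rating difference whose expectancy still rounds to p_percent.

    Assumption: the row for p% ends just below the point where the normal
    CDF (mean 0, standard deviation sigma) reaches p% + 0.5%.
    """
    cutoff = NormalDist(mu=0.0, sigma=sigma).inv_cdf(p_percent / 100 + 0.005)
    return floor(cutoff)


if __name__ == "__main__":
    # Print the upper bound of every row from .50 to .99 for both candidate
    # standard deviations, so they can be compared with the table by eye.
    for label, sigma in (("2000/7", 2000 / 7), ("200*sqrt(2)", 200 * sqrt(2))):
        bounds = [row_upper_bound(p, sigma) for p in range(50, 100)]
        print(label, bounds)
```

Under these assumptions the 2000/7 curve gives 25 as the top of the .53 row and 735 as the top of the .99 row, as in the table, whereas 200√2 gives 24 for the .53 row. Not every row need come out identical, so a mismatch is better read as a candidate irregularity than as proof of an error.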

A citation to a detailed description of the construction of the table by Elo is desirable.

The construction by pencil and paper is not trivial and is prone to errors.

Since the normal curve flattens as the rating difference grows, one would expect the number of rating differences within one percentage row never to decrease from one row to the next. However, there is a discontinuity at D = 345-357, P = 89%. This suggests that the table may contain irregularities.

Rtg Dif     Range
Low  High   (High - Low)
279  290    11
291  302    11
303  315    12
316  328    12
329  344    15
345  357    12 <---
358  374    16
375  391    16
392  411    19

KP (talk) 08:11, 4 August 2023 (UTC)
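The range check in the small table above can be sketched in Python as follows. The row bounds are copied from the third column of the expectancy table. The helper names `widths` and `decreasing_rows` are hypothetical, and "range" is taken as upper bound minus lower bound, as in KP's list.

```python
# (low, high) rating-difference bounds for the rows P = .84 through .92,
# copied from the Percentage Expectancy Table quoted above.
ROWS = [
    (279, 290), (291, 302), (303, 315), (316, 328), (329, 344),
    (345, 357), (358, 374), (375, 391), (392, 411),
]


def widths(rows):
    """Range of each row, computed as high minus low."""
    return [high - low for low, high in rows]


def decreasing_rows(rows):
    """Rows whose range is smaller than that of the row just before them."""
    w = widths(rows)
    return [rows[i] for i in range(1, len(w)) if w[i] < w[i - 1]]


if __name__ == "__main__":
    print(widths(ROWS))
    print(decreasing_rows(ROWS))
```

On these nine rows the only decrease is at 345-357. Applied to the whole table, the same check would also flag one-point dips such as 18-25 followed by 26-32, so it is the size of the drop at 345-357 (from 15 to 12) that stands out.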
