Difference between revisions of "Playing Strength"

From Chessprogramming wiki
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71419 best way to determine elos of a group] by [[Daniel Shawul]], [[CCC]], July 30, 2019
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71639 Dedicated Chess Machine Elo vs Human Elo, a least squares analysis] by JayRod, [[CCC]], August 23, 2019 » [[Dedicated Chess Computers]]
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=72140 UCI Win/Draw/Loss reporting] by [[Gian-Carlo Pascutto]], [[CCC]], October 22, 2019 » [[UCI]], [[Pawn Advantage, Win Percentage, and Elo]]

==2020 ...==
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74037 Stockfish_dev is probably stronger than Sargon 1978 v1.00] by [[Kai Laskos]], [[CCC]], May 29, 2020 » [[Stockfish]], [[Sargon]]
* [https://lczero.org/blog/2020/04/wdl-head/ Win-Draw-Loss evaluation] by [[Alexander Lyashuk|crem]], [[Leela Chess Zero|LCZero blog]], April 20, 2020 » [[TCEC Season 17#Superfinal|TCEC Season 17 Superfinal]], » [[Pawn Advantage, Win Percentage, and Elo]]
* [https://www.hiarcs.net/forums/viewtopic.php?t=10004 Ply versus ELO] by Greg, [[Computer Chess Forums|HIARCS Forum]], May 30, 2020 » [[Diogo R. Ferreira#Impact|Diogo R. Ferreira - Impact of the Search Depth ...]]
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74319 Throwing out draws to calculate Elo] by [[Dann Corbit]], [[CCC]], June 29, 2020
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74339 Stockfish has included WDL stats in engine output] by Deberger, [[CCC]], July 02, 2020 » [[Stockfish]], [[Pawn Advantage, Win Percentage, and Elo]]
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75205 How many Elo points is a book?] by [[Chris Whittington]], [[CCC]], September 25, 2020 » [[Opening Book]]
* [https://prodeo.actieforum.com/t123-controlled-randomness-of-evaluation-function Controlled randomness of evaluation function] by [[Pawel Koziol|nescitus]], [[Computer Chess Forums|ProDeo Forum]], December 06, 2020

Revision as of 09:24, 11 April 2021


Dragon and Snake on anvil [1]

Playing Strength (Performance, Skill Level) of a chess player or chess playing entity, program or engine, reflects the ability to win against other players. It is expressed as a number or other element of an ordered set, such as an Elo rating.

The ability to solve test positions, that is, to find the specified, likely one and only best move, might indicate various particular engine skills, but does not necessarily correlate with playing strength. In his lecture Parallelism and Selectivity in Game Tree Search, Tord Romstad introduced the Worst Moves Observation (WMO), which states that practical playing strength is determined primarily not by the quality of a player's best moves or average moves, but by the quality of the player's worst moves.

Measuring

A statistically valid method to measure playing strength within a defined confidence interval is to play an appropriately large number of games, with both sides, versus a wide range of different opponents [2] under symmetric time constraints, and to apply match statistics. Performance isn't measured absolutely; it is inferred from wins, losses, and draws against other players or engines. A player's rating depends on the ratings of their opponents and the results scored against them [3]. The relative playing strength of chess engines does not transfer exactly across time controls, but the number of games played is more relevant than their duration. Today's de facto standard for measuring playing strength is therefore to play many fast games in parallel at (ultra) short time controls, such as blitz, bullet or even lightning chess, as used for instance in the Fishtest framework of Stockfish [4].
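
As a rough illustration of what applying match statistics can look like, the following Python sketch converts a win/draw/loss record into an Elo difference with an approximate confidence interval. It assumes the usual logistic Elo model and a normal approximation for the mean score. The function names `elo_from_score` and `match_elo` are hypothetical and do not come from Fishtest or any other tool mentioned here.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by a mean score, using the logistic Elo model.

    The score is clamped away from 0 and 1 so the logarithm stays finite.
    """
    eps = 1e-9
    score = min(max(score, eps), 1.0 - eps)
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, elo_low, elo_high) for a match result.

    z = 1.96 corresponds to roughly 95% confidence under the
    normal approximation assumed in this sketch.
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("at least one game is required")
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0 points).
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    # The standard error of the mean score shrinks with sqrt(games).
    std_error = math.sqrt(variance / games)
    low = elo_from_score(score - z * std_error)
    high = elo_from_score(score + z * std_error)
    return elo_from_score(score), low, high


if __name__ == "__main__":
    # Same score ratio, ten times more games each step: the interval narrows.
    for scale in (1, 10, 100):
        elo, low, high = match_elo(40 * scale, 35 * scale, 25 * scale)
        print(f"{100 * scale:6d} games: {elo:+.1f} Elo [{low:+.1f}, {high:+.1f}]")
```

With 40 wins, 35 draws and 25 losses the estimate is about 52 Elo, and the interval around it tightens as the same score ratio is repeated over more games. This is why many short games tend to be more informative than a few long ones.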

Strength

The strength of a chess program depends on many things: the quality and efficiency of the algorithms used to determine the best move in a position, the balance of the so-called search versus knowledge tradeoff in evaluating or comparing leaf nodes of a search tree, how that tree is shaped and how a score is propagated up to the root, and time management, that is, how to allocate time for searching a move under time control requirements. The time used is roughly proportional to the number of nodes visited by the common depth-first search inside an iterative deepening frame, which grows exponentially as the effective branching factor raised to the power of the search depth. Playing strength might also improve over (playing) time through learning algorithms.
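
The exponential relation between search depth and visited nodes can be sketched in a few lines of Python. This is a simplified model, assuming the node count is exactly the effective branching factor raised to the depth and that time is proportional to nodes. The function names and the branching factor of 2.0 are illustrative assumptions, not measurements of any engine.

```python
import math


def nodes_for_depth(ebf: float, depth: int) -> float:
    """Approximate node count for a search to the given depth: ebf ** depth."""
    return ebf ** depth


def depth_gain(time_factor: float, ebf: float) -> float:
    """Extra plies gained by multiplying search time by time_factor.

    Follows from time being proportional to ebf ** depth, so that
    extra_depth = log(time_factor) / log(ebf).
    """
    if ebf <= 1.0 or time_factor <= 0.0:
        raise ValueError("ebf must exceed 1 and time_factor must be positive")
    return math.log(time_factor) / math.log(ebf)


if __name__ == "__main__":
    ebf = 2.0  # illustrative value only
    for depth in (10, 20, 30):
        print(f"depth {depth}: about {nodes_for_depth(ebf, depth):.3g} nodes")
    print(f"doubling the time adds about {depth_gain(2.0, ebf):.2f} plies")
```

Under this model, each doubling of the time budget buys a fixed number of additional plies, and that number shrinks as the effective branching factor grows.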

Computer Analysis of Human Players

[5]

See also

Publications

1970 ...

1980 ...

1990 ...

2000 ...

2005 ...

2010 ...

2015 ...

Postings

1982 ...

1990 ...

1995 ...

2000 ...

2005 ...

Re: A thought about ratings by Don Dailey, Computer Go Archive, December 10, 2007
Re: A thought about ratings by Edward de Grijs, Computer Go Archive, December 10, 2007
Re: A thought about ratings by Don Dailey, Computer Go Archive, December 10, 2007

2008

2009

2010 ...

2011

2012

2013

2014

2015 ...

2016

About expected scores and draw ratios by Jesús Muñoz, CCC, September 17, 2016

2017

Re: "Intrinsic Chess Ratings" by Regan, Haworth -- by Kenneth Regan, CCC, November 20, 2017 » Who is the Master?

2018

2019

2020 ...

2021

External Links

Chess Player

Chess Engines

Analysis

Rating Systems

Misc

References

  1. Photo by Gerd Isenberg, September 18, 2016, detail of the Flottmann gate, Art Nouveau theme of Dragon and Sun designed by Carl Weinhold, art director of blacksmith and foundry Füssmann und Fleeth, Essen, exhibited at the industrial and trade exhibition 1902 in Düsseldorf, and bought by Heinrich Flottmann as gate for his jackhammer factory, today adjacent to the exhibition and event space Flottmann-Hallen in Herne, North Rhine-Westphalia, Germany, and part of The Industrial Heritage Trail of the Ruhr area, "The dragon is a symbol of physical strength and intelligence with respect to the snake that symbolizes the tough, glowing wrought iron" from Flottmann-Tor – Hün un Perdün, see also Image by Gerd Biedermann
  2. A word for casual testers by Don Dailey, CCC, December 25, 2012
  3. Elo rating system - Mathematical details - Wikipedia
  4. Stockfish Testing Framework
  5. Comparison of top chess players throughout history from Wikipedia
  6. Elo's Book: The Rating of Chess Players by Sam Sloan
  7. Computers choose: who was the strongest player?, ChessBase News, October 30, 2006
  8. Computer analysis of world champions by Søren Riis, ChessBase News, November 02, 2006
  9. Bayesian inference from Wikipedia
  10. How I did it: Diogo Ferreira on 4th place in Elo chess ratings competition | no free hunch
  11. "Intrinsic Chess Ratings" by Regan, Haworth -- seq by Kai Middleton, CCC, November 19, 2017
  12. Re: EloStat, Bayeselo and Ordo by Rémi Coulom, CCC, June 25, 2012
  13. Re: Understanding and Pushing the Limits of the Elo Rating Algorithm by Daniel Shawul, CCC, October 15, 2019
  14. Ply versus ELO by Greg, HIARCS Forum, May 30, 2020 » Diogo R. Ferreira - Impact of the Search Depth ...
  15. Questions regarding rating systems of humans and engines by Erik Varend, CCC, December 06, 2014
  16. chess statistics scientific article by Nuno Sousa, CCC, July 06, 2016
  17. Personal bests – and why you stop after achieving them: U of T Scarborough expert | University of Toronto Scarborough - News and Events by Nina Haikara, February 7, 2018
  18. Understanding and Pushing the Limits of the Elo Rating Algorithm by Michel Van den Bergh, CCC, October 15, 2019
  19. Delphil 3.3b2 (2334) - Stockfish 030916 (3228), TCEC Season 9 - Rapid, Round 11, September 16, 2016
  20. Normalized Elo (pdf) by Michel Van den Bergh
  21. Chessanalysis homepage by Erik Varend
  22. wall - Wiktionary
  23. regression - Wiktionary
  24. Most accurate K-factor - Elo rating system from Wikipedia
  25. FIDE Chess Rating calculators: Chess Rating change calculator
  26. The primary data source is the SSDF, Deep Blue Elo Performance: estimated by perf. against Kasparov, AlphaZero Elo Performance: estimated by perf. against Stockfish, Leela Chess Zero Elo Performance from LCZ rating list (CCRL estimate)
  27. Matej Guid, Ivan Bratko (2006). Computer Analysis of World Chess Champions. ICGA Journal, Vol. 29, No. 2, pdf
  28. an interesting study from Erik Varend by scandien, Hiarcs Forum, August 13, 2017
