Match Statistics

72,153 bytes added, 18:00, 23 May 2018
'''[[Main Page|Home]] * [[Engine Testing]] * Match Statistics'''

[[FILE:MatchStatistics.jpg|border|right|thumb|link=https://commons.wikimedia.org/wiki/File:Standard_deviation_diagram.svg| Match Statistics <ref>Image based on [https://commons.wikimedia.org/wiki/File:Standard_deviation_diagram.svg Standard deviation diagram] by [https://commons.wikimedia.org/wiki/User:Mwtoews Mwtoews], April 7, 2007 with [https://en.wikipedia.org/wiki/R_(programming_language) R code] given, [https://creativecommons.org/licenses/by/2.5/deed.en CC BY 2.5], [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons], [https://en.wikipedia.org/wiki/Normal_distribution Normal distribution from Wikipedia]</ref> ]]

'''Match Statistics''',<br/>
the [https://en.wikipedia.org/wiki/Statistics statistics] of chess [[Tournaments|tournaments and matches]], that is, the collection of [[Chess Game|chess games]] and the presentation, [https://en.wikipedia.org/wiki/Analysis analysis], and interpretation of game-related data, most commonly game results, to determine the relative [[Playing Strength|playing strength]] of chess-playing entities, here with a focus on [[Engines|chess engines]]. To apply match statistics, besides considering the [https://en.wikipedia.org/wiki/Statistical_population statistical population], it is conventional to hypothesize a [https://en.wikipedia.org/wiki/Statistical_model statistical model] describing a set of [https://en.wikipedia.org/wiki/Probability_distribution probability distributions].

=Ratios / Operating Figures=
Common tools, ratios and figures to illustrate a tournament outcome and provide a base for its interpretation.

==Number of games==
The total number of games played by an engine in a tournament.
<span style="font-size:140%;">N = wins + draws + losses</span>
==Score==
The score is a representation of the tournament-outcome from the viewpoint of a certain engine.
<span style="font-size:140%;">score_difference = wins - losses</span>

<span style="font-size:140%;">score = wins + draws/2</span>
<span id="ratio"></span>
==Win & Draw Ratio==
<span style="font-size:140%;">win_ratio = score/N</span>

<span style="font-size:140%;">draw_ratio = draws/N</span>
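
As a minimal sketch in Python, the figures defined so far can be computed from raw win, draw and loss counts. The function name <code>operating_figures</code> and the sample counts are illustrative assumptions and are not taken from any tool mentioned on this page.

```python
def operating_figures(wins, draws, losses):
    """Return the basic match figures for one engine as a dict."""
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")
    score = wins + draws / 2
    return {
        "N": n,                              # number of games
        "score_difference": wins - losses,   # wins minus losses
        "score": score,                      # a draw counts as half a point
        "win_ratio": score / n,              # score per game
        "draw_ratio": draws / n,             # fraction of drawn games
    }


if __name__ == "__main__":
    # Hypothetical match result: 40 wins, 45 draws, 15 losses
    print(operating_figures(wins=40, draws=45, losses=15))
```
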
These two ratios depend on the [[Playing Strength|strength]] difference between the competitors, the average strength level, the color, and the drawishness of the [[Opening Book|opening book line]]. Because of the second factor, these ratios are strongly influenced by the [[Time Management#Time%20Controls|time control]], which is confirmed by the published statistics of the testing organisations [[CCRL]] and [[CEGT]], showing an increase of the [[Draw|draw]] rate at longer time controls. This correlation was also shown by [[Kirill Kryukov]], who analyzed the statistics of his test games <ref>[http://kirill-kryukov.com/chess/kcec/draw_rate.html Kirr's Chess Engine Comparison KCEC - Draw rate] » [[KCEC]]</ref> . The program playing White seems to benefit more from the additional level of strength. So, although one would expect the win ratio to approach 50% with increasing draw rates, in fact it remains about equal.
{| class="wikitable"
|-
! Timecontrol
! Draw Ratio
! Win Ratio (white)
! Source
|-
| style="text-align:center;" | 40/4
| style="text-align:center;" | 30.9%
| style="text-align:center;" | 55.0%
| style="text-align:center;" | CEGT
|-
| style="text-align:center;" | 40/20
| style="text-align:center;" | 35.6%
| style="text-align:center;" | 54.6%
| style="text-align:center;" | CEGT
|-
| style="text-align:center;" | 40/120
| style="text-align:center;" | 41.3%
| style="text-align:center;" | 55.4%
| style="text-align:center;" | CEGT
|-
| style="text-align:center;" | 40/120 (4cpu)
| style="text-align:center;" | 45.2%
| style="text-align:center;" | 55.9%
| style="text-align:center;" | CEGT
|}

{| class="wikitable"
|-
! Timecontrol
! Draw Ratio
! Win Ratio (white)
! Source
|-
| style="text-align:center;" | 40/4
| style="text-align:center;" | 31.0%
| style="text-align:center;" | 54.1%
| style="text-align:center;" | CCRL
|-
| style="text-align:center;" | 40/40
| style="text-align:center;" | 37.2%
| style="text-align:center;" | 54.6%
| style="text-align:center;" | CCRL
|}
<span id="DoublingTC"></span>
'''Doubling Time Control'''
As posted in October 2016 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61784 Doubling of time control] by [[Andreas Strangmüller]], [[CCC]], October 21, 2016</ref> , [[Andreas Strangmüller]] conducted an experiment with [[Komodo|Komodo 9.3]], [[Time Management|time control]] doubling matches under [[Cutechess-cli]], playing 3000 games with 1500 [[Opening|opening]] positions each, without [[Pondering|pondering]], [[Learning|learning]], and [[Endgame Tablebases|tablebases]], [https://en.wikipedia.org/wiki/List_of_Intel_Core_i5_microprocessors#Desktop_processors Intel i5-750] @ 3.5 GHz, 1 Core, 128 MB Hash <ref>[http://fastgm.de/K93-Doubling-TC.pdf K93-Doubling-TC.pdf]</ref> , see also [[Kai Laskos|Kai Laskos']] 2013 results with [[Houdini|Houdini 3]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48733 Scaling at 2x nodes (or doubling time control)] by [[Kai Laskos]], [[CCC]], July 23, 2013</ref> and [[Depth#DiminishingReturns|Diminishing Returns]]:
{| class="wikitable"
|-
! Time Control
! 2<br/>vs 1
! 20+0.2<br/>10+0.1
! 40+0.4<br/>20+0.2
! 80+0.8<br/>40+0.4
! 160+1.6<br/>80+0.8
! 320+3.2<br/>160+1.6
! 640+6.4<br/>320+3.2
! 1280+12.8<br/>640+6.4
! 2560+25.6<br/>1280+12.8
|-
! colspan="2" | Elo
| style="text-align:center;" | 144
| style="text-align:center;" | 133
| style="text-align:center;" | 112
| style="text-align:center;" | 101
| style="text-align:center;" | 93
| style="text-align:center;" | 73
| style="text-align:center;" | 59
| style="text-align:center;" | 51
|-
! colspan="2" | Win
| style="text-align:right;" | 44.97%
| style="text-align:right;" | 41.27%
| style="text-align:right;" | 36.67%
| style="text-align:right;" | 32.67%
| style="text-align:right;" | 30.47%
| style="text-align:right;" | 25.17%
| style="text-align:right;" | 21.77%
| style="text-align:right;" | 18.97%
|-
! colspan="2" | Draw
| style="text-align:right;" | 49.20%
| style="text-align:right;" | 54.00%
| style="text-align:right;" | 57.93%
| style="text-align:right;" | 63.03%
| style="text-align:right;" | 65.33%
| style="text-align:right;" | 70.47%
| style="text-align:right;" | 73.17%
| style="text-align:right;" | 76.63%
|-
! colspan="2" | Loss
| style="text-align:right;" | 5.83%
| style="text-align:right;" | 4.73%
| style="text-align:right;" | 5.40%
| style="text-align:right;" | 4.30%
| style="text-align:right;" | 4.20%
| style="text-align:right;" | 4.37%
| style="text-align:right;" | 5.07%
| style="text-align:right;" | 4.40%
|}
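
The Elo row of the table is consistent with its Win and Draw rows under the logistic formula &#916; = 400 log<span style="vertical-align: sub;">10</span>(E / (1 - E)) with E = win + draw/2. The following Python sketch reproduces that arithmetic for three columns. The function name is an illustrative assumption, and the source does not state whether the published figures were derived exactly this way.

```python
import math


def elo_from_percentages(win_pct, draw_pct):
    """Elo difference implied by win and draw percentages (logistic model)."""
    expected = (win_pct + draw_pct / 2) / 100.0
    return 400.0 * math.log10(expected / (1.0 - expected))


if __name__ == "__main__":
    # (Win %, Draw %) of the first, second and last column of the table
    for win_pct, draw_pct in [(44.97, 49.20), (41.27, 54.00), (18.97, 76.63)]:
        print(round(elo_from_percentages(win_pct, draw_pct)))
```
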

==Elo-Rating & Win-Probability==
''see [[Pawn Advantage, Win Percentage, and Elo]]''
<span style="font-size:140%;">Expected win_ratio, win_probability (E)</span>

<span style="font-size:140%;">Elo Rating Difference (&#916;) = Elo_Player1 - Elo_Player2</span>

<span style="font-size:140%;">E = 1 / ( 1 + 10<span style="vertical-align: super;">-&#916;/400</span>)</span>

<span style="font-size:140%;">&#916; = 400 log<span style="vertical-align: sub;">10</span>(E / (1 - E))</span>
Generalization of the Elo-Formula:
''win_probability of player i in a tournament with n players''
<span style="font-size:140%;">E<span style="vertical-align: sub;">i</span> = 10<span style="vertical-align: super;">Elo<span style="vertical-align: sub;">i</span>/400</span> / (10<span style="vertical-align: super;">Elo<span style="vertical-align: sub;">1</span>/400</span> + 10<span style="vertical-align: super;">Elo<span style="vertical-align: sub;">2</span>/400</span> + ... + 10<span style="vertical-align: super;">Elo<span style="vertical-align: sub;">n-1</span>/400</span> + 10<span style="vertical-align: super;">Elo<span style="vertical-align: sub;">n</span>/400</span>)</span>
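
A short Python sketch of these formulas follows. The function names are illustrative assumptions. The generalized version assumes each rating is divided by 400 in the exponent, so that the two-player case reduces to the formula for E given above.

```python
import math


def expected_score(delta):
    """E = 1 / (1 + 10^(-delta/400)) for an Elo difference delta."""
    return 1.0 / (1.0 + 10.0 ** (-delta / 400.0))


def elo_difference(expected):
    """Inverse of expected_score; only defined for 0 < expected < 1."""
    if not 0.0 < expected < 1.0:
        raise ValueError("expected score must lie strictly between 0 and 1")
    return 400.0 * math.log10(expected / (1.0 - expected))


def tournament_win_probabilities(ratings):
    """Generalized formula for n players (ratings scaled by 400: assumption)."""
    top = max(ratings)  # subtracting the maximum avoids huge powers of ten
    weights = [10.0 ** ((r - top) / 400.0) for r in ratings]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    print(expected_score(100))
    print(elo_difference(0.64))
    print(tournament_win_probabilities([2800, 2700, 2600]))
```
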

==Likelihood of Superiority<span id="Likelihood of superiority"></span>==
''See [[LOS Table]]''

The likelihood of superiority (LOS) denotes how likely it would be for two players of the same [[Playing Strength|strength]] to reach a certain result - in other fields called a [https://en.wikipedia.org/wiki/P-value p-value], a measure of [https://en.wikipedia.org/wiki/Statistical_significance statistical significance] of a departure from the [https://en.wikipedia.org/wiki/Null_hypothesis null hypothesis] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=64084&start=5 Re: Likelihood Of Success (LOS) in the real world] by [[Álvaro Begué]], [[CCC]], May 26, 2017</ref>. When doing this analysis after the tournament, one has to differentiate between the case where one knows that a certain engine is either stronger or equally strong (directional or one-tailed test) and the case where one has no information on whether the other engine is stronger or weaker (non-directional or [https://en.wikipedia.org/wiki/Two-tailed_test two-tailed test]). The latter, due to the reduced information, results in larger [https://en.wikipedia.org/wiki/Confidence_interval confidence intervals].

'''Two-tailed Test'''<br/>
[https://en.wikipedia.org/wiki/Null_hypothesis Null]- and [https://en.wikipedia.org/wiki/Alternative_hypothesis alternative hypothesis]:

<span style="font-size:140%;">H<span style="vertical-align: sub;">0</span> : Elo_Player1 = Elo_Player2 </span>

<span style="font-size:140%;">H<span style="vertical-align: sub;">1</span> : Elo_Player1 &#8800; Elo_Player2</span>

<span style="font-size:140%;">LOS = P(Score > score of 2 programs with equal strength)</span>

Under the [https://en.wikipedia.org/wiki/Null_hypothesis null hypothesis], one can calculate how likely it would be for two players of the same strength to reach a given tournament outcome. The LOS is then the complement: 1 minus the resulting probability.

For this type of analysis the [https://en.wikipedia.org/wiki/Multinomial_distribution trinomial distribution], a generalization of the [https://en.wikipedia.org/wiki/Binomial_distribution binomial distribution], is needed. Whilst the binomial distribution can only calculate the probability of reaching a certain outcome with two possible events, the trinomial distribution can account for all three possible events (win, draw, loss).

The following functions give the probability of a certain game outcome assuming both players are of equal strength:

<span style="font-size:140%;">win_probability = (1 - draw_ratio) / 2</span>

<span style="font-size:140%;">P(wins,draws,losses) = N!/(wins! draws! losses!) win_probability<span style="vertical-align: super;">wins</span> draw_ratio<span style="vertical-align: super;">draws</span> win_probability<span style="vertical-align: super;">losses</span></span>
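
The trinomial probability can be sketched directly in Python. The function name is an illustrative assumption, and the exact factorials used here are only practical for small numbers of games.

```python
import math


def trinomial_probability(wins, draws, losses, draw_ratio):
    """P(wins, draws, losses) for two equally strong players."""
    n = wins + draws + losses
    # With equal strength, wins and losses share the non-draw probability.
    win_probability = (1.0 - draw_ratio) / 2.0
    # Multinomial coefficient N! / (wins! draws! losses!), computed exactly
    coefficient = math.factorial(n) // (
        math.factorial(wins) * math.factorial(draws) * math.factorial(losses)
    )
    return (
        coefficient
        * win_probability ** wins
        * draw_ratio ** draws
        * win_probability ** losses
    )


if __name__ == "__main__":
    # Hypothetical example: 10 games at a 40% draw ratio ending +5 =3 -2
    print(trinomial_probability(5, 3, 2, draw_ratio=0.4))
```

Summing the function over all possible outcomes of a fixed number of games gives 1, which is a convenient sanity check.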

This calculation becomes very inefficient for a larger number of games. In this case the [https://en.wikipedia.org/wiki/Normal_distribution#Standard_normal_distribution normal distribution] can give a good approximation:

<span style="font-size:140%;">'''''N'''''(N/2, N(1-draw_ratio))</span>

where N(1 - draw_ratio) is the sum of wins and losses:

<span style="font-size:140%;">'''''N'''''(N/2, wins + losses)</span>

To calculate the LOS one needs the [https://en.wikipedia.org/wiki/Cumulative_distribution_function cumulative distribution function] of the given normal distribution. However, as pointed out by [[Rémi Coulom]], the calculation can be done cleverly, and the normal approximation is not really required <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51003&start=5 Re: Calculating the LOS (likelihood of superiority) from results] by [[Rémi Coulom]], [[CCC]], January 23, 2014</ref> . As further emphasized by [[Kai Laskos]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51003&start=1 Re: Calculating the LOS (likelihood of superiority) from results] by [[Kai Laskos]], [[CCC]], January 22, 2014</ref> and Rémi Coulom <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=303382&t=30624 Re: Likelihood of superiority] by [[Rémi Coulom]], [[CCC]], November 15, 2009</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=303405&t=30624 Re: Likelihood of superiority] by [[Rémi Coulom]], [[CCC]], November 15, 2009</ref> , draws do not count in the LOS calculation, and it makes no difference whether the game results were obtained playing Black or White. The following is a good approximation when the two players played the same number of games with each color:

<span style="font-size:140%;">LOS = &#981;((wins - losses)/&#8730;(wins + losses))</span>

<span style="font-size:140%;">LOS = &#189;[1 + erf((wins - losses)/&#8730;(2wins + 2losses))]</span>

<ref>[https://en.wikipedia.org/wiki/Error_function Error function from Wikipedia]</ref> <ref>[http://pubs.opengroup.org/onlinepubs/000095399/functions/erf.html The Open Group Base Specifications Issue 6, IEEE Std 1003.1, 2004 Edition: erf]</ref> <ref>[http://stackoverflow.com/questions/631629/erfx-and-math-h erf(x) and math.h] by user76293, [https://en.wikipedia.org/wiki/Stack_Overflow_%28website%29 Stack Overflow], March 10, 2009</ref>
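
The two forms of the LOS approximation can be checked against each other with a small Python sketch. The function names are illustrative assumptions, and returning 0.5 when there are no decisive games is a convention chosen here, not something stated in the sources.

```python
import math
from statistics import NormalDist


def los_erf(wins, losses):
    """LOS via the error function; draws are ignored by design."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # assumption: no decisive games means no evidence either way
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


def los_phi(wins, losses):
    """LOS via the cumulative distribution function of the standard normal."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5
    return NormalDist().cdf((wins - losses) / math.sqrt(decisive))


if __name__ == "__main__":
    # Hypothetical result: 60 wins and 40 losses, any number of draws
    print(los_erf(60, 40), los_phi(60, 40))
```

Swapping wins and losses yields the complementary value, so the LOS of the two players always sums to 1.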

'''One-tailed Test'''<br/>
[https://en.wikipedia.org/wiki/Null_hypothesis Null]- and [https://en.wikipedia.org/wiki/Alternative_hypothesis alternative hypothesis]:

<span style="font-size:140%;">H<span style="vertical-align: sub;">0</span> : Elo_Player1 &#8804; Elo_Player2</span>

<span style="font-size:140%;">H<span style="vertical-align: sub;">1</span> : Elo_Player1 > Elo_Player2</span>

<span id="Sample"></span>
'''Sample Program'''<br/>
A tiny [[Cpp|C++11]] program to compute Elo difference and LOS from W/L/D counts was given by [[Álvaro Begué]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51003&start=2 Re: Calculating the LOS (likelihood of superiority) from results] by [[Álvaro Begué]], [[CCC]], January 22, 2014</ref> :
<pre>
#include <cstdio>
#include <cstdlib>
#include <cmath>

int main(int argc, char **argv) {
  // Expect exactly three arguments: wins, losses, draws (in that order)
  if (argc != 4) {
    std::printf("Wrong number of arguments.\n\nUsage:%s <wins> <losses> <draws>\n", argv[0]);
    return 1;
  }
  int wins = std::atoi(argv[1]);
  int losses = std::atoi(argv[2]);
  int draws = std::atoi(argv[3]);

  double games = wins + losses + draws;
  std::printf("Number of games: %g\n", games);
  // Score per game, a draw counting as half a point
  double winning_fraction = (wins + 0.5*draws) / games;
  std::printf("Winning fraction: %g\n", winning_fraction);
  // Inverse of the logistic Elo formula: 400 * log10(E / (1 - E))
  double elo_difference = -std::log(1.0/winning_fraction-1.0)*400.0/std::log(10.0);
  std::printf("Elo difference: %+g\n", elo_difference);
  // LOS via the error function; draws do not enter the calculation
  double los = .5 + .5 * std::erf((wins-losses)/std::sqrt(2.0*(wins+losses)));
  std::printf("LOS: %g\n", los);
}
</pre>

==Statistical Analysis==
'''The trinomial versus the 5-nomial model'''

As indicated above, a match between two engines is usually modeled as a sequence of independent trials taken from a trinomial distribution with probabilities (win_ratio,draw_ratio,loss_ratio). This model is appropriate for a match with randomly selected opening positions and randomly assigned colors (to maintain fairness). However, one may show that under reasonable Elo models the trinomial model is not correct if games are played in pairs with reversed colors (as is commonly the case) and unbalanced opening positions are used.

This was also empirically observed by [[Kai Laskos]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61105 Error margins via resampling (jackknifing)] by [[Kai Laskos]], [[CCC]], August 12, 2016</ref> . He noted that the statistical predictions of the trinomial model do not match reality very well in the case of paired games. In particular he observed that for some data sets the variance of the match score as predicted by the trinomial model greatly exceeds the variance as calculated by the [https://en.wikipedia.org/wiki/Jackknife_resampling jackknife] estimator. The jackknife estimator is a non-parametric estimator, so it does not depend on any particular statistical model. It appears the mismatch may even occur for balanced opening positions, an effect which can only be explained by the existence of correlations between paired games - something not considered by any Elo model.

Overestimating the variance of the match score implies that derived quantities, such as the number of games required to establish the superiority of one engine over another with a given level of significance, are also overestimated. To obtain agreement between statistical predictions and actual measurements one may adopt the more general 5-nomial model. In the 5-nomial model the outcome of paired games is assumed to follow a 5-nomial distribution with probabilities

<span style="font-size:140%;">(p<span style="vertical-align: sub;">0</span>, p<span style="vertical-align: sub;">1/2</span>, p<span style="vertical-align: sub;">1</span>, p<span style="vertical-align: sub;">3/2</span>, p<span style="vertical-align: sub;">2</span>)</span>

These unknown probabilities may be estimated from the outcome frequencies of the paired games and then subsequently be used to compute an estimate for the variance of the match score. Summarizing: in the case of paired games the 5-nomial model handles the following effects correctly which the trinomial model does not:
* Unbalanced openings
* Correlations between paired games
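
A minimal Python sketch of the two variance estimates, under the assumption that each pair consists of two games and that the probabilities are estimated by the observed frequencies. Function names and the sample data are illustrative only.

```python
def pentanomial_score_variance(pair_counts):
    """pair_counts: number of pairs scoring 0, 1/2, 1, 3/2 and 2 points.

    Returns (mean score per game, variance of that mean).
    """
    pairs = sum(pair_counts)
    if pairs == 0:
        raise ValueError("no game pairs given")
    pair_scores = (0.0, 0.5, 1.0, 1.5, 2.0)
    probabilities = [count / pairs for count in pair_counts]
    mean_pair = sum(p * x for p, x in zip(probabilities, pair_scores))
    var_pair = sum(p * (x - mean_pair) ** 2
                   for p, x in zip(probabilities, pair_scores))
    # A pair is two games, so the per-game score is half the pair score.
    return mean_pair / 2.0, var_pair / 4.0 / pairs


def trinomial_score_variance(wins, draws, losses):
    """Same quantities when every game is treated as an independent trial."""
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games given")
    w, d = wins / n, draws / n
    s = w + d / 2.0
    var_game = w + d / 4.0 - s * s
    return s, var_game / n


if __name__ == "__main__":
    # Hypothetical extreme case: 100 pairs from very unbalanced openings,
    # each pair ending with one win and one loss (pair score 1).
    print(pentanomial_score_variance([0, 0, 100, 0, 0]))
    print(trinomial_score_variance(wins=100, draws=0, losses=100))
```

In this constructed example the pair scores never vary, so the 5-nomial variance is zero, while the trinomial model still reports the variance of 200 independent decisive games.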

For further discussion on the potential use of unbalanced opening positions in engine testing see the posting by [[Kai Laskos]] <ref>[http://talkchess.com/forum/viewtopic.php?t=61245 Properties of unbalanced openings using Bayeselo model] by [[Kai Laskos]], [[CCC]], August 27, 2016</ref> .

==SPRT==
The [https://en.wikipedia.org/wiki/Sequential_probability_ratio_test sequential probability ratio test] (SPRT) is a specific [https://en.wikipedia.org/wiki/Sequential_analysis sequential hypothesis test] - a statistical analysis where the [https://en.wikipedia.org/wiki/Sample_size_determination sample size] is not fixed in advance - developed by [[Mathematician#AWald|Abraham Wald]] <ref>[[Mathematician#AWald|Abraham Wald]] ('''1945'''). ''Sequential Tests of Statistical Hypotheses''. [https://en.wikipedia.org/wiki/Annals_of_Mathematical_Statistics Annals of Mathematical Statistics], Vol. 16, No. 2, [https://en.wikipedia.org/wiki/Digital_object_identifier doi]: [http://projecteuclid.org/euclid.aoms/1177731118 10.1214/aoms/1177731118]</ref> . While originally developed for use in quality control studies in the realm of manufacturing, SPRT has been formulated for use in the computerized testing of human examinees as a termination criterion <ref>[https://en.wikipedia.org/wiki/Sequential_probability_ratio_test Sequential probability ratio test from Wikipedia]</ref>. As mentioned by [[Arthur Guez]] in his 2015 Ph.D. thesis ''Sample-based Search Methods for Bayes-Adaptive Planning'' <ref>[[Arthur Guez]] ('''2015'''). ''Sample-based Search Methods for Bayes-Adaptive Planning''. Ph.D. thesis, Gatsby Computational Neuroscience Unit, [https://en.wikipedia.org/wiki/University_College_London University College London], [http://www.gatsby.ucl.ac.uk/~aguez/files/guez_phdthesis2015.pdf pdf]</ref>, [[Alan Turing]], assisted by [[Jack Good]], used a similar sequential testing technique to help decipher [https://en.wikipedia.org/wiki/Enigma_machine Enigma codes] at [https://en.wikipedia.org/wiki/Bletchley_Park Bletchley Park] <ref>[[Jack Good]] ('''1979'''). ''[https://www.jstor.org/stable/2335677 Studies in the history of probability and statistics. XXXVII AM Turing’s statistical work in World War II]''. [https://en.wikipedia.org/wiki/Biometrika Biometrika], Vol. 66, No. 2</ref>.
SPRT is applied in [[Stockfish]] testing to terminate self-testing series early if the result is likely outside a given Elo window <ref>[http://www.open-chess.org/viewtopic.php?f=5&t=2477 How (not) to use SPRT ?] by [[Mark Watkins|BB+]], [[Computer Chess Forums|OpenChess Forum]], October 19, 2013</ref> . In August 2016, [[Michel Van den Bergh]] posted the following [[Python]] code in [[CCC]] to implement the SPRT a la [[Cutechess-cli]] or [[Stockfish#TestingFramework|Fishtest]]: <ref>[http://talkchess.com/forum/viewtopic.php?t=57465&start=19 Re: The SPRT without draw model, elo model or whatever..] by [[Michel Van den Bergh]], [[CCC]], August 18, 2016</ref> <ref>[http://hardy.uhasselt.be/Toga/GSPRT_approximation.pdf GSPRT approximation] (pdf) by [[Michel Van den Bergh]]</ref>
<pre>
from __future__ import division

import math

def LL(x):
    return 1/(1+10**(-x/400))

def LLR(W,D,L,elo0,elo1):
    """
    This function computes the log likelihood ratio of H0:elo_diff=elo0 versus
    H1:elo_diff=elo1 under the logistic elo model

    expected_score=1/(1+10**(-elo_diff/400)).

    W/D/L are respectively the Win/Draw/Loss count. It is assumed that the outcomes of
    the games follow a trinomial distribution with probabilities (w,d,l). Technically
    this is not quite an SPRT but a so-called GSPRT as the full set of parameters (w,d,l)
    cannot be derived from elo_diff, only w+(1/2)d. For a description and properties of
    the GSPRT (which are very similar to those of the SPRT) see

    http://stat.columbia.edu/~jcliu/paper/GSPRT_SQA3.pdf

    This function uses the convenient approximation for log likelihood
    ratios derived here:

    http://hardy.uhasselt.be/Toga/GSPRT_approximation.pdf

    The previous link also discusses how to adapt the code to the 5-nomial model
    discussed above.
    """
    # avoid division by zero
    if W==0 or D==0 or L==0:
        return 0.0
    N=W+D+L
    w,d,l=W/N,D/N,L/N
    s=w+d/2
    m2=w+d/4
    var=m2-s**2
    var_s=var/N
    s0=LL(elo0)
    s1=LL(elo1)
    return (s1-s0)*(2*s-s0-s1)/var_s/2.0

def SPRT(W,D,L,elo0,elo1,alpha,beta):
    """
    This function sequentially tests the hypothesis H0:elo_diff=elo0 versus
    the hypothesis H1:elo_diff=elo1 for elo0<elo1. It should be called after
    each game until it returns either 'H0' or 'H1' in which case the test stops
    and the returned hypothesis is accepted.

    alpha is the probability that H1 is accepted while H0 is true
    (a false positive) and beta is the probability that H0 is accepted
    while H1 is true (a false negative). W/D/L are the current win/draw/loss
    counts, as before.
    """
    LLR_=LLR(W,D,L,elo0,elo1)
    LA=math.log(beta/(1-alpha))
    LB=math.log((1-beta)/alpha)
    if LLR_>LB:
        return 'H1'
    elif LLR_<LA:
        return 'H0'
    else:
        return ''
</pre>
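
To see how the stopping bounds relate to the log likelihood ratio, the following self-contained Python sketch restates the same approximation in compact form and returns the intermediate values alongside the decision. The function names, the sample counts and the choice of elo0=0, elo1=5, alpha=beta=0.05 are illustrative assumptions, not settings prescribed by the sources.

```python
import math


def logistic(elo):
    """Expected score under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def sprt_status(W, D, L, elo0, elo1, alpha, beta):
    """Return (llr, lower_bound, upper_bound, decision) for current counts."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    if W == 0 or D == 0 or L == 0:
        llr = 0.0  # same guard against division by zero as in the code above
    else:
        n = W + D + L
        w, d = W / n, D / n
        s = w + d / 2.0                      # observed score per game
        var_s = (w + d / 4.0 - s * s) / n    # variance of the observed score
        s0, s1 = logistic(elo0), logistic(elo1)
        llr = (s1 - s0) * (2.0 * s - s0 - s1) / var_s / 2.0
    if llr > upper:
        decision = "H1"
    elif llr < lower:
        decision = "H0"
    else:
        decision = ""  # keep playing
    return llr, lower, upper, decision


if __name__ == "__main__":
    # Hypothetical intermediate result of a running test
    print(sprt_status(400, 400, 300, elo0=0, elo1=5, alpha=0.05, beta=0.05))
```

With alpha equal to beta the two bounds are symmetric around zero, and the test continues for as long as the log likelihood ratio stays between them.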
<span id="TournamentManager"></span>
=Tournament Manager=
* [[Arena]]
* [[Amoeba#TournamentManager|Amoeba Tournament Manager]]
* [[ChessGUI]]
* [[Cutechess-cli]]
* [[LittleBlitzer]]

=See also=
* [[Automated Tuning]]
* [[Bishop versus Knight#WinningPercantages|Bishop versus Knight - Winning Percentages]]
* [[Chess Server]]
* [[Depth#DiminishingReturns|Depth | Diminishing Returns]]
* [[Draw]]
* [[Engine Rating Lists]]
* [[LOS Table]]
* [[Pawn Advantage, Win Percentage, and Elo]]
* [[Playing Strength]]
* [[Search Statistics]]
* [[Time Management#Time%20Controls|Time Controls]]
* [[Jean-Marc Alliot#WhoistheMaster|Who is the Master?]]

=Publications=
==1920 ...==
* [[Ernst Zermelo]] ('''1929'''). ''Die Berechnung der Turnier-Ergebnisse als ein Maximumproblem der Wahrscheinlichkeitsrechnung''. [http://gdz.sub.uni-goettingen.de/dms/load/img/?IDDOC=82727 pdf] (German)
* [[Mathematician#AWald|Abraham Wald]] ('''1945'''). ''Sequential Tests of Statistical Hypotheses''. [https://en.wikipedia.org/wiki/Annals_of_Mathematical_Statistics Annals of Mathematical Statistics], Vol. 16, No. 2, [https://en.wikipedia.org/wiki/Digital_object_identifier doi]: [http://projecteuclid.org/euclid.aoms/1177731118 10.1214/aoms/1177731118]
* [[Mathematician#AWald|Abraham Wald]] ('''1947'''). ''Sequential Analysis''. [https://en.wikipedia.org/wiki/John_Wiley_%26_Sons John Wiley and Sons], [http://www.abebooks.com/book-search/title/sequential-analysis/author/abraham-wald/ AbeBooks]
* [[Mathematician#RABradley|Ralph A. Bradley]], [[Mathematician#METerry|Milton E. Terry]] ('''1952'''). ''[http://biomet.oxfordjournals.org/content/39/3-4/324.citation Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons]''. [https://en.wikipedia.org/wiki/Biometrika Biometrika], Vol. 39, Nos. 3/4, [https://en.wikipedia.org/wiki/Digital_object_identifier doi]: 10.2307/2334029, [http://www.jstor.org/stable/2334029?seq=1#page_scan_tab_contents JSTOR 2334029]
==1960 ...==
* [[Mathematician#FNDavid|Florence Nightingale David]] ('''1962'''). ''[http://books.google.com/books/about/Games_Gods_and_Gambling.html?id=8ddP8zNx9nQC&redir_esc=y Games, Gods & Gambling: A History of Probability and Statistical Ideas]''. Dover Publications, ISBN-13: 978-0486400235
* [[Tony Marsland]], [[Paul Rushton]] ('''1973'''). ''[http://dl.acm.org/citation.cfm?id=805703 Mechanisms for Comparing Chess Programs].'' [[ACM 1973|ACM Annual Conference]], [http://webdocs.cs.ualberta.ca/~tony/OldPapers/Marsland-Rushton-ACM73 pdf]
* [[James Gillogly]] ('''1978'''). ''Performance Analysis of the Technology Chess Program''. Ph.D. Thesis. Tech. Report CMU-CS-78-189, [[Carnegie Mellon University]], [http://reports-archive.adm.cs.cmu.edu/anon/anon/usr/ftp/scan/CMU-CS-77-gillogly.pdf CMU-CS-77 pdf] » [[Tech]]
* [https://en.wikipedia.org/wiki/Arpad_Elo Arpad Elo] ('''1978'''). ''The Rating of Chessplayers, Past and Present''. Arco Publications <ref>[http://www.anusha.com/elosbook.htm Elo's Book: The Rating of Chess Players] by [[Sam Sloan]]</ref>
* [[David Cahlander]] ('''1979'''). ''Strength of a Chess Playing Computer''. [[ICGA Journal#2_1|ICCA Newsletter, Vol. 2, No. 1]]
* [[Jack Good]] ('''1979'''). ''On the Grading of Chess Players''. [[Personal Computing#3_3|Personal Computing, Vol. 3, No. 3]], pp. 47
==1980 ...==
* [[John F. White]] ('''1981'''). ''[http://yourcomputeronline.wordpress.com/2010/12/10/survey-chess-games/ Survey-Chess Games]''. [[Your Computer]], [http://yourcomputeronline.wordpress.com/2010/10/31/augustseptember-1981-contents-and-editorial/ August/September 1981] <ref>[https://en.wikipedia.org/wiki/The_Master_Game The Master Game from Wikipedia]</ref>
* [[Ken Thompson]] ('''1982'''). ''Computer Chess Strength''. [[Advances in Computer Chess 3]]
* [[Mathematician#DavidSiegmund|David Siegmund]] ('''1985'''). ''Sequential Analysis. Tests and confidence intervals''. [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]
* [[Hans Berliner]], [[Gordon Goetsch]], [[Murray Campbell]], [[Carl Ebeling]] ('''1989'''). ''Measuring the Performance Potential of Chess Programs'', [[Advances in Computer Chess 5]]
* [[Eric Hallsworth]] ('''1989'''). ''Playing Levels''. [[Selective Search|Computer Chess News Sheet]] 23, pp 2, [http://www.chesscomputeruk.com/SS_23.pdf pdf] hosted by [[Mike Watters]]
==1990 ...==
* [[Hans Berliner]], [[Gordon Goetsch]], [[Murray Campbell]], [[Carl Ebeling]] ('''1990'''). ''Measuring the Performance Potential of Chess Programs.'' [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 43, No. 1
* [http://www.ics.uci.edu/~sternh/ Hal Stern] ('''1990'''). ''Are all Linear Paired Comparison Models Equivalent''. [http://www.dtic.mil/dtic/tr/fulltext/u2/a236856.pdf pdf]
* [[Eric Hallsworth]] ('''1990'''). ''Speed, Processors and Ratings''. [[Selective Search|Computer Chess News Sheet]] 25, pp 6, [http://www.chesscomputeruk.com/SS_25.pdf pdf] hosted by [[Mike Watters]]
* [[Hans Berliner]], [[Danny Kopec]], [[Ed Northam]] ('''1991'''). ''A taxonomy of concepts for evaluating chess strength: examples from two difficult categories''. [[Advances in Computer Chess 6]], [http://www.sci.brooklyn.cuny.edu/%7Ekopec/Publications/Publications/O_20_C.pdf pdf]
* [[Steve Maughan]] ('''1992'''). ''Are You Sure It's Better?'' [[Selective Search]] 40, pp. 21, [http://www.chesscomputeruk.com/SS_40.pdf pdf] hosted by [[Mike Watters]]
* [[Warren D. Smith]] ('''1993'''). ''Rating Systems for Gameplayers, and Learning''. [http://scorevoting.net/WarrenSmithPages/homepage/ratingspap.ps ps]
* [[Robert Hyatt]], [[Monroe Newborn]] ('''1997'''). ''CRAFTY Goes Deep''. [[ICGA Journal#20_2|ICCA Journal, Vol. 20, No. 2]] » [[Crafty]]
==2000 ...==
* [[Ernst A. Heinz]] ('''2000'''). ''[http://link.springer.com/chapter/10.1007/3-540-45579-5_18 New Self-Play Results in Computer Chess]''. [[CG 2000]]
* [[Ernst A. Heinz]] ('''2001'''). ''Self-play Experiments in Computer Chess Revisited.'' [[Advances in Computer Games 9]]
* [[Ernst A. Heinz]] ('''2001'''). ''Modeling the “Go Deep” Behaviour of CRAFTY and DARK THOUGHT.'' [[Advances in Computer Games 9]] » [[Crafty]], [[Dark Thought]]
* [[Ernst A. Heinz]] ('''2001'''). ''Self-Play, Deep Search and Diminishing Returns.'' [[ICGA Journal#24_2|ICGA Journal, Vol. 24, No. 2]]
* [[Guy Haworth]] ('''2002'''). ''[http://centaur.reading.ac.uk/5952/ Self-play: Statistical Significance]''. [[7th Computer Olympiad#Workshop|7th Computer Olympiad Workshop]]
* [[Guy Haworth]] ('''2003'''). ''[http://centaur.reading.ac.uk/4549/ Self-Play: Statistical Significance]''. [[ICGA Journal#26_2|ICGA Journal, Vol. 26, No. 2]]
* [[Ernst A. Heinz]] ('''2003'''). ''Follow-Up on Self-Play, Deep Search, and Diminishing Returns.'' [[ICGA Journal#26_2|ICGA Journal, Vol. 26, No. 2]]
* [[Mathematician#DHunter|David R. Hunter]] ('''2004'''). ''MM Algorithms for Generalized Bradley-Terry Models''. [https://en.wikipedia.org/wiki/Annals_of_Statistics The Annals of Statistics], Vol. 32, No. 1, 384–406, [http://sites.stat.psu.edu/~dhunter/papers/bt.pdf pdf] <ref>[http://remi.coulom.free.fr/Bayesian-Elo/MMNotes.pdf Handwritten Notes on the 2004 David R. Hunter Paper 'MM Algorithms for Generalized Bradley-Terry Models'] by [[Rémi Coulom]]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44715 Derivation of bayeselo formula] by [[Rémi Coulom]], [[CCC]], August 07, 2012</ref> <ref>[https://en.wikipedia.org/wiki/Mm_algorithm Mm algorithm from Wikipedia]</ref> <ref>[https://en.wikipedia.org/wiki/Pairwise_comparison Pairwise comparison from Wikipedia]</ref>
==2005 ...==
* [[Jan Renze Steenhuisen]] ('''2005'''). ''New Results in Deep-Search Behaviour''. [[ICGA Journal#28_4|ICGA Journal, Vol. 28, No. 4]], [http://www.st.ewi.tudelft.nl/%7Erenze/doc/ICGA_2005_4_DeepSearch.pdf pdf]
* [[Matej Guid]], [[Ivan Bratko]] ('''2007'''). ''Factors affecting diminishing returns for searching deeper''. [[CGW 2007]] » [[Crafty]], [[Rybka]], [[Shredder]], [[Depth#DiminishingReturns|Diminishing Returns]]
* [[Jeff Rollason]] ('''2007'''). ''[http://www.aifactory.co.uk/newsletter/2007_04_stat_minefields.htm Statistical Minefields with Version Testing]''. [[AI Factory]], Winter 2007 » [[Engine Testing]]
* [[Shogo Takeuchi]], [[Tomoyuki Kaneko]], [[Kazunori Yamaguchi]], [[Satoru Kawai]] ('''2007'''). ''Visualization and Adjustment of Evaluation Functions Based on Evaluation Values and Win Probability''. [http://www.informatik.uni-trier.de/~ley/db/conf/aaai/aaai2007.html AAAI 2007], [https://www.aaai.org/Papers/AAAI/2007/AAAI07-136.pdf pdf]
* [[Rémi Coulom]] ('''2008'''). ''[http://link.springer.com/chapter/10.1007/978-3-540-87608-3_11 Whole-History Rating: A Bayesian Rating System for Players of Time-Varying Strength]''. [[CG 2008]], [http://remi.coulom.free.fr/WHR/WHR.pdf draft as pdf]
* [[Giuseppe Di Fatta]], [[Guy Haworth|Guy McCrossan Haworth]], [[Kenneth Wingate Regan]] ('''2009'''). ''Skill Rating by Bayesian Inference''. [http://www.informatik.uni-trier.de/~ley/db/conf/cidm/cidm2009.html CIDM 2009], [http://www.cse.buffalo.edu/~regan/papers/pdf/DFHR09.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Bayesian_inference Bayesian inference from Wikipedia]</ref>
* [[Guy Haworth|Guy McCrossan Haworth]], [[Kenneth Wingate Regan]], [[Giuseppe Di Fatta]] ('''2009'''). ''Performance and Prediction: Bayesian Modelling of Fallible Choice in Chess''. [[Advances in Computer Games 12]], [http://www.cse.buffalo.edu/faculty/regan/papers/pdf/HRdF10.pdf pdf]
==2010 ...==
* [[Diogo R. Ferreira]] ('''2010'''). ''[http://web.ist.utl.pt/diogo.ferreira/chess/ Predicting the Outcome of Chess Games based on Historical Data]''. [https://en.wikipedia.org/wiki/Instituto_Superior_T%C3%A9cnico IST] - [https://en.wikipedia.org/wiki/Technical_University_of_Lisbon Technical University of Lisbon] <ref>[http://blog.kaggle.com/2010/11/30/how-i-did-it-diogo-ferreira-on-4th-place-in-elo-chess-ratings-competition/ How I did it: Diogo Ferreira on 4th place in Elo chess ratings competition | no free hunch]</ref>
* [[Kenneth Wingate Regan|Kenneth W. Regan]], [[Guy Haworth]] ('''2011'''). ''[http://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/view/3779 Intrinsic Chess Ratings]''. [http://www.informatik.uni-trier.de/%7Eley/db/conf/aaai/aaai2011.html#ReganH11 AAAI 2011], [http://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/view/3779/3962 pdf], [http://www.cse.buffalo.edu/%7Eregan/Talks/IntrinsicRatings.pdf slides as pdf] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=65772 "Intrinsic Chess Ratings" by Regan, Haworth -- seq] by Kai Middleton, [[CCC]], November 19, 2017</ref>
* [[Kenneth Wingate Regan]], [[Bartłomiej Macieja]], [[Guy Haworth|Guy McCrossan Haworth]] ('''2011'''). ''[http://centaur.reading.ac.uk/23800/ Understanding Distributions of Chess Performances]''. [[Advances in Computer Games 13]], [http://www.cse.buffalo.edu/~regan/papers/pdf/RMH11.pdf pdf]
* [[Trevor Fenner]], [[Mark Levene]], [[Mathematician#GLoizou|George Loizou]] ('''2011'''). ''A Discrete Evolutionary Model for Chess Players' Ratings''. [http://arxiv.org/list/physics.soc-ph/recent Physics and Society], [http://arxiv.org/abs/1103.1530v2 arXiv:1103.1530v2]
* [[Rémi Coulom]] ('''2012'''). ''Paired Comparisons with Ties: Modeling Game Outcomes in Chess''. [http://www.grappa.univ-lille3.fr/~coulom/ChessOutcomes.pdf pdf preprint] <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=471004&t=44180 Re: EloStat, Bayeselo and Ordo] by [[Rémi Coulom]], [[CCC]], June 25, 2012</ref>
* [[Diogo R. Ferreira]] ('''2012'''). ''Determining the Strength of Chess Players based on actual Play''. [[ICGA Journal#35_1|ICGA Journal, Vol. 35, No. 1]]
* [[Diogo R. Ferreira]] ('''2013'''). ''The Impact of the Search Depth on Chess Playing Strength''. [[ICGA Journal#36_2|ICGA Journal, Vol. 36, No. 2]]
* [[Miguel A. Ballicora]] ('''2014'''). ''ORDO v0.9.6 Ratings for chess and other games''. September 2014, [https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxnYXZpb3RhY2hlc3NlbmdpbmV8Z3g6NmQ0NmNhNGM4YjA3YTc5ZQ pdf] » [[Ordo]] <ref>[https://sites.google.com/site/gaviotachessengine/ordo Ordo] by [[Miguel A. Ballicora]]</ref>
* [[Don Dailey]], [[Adam Hair]], [[Mark Watkins]] ('''2014'''). ''[http://www.sciencedirect.com/science/article/pii/S1875952113000177 Move Similarity Analysis in Chess Programs]''. [http://www.journals.elsevier.com/entertainment-computing/ Entertainment Computing], Vol. 5, No. 3, [http://magma.maths.usyd.edu.au/~watkins/papers/DHW.pdf preprint as pdf] <ref>[http://www.top-5000.nl/clone.htm A Pairwise Comparison of Chess Engine Move Selections] by [[Adam Hair]], hosted by [[Ed Schroder|Ed Schröder]]</ref>
* [[Kenneth Wingate Regan|Kenneth W. Regan]], [[Tamal T. Biswas]], [[Jason Zhou]] ('''2014'''). ''Human and Computer Preferences at Chess''. [http://www.cse.buffalo.edu/~regan/papers/pdf/RBZ14aaai.pdf pdf]
* [[Erik Varend]] ('''2014'''). ''Quality of play in chess and methods for measuring''. [http://www.chessanalysis.ee/Quality%20of%20play%20in%20chess%20and%20methods%20for%20measuring.pdf pdf] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=54571 Questions regarding rating systems of humans and engines] by [[Erik Varend]], [[CCC]], December 06, 2014</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60721 chess statistics scientific article] by Nuno Sousa, [[CCC]], July 06, 2016</ref>
==2015 ...==
* [[Tamal T. Biswas]], [[Kenneth Wingate Regan|Kenneth W. Regan]] ('''2015'''). ''Quantifying Depth and Complexity of Thinking and Knowledge''. [http://www.icaart.org/EuropeanProjectSpace.aspx?y=2015 ICAART 2015], [http://www.cse.buffalo.edu/~regan/papers/pdf/BiReICAART15CR.pdf pdf]
* [[Tamal T. Biswas]], [[Kenneth Wingate Regan|Kenneth W. Regan]] ('''2015'''). ''Measuring Level-K Reasoning, Satisficing, and Human Error in Game-Play Data''. [[IEEE]] [http://www.icmla-conference.org/icmla15/ ICMLA 2015], [http://www.cse.buffalo.edu/~regan/papers/pdf/BiRe15_ICMLA2015.pdf pdf preprint]
* [[Guy Haworth]], [[Tamal T. Biswas]], [[Kenneth Wingate Regan|Kenneth W. Regan]] ('''2015'''). ''[http://centaur.reading.ac.uk/39431/ A Comparative Review of Skill Assessment: Performance, Prediction and Profiling]''. [[Advances in Computer Games 14]]
* [[Shogo Takeuchi]], [[Tomoyuki Kaneko]] ('''2015'''). ''[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7336038 Estimating Ratings of Computer Players by the Evaluation Scores and Principal Variations in Shogi]''. [http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=7335993 ACIT-CSI]
* [[Jean-Marc Alliot]] ('''2017'''). ''Who is the Master?'' [[ICGA Journal#39_1|ICGA Journal, Vol. 39, No. 1]], [http://www.alliot.fr/CHESS/draft-icga-39-1.pdf draft as pdf] » [[Stockfish]], [[Jean-Marc Alliot#WhoistheMaster|Who is the Master?]]

=Forum & Blog Postings=
==1996 ...==
* [https://groups.google.com/d/msg/rec.games.chess.computer/GkgFc3jOl84/vWn-SG8kVboJ Theoretical chess rating question...] by Cyber Linguist, [[Computer Chess Forums|rgcc]], April 17, 1996
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/3014e4ea570dbf38 Statistical validity of medium-length match results] by [[Bruce Moreland]], [[Computer Chess Forums|rgcc]], February 15, 1997
* [https://www.stmintz.com/ccc/index.php?id=52542 ELO performance?] by [[Stefan Meyer-Kahlen]], [[CCC]], May 22, 1999 » [[Pawn Advantage, Win Percentage, and Elo]], [[Playing Strength]]
==2000 ...==
* [https://www.stmintz.com/ccc/index.php?id=98901 Some thougths about statistics] by Martin Schubert, [[CCC]], February 24, 2000
* [https://www.stmintz.com/ccc/index.php?id=174578 Who is better? Some statistics...] by [[Gian-Carlo Pascutto]], [[CCC]], June 11, 2001
* [https://www.stmintz.com/ccc/index.php?id=178041 Simulating the result of a single game by random numbers] by Christoph Fieberg, [[CCC]], July 03, 2001
* [https://www.stmintz.com/ccc/index.php?id=179091 Simulating the result of a single game by random numbers - Update!] by Christoph Fieberg, [[CCC]], July 10, 2001
* [https://www.stmintz.com/ccc/index.php?id=178939 Simulating the result of a single game by random numbers - Update!] by Christoph Fieberg, [[CCC]], August 02, 2001
* [https://www.stmintz.com/ccc/index.php?id=226275 ELO & statistics question] by [[Gian-Carlo Pascutto]], [[CCC]], April 26, 2002
* [https://www.stmintz.com/ccc/index.php?id=267056 Statistical significance of a match result] by [[Rémi Coulom]], [[CCC]], November 23, 2002
* [https://www.stmintz.com/ccc/index.php?id=275347 Value of playing different versions of a program against each other] by [[Tom King]], [[CCC]], January 06, 2003
* [https://www.stmintz.com/ccc/index.php?id=340148 A question about statistics...] by [[Roger Brown]], [[CCC]], January 04, 2004
* [https://www.stmintz.com/ccc/index.php?id=377487 New tool to estimate the statistical significance of match results] by [[Rémi Coulom]], [[CCC]], July 17, 2004
* [http://www.open-aurec.com/wbforum/viewtopic.php?t=949 ELOStat algorithm ?] by [[Rémi Coulom]], [[Computer Chess Forums|Winboard Forum]], December 10, 2004 » [[EloStat]]
==2005 ...==
* [https://www.stmintz.com/ccc/index.php?id=411278 bayeselo: new Elo-rating tool, applied to CCT7] by [[Rémi Coulom]], [[CCC]], February 13, 2005 » [[CCT7]]
* [https://www.stmintz.com/ccc/index.php?id=484357 table for detecting significant difference between two engines] by [[Joseph Ciarrochi]], [[CCC]], February 03, 2006 <ref>[http://www.husvankempen.de/nunn/rating/tablejoseph.htm LOS Table] by [[Joseph Ciarrochi]] from [[CEGT]]</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=14107 Observator bias or...] by [[Alessandro Scotti]], [[CCC]], May 29, 2007
* [http://www.talkchess.com/forum/viewtopic.php?t=13800 a beat b,b beat c,c beat a question] by [[Uri Blass]], [[CCC]], May 16, 2007
* [http://www.talkchess.com/forum/viewtopic.php?t=16545 how to do a proper statistical test] by [[Rein Halbersma]], [[CCC]], September 19, 2007
* [http://www.talkchess.com/forum/viewtopic.php?t=27516 Elo Calcuation] by [[Edmund Moshammer]], [[CCC]], April 19, 2009
* [http://www.talkchess.com/forum/viewtopic.php?t=30624 Likelihood of superiority] by [[Marco Costalba]], [[CCC]], November 15, 2009
==2010 ...==
* [http://www.talkchess.com/forum/viewtopic.php?t=31699 Engine Testing - Statistics] by [[Edmund Moshammer]], [[CCC]], January 14, 2010
: [http://www.talkchess.com/forum/viewtopic.php?t=31699&start=4 Re: Engine Testing - Statistics] by John Major, [[CCC]], January 14, 2010
* [http://www.talkchess.com/forum/viewtopic.php?t=34989 Chess Statistics] by [[Edmund Moshammer]], [[CCC]], June 17, 2010
* [http://www.talkchess.com/forum/viewtopic.php?t=36592 Do You really need 1000s of games for testing?] by [[Jouni Uski]], [[CCC]], November 04, 2010
* [http://www.talkchess.com/forum/viewtopic.php?t=36979 GUI idea: Testing until certainty] by [[Albert Silver]], [[CCC]], December 07, 2010
* [http://www.talkchess.com/forum/viewtopic.php?t=37056 SPRT and Engine testing] by [[Adam Hair]], [[CCC]], December 13, 2010 » [[Match Statistics#SPRT|SPRT]]
'''2011'''
* [http://www.talkchess.com/forum/viewtopic.php?t=39511 Ply vs ELO] by Andriy Dzyben, [[CCC]], June 28, 2011
* [http://www.talkchess.com/forum/viewtopic.php?t=40193 One billion random games] by [[Steven Edwards]], [[CCC]], August 27, 2011
* [http://www.talkchess.com/forum/viewtopic.php?t=41341 Increase in Elo ..Question For The Experts] by [[Steve Blincoe|Steve B]], [[CCC]], December 05, 2011
'''2012'''
* [http://www.talkchess.com/forum/viewtopic.php?t=42729 Advantage for White; Bayeselo (to Rémi Coulom)] by [[Edmund Moshammer]], [[CCC]], March 03, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=44219 Human Elo ratings: averages and standard deviations] by [[Jesús Muñoz]], [[CCC]], March 18, 2012 <ref>[http://en.chessbase.com/post/arpad-elo-and-the-elo-rating-system Arpad Elo and the Elo Rating System] by [[Dan Ross]], [[ChessBase|ChessBase News]], December 16, 2007</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=42998 Elo uncertainties calculator] by [[Jesús Muñoz]], [[CCC]], March 24, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=43134 Elo versus speed] by [[Peter Österlund]], [[CCC]], April 02, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=44003 Rybka odds matches and the strength of engines] by [[Kai Laskos]], [[CCC]], June 09, 2012 » [[Rybka]]
* [http://www.talkchess.com/forum/viewtopic.php?t=44147 A new way to compare chess programs] by [[Larry Kaufman]], [[CCC]], June 21, 2012 » [[Komodo]]
* [http://www.talkchess.com/forum/viewtopic.php?t=44180 EloStat, Bayeselo and Ordo] by [[Kai Laskos]], [[CCC]], June 24, 2012 » [[EloStat]], [[Bayeselo]], [[Ordo]]
* [http://www.talkchess.com/forum/viewtopic.php?t=44657 about error margins?] by [[Fermin Serrano]], [[CCC]], August 01, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=44670 normal vs logistic curve for elo model] by [[Daniel Shawul]], [[CCC]], August 02, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=44715 Derivation of bayeselo formula] by [[Rémi Coulom]], [[CCC]], August 07, 2012 <ref>[[Mathematician#DHunter|David R. Hunter]] ('''2004'''). ''MM Algorithms for Generalized Bradley-Terry Models''. [https://en.wikipedia.org/wiki/Annals_of_Statistics The Annals of Statistics], Vol. 32, No. 1, 384–406, [http://sites.stat.psu.edu/~dhunter/papers/bt.pdf pdf]</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=45158 Yet Another Testing Question] by [[Brian Richardson]], [[CCC]], September 15, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=45174 margin of error] by [[Larry Kaufman]], [[CCC]], September 16, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=45257 Average number of plies in {1-0, ½-½, 0-1}] by [[Jesús Muñoz]], [[CCC]], September 21, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=45287 Another testing question] by [[Larry Kaufman]], [[CCC]], September 23, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=45406 LOS calculation: Does the same result is always the same?] by [[Marco Costalba]], [[CCC]], October 01, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=45788 LOS (again)] by [[Ed Schroder|Ed Schröder]], [[CCC]], October 30, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=46370 Elo points gain from doubling time] by [[Kai Laskos]], [[CCC]], December 10, 2012
* [http://www.talkchess.com/forum/viewtopic.php?t=46572 A word for casual testers] by [[Don Dailey]], [[CCC]], December 25, 2012
'''2013'''
* [http://www.talkchess.com/forum/viewtopic.php?t=46759 A poor man's testing environment] by [[Ed Schroder|Ed Schröder]], [[CCC]], January 04, 2013 <ref>[http://www.top-5000.nl/tuning.htm Testing a chess engine from the ground up] from [http://www.top-5000.nl/ Home of the Dutch Rebel] by [[Ed Schroder|Ed Schröder]]</ref> » [[Engine Testing]]
* [http://www.talkchess.com/forum/viewtopic.php?t=46786 Noise in ELO estimators: a quantitative approach] by [[Marco Costalba]], [[CCC]], January 06, 2013
* [http://www.talkchess.com/forum/viewtopic.php?t=47086 Updated Dendrogram] by [[Kai Laskos]], [[CCC]], February 02, 2013
* [http://www.talkchess.com/forum/viewtopic.php?t=47469 Experiment: influence of colours at fixed depth] by [[Jesús Muñoz]], [[CCC]], March 10, 2013
* [http://www.open-chess.org/viewtopic.php?f=5&t=2296 LOS] by [[Mark Watkins|BB+]], [[Computer Chess Forums|OpenChess Forum]], March 31, 2013
* [http://www.talkchess.com/forum/viewtopic.php?t=47885 Fishtest Distributed Testing Framework] by [[Marco Costalba]], [[CCC]], May 01, 2013
* [http://www.talkchess.com/forum/viewtopic.php?t=48649 The influence of the length of openings] by [[Kai Laskos]], [[CCC]], July 14, 2013
* [http://www.talkchess.com/forum/viewtopic.php?t=48733 Scaling at 2x nodes (or doubling time control)] by [[Kai Laskos]], [[CCC]], July 23, 2013 » [[Match Statistics#DoublingTC|Doubling TC]], [[Depth#DiminishingReturns|Diminishing Returns]], [[Playing Strength]], [[Houdini]]
* [http://www.talkchess.com/forum/viewtopic.php?t=48863 Type I error in LOS based early stopping rule] by [[Kai Laskos]], [[CCC]], August 06, 2013 <ref>[https://en.wikipedia.org/wiki/Type_I_and_type_II_errors Type I and type II errors from Wikipedia]</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=48864 How much elo is pondering worth] by [[Michel Van den Bergh]], [[CCC]], August 07, 2013 » [[Pondering]]
* [http://www.talkchess.com/forum/viewtopic.php?t=49248 Contempt and the ELO model] by [[Michel Van den Bergh]], [[CCC]], September 05, 2013 » [[Contempt Factor]]
* [http://www.talkchess.com/forum/viewtopic.php?t=49393 1 draw=1 win + 1 loss (always!)] by [[Michel Van den Bergh]], [[CCC]], September 19, 2013
* [http://www.talkchess.com/forum/viewtopic.php?t=49584 SPRT and narrowing of (elo1 - elo0) difference] by [[Jesús Muñoz]], [[CCC]], October 05, 2013 » [[Match Statistics#SPRT|SPRT]]
* [http://www.talkchess.com/forum/viewtopic.php?t=49727 sprt and margin of error] by [[Larry Kaufman]], [[CCC]], October 15, 2013 » [[Match Statistics#SPRT|SPRT]]
* [http://www.open-chess.org/viewtopic.php?f=5&t=2477 How (not) to use SPRT ?] by [[Mark Watkins|BB+]], [[Computer Chess Forums|OpenChess Forum]], October 19, 2013
* [http://www.talkchess.com/forum/viewtopic.php?t=50266 Houdini, much weaker engines, and Arpad Elo] by [[Kai Laskos]], [[CCC]], November 29, 2013 » [[Houdini]], [[Pawn Advantage, Win Percentage, and Elo]] <ref>[https://en.wikipedia.org/wiki/Arpad_Elo Arpad Elo - Wikipedia]</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=50332 Testing on time control versus nodes | ply] by [[Ed Schroder|Ed Schröder]], [[CCC]], December 04, 2013
'''2014'''
* [http://www.talkchess.com/forum/viewtopic.php?t=51003 Calculating the LOS (likelihood of superiority) from results] by Robert Tournevisse, [[CCC]], January 22, 2014
* [http://www.open-chess.org/viewtopic.php?f=5&t=2578 LOS --> Draws are irrelevant] by [[Dann Corbit|User923005]], [[Computer Chess Forums|OpenChess Forum]], January 24, 2014
* [http://www.talkchess.com/forum/viewtopic.php?t=52746 Empirically 1 win + 1 loss ~ 2 draws] by [[Kai Laskos]], [[CCC]], June 24, 2014
* [http://www.talkchess.com/forum/viewtopic.php?t=53645 Ordo 0.9.6] by [[Miguel A. Ballicora]], [[CCC]], September 10, 2014 » [[Ordo]]
* [https://chesscomputer.tumblr.com/post/98632536555/using-the-stockfish-position-evaluation-score-to/embed Using the Stockfish position evaluation score to predict victory probability] by unavoidablegrain, [https://en.wikipedia.org/wiki/Tumblr Tumblr], September 28, 2014 » [[Pawn Advantage, Win Percentage, and Elo]], [[Stockfish]]
* [http://www.talkchess.com/forum/viewtopic.php?t=53891 Elo estimation using quasi-Monte Carlo integration] by Branko Radovanovic, [[CCC]], September 30, 2014
* [http://www.talkchess.com/forum/viewtopic.php?t=54331 SPRT question] by [[Robert Hyatt]], [[CCC]], November 13, 2014 » [[Match Statistics#SPRT|SPRT]]
* [http://www.talkchess.com/forum/viewtopic.php?t=54359 Usage sprt / cutechess-cli] by [[Michael Hoffmann]], [[CCC]], November 16, 2014 » [[Cutechess-cli]], [[Match Statistics#SPRT|SPRT]]
==2015 ...==
* [http://www.talkchess.com/forum/viewtopic.php?t=55130 2-SPRT] by [[Michel Van den Bergh]], [[CCC]], January 28, 2015 » [[Match Statistics#SPRT|SPRT]]
* [http://www.talkchess.com/forum/viewtopic.php?t=55893 Script for computing SPRT probabilities] by [[Michel Van den Bergh]], [[CCC]], April 05, 2015
* [http://www.talkchess.com/forum/viewtopic.php?t=56067 Maximum ELO gain per test game played?] by Forrest Hoch, [[CCC]], April 20, 2015
* [http://www.talkchess.com/forum/viewtopic.php?t=56095 Getting SPRT right] by [[Alexandru Mosoi]], [[CCC]], April 22, 2015 » [[Match Statistics#SPRT|SPRT]]
* [http://www.talkchess.com/forum/viewtopic.php?t=56358 SPRT questions] by [[Uri Blass]], [[CCC]], May 15, 2015 » [[Match Statistics#SPRT|SPRT]]
* [http://www.talkchess.com/forum/viewtopic.php?t=56426 Adam Hair's article on Pairwise comparison of engines] by [[Charles Roberson]], [[CCC]], May 19, 2015 <ref>[http://www.top-5000.nl/clone.htm A Pairwise Comparison of Chess Engine Move Selections] by [[Adam Hair]], hosted by [[Ed Schroder|Ed Schröder]]</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=57223 computing elo of multiple chess engines] by [[Alexandru Mosoi]], [[CCC]], August 09, 2015
* [http://www.talkchess.com/forum/viewtopic.php?t=57270 Some musings about search] by [[Ed Schroder|Ed Schröder]], [[CCC]], August 14, 2015 » [[Automated Tuning]], [[Search]]
* [http://www.talkchess.com/forum/viewtopic.php?t=57437 Bullet vs regular time control, say 40/4m CCRL/CEGT] by [[Ed Schroder|Ed Schröder]], [[CCC]], August 29, 2015
* [http://www.talkchess.com/forum/viewtopic.php?t=57465 The SPRT without draw model, elo model or whatever...] by [[Michel Van den Bergh]], [[CCC]], September 01, 2015 » [[Match Statistics#SPRT|SPRT]]
: [http://talkchess.com/forum/viewtopic.php?t=57465&start=19 Re: The SPRT without draw model, elo model or whatever..] by [[Michel Van den Bergh]], [[CCC]], August 18, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=57482 Name for elo without draws?] by [[Marcel van Kervinck]], [[CCC]], September 02, 2015
* [http://www.talkchess.com/forum/viewtopic.php?t=57696 The future of chess and elo ratings] by [[Larry Kaufman]], [[CCC]], September 20, 2015 » [[Opening Book]]
* [https://rjlipton.wordpress.com/2015/10/06/depth-of-satisficing/ Depth of Satisficing] by [[Kenneth Wingate Regan|Ken Regan]], [https://rjlipton.wordpress.com/ Gödel's Lost Letter and P=NP], October 06, 2015 » [[Depth]], [[Match Statistics]], [[Pawn Advantage, Win Percentage, and Elo]], [[Stockfish]], [[Komodo]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=57890 Regan's latest: Depth of Satisficing] by [[Carl Lumma]], [[CCC]], October 09, 2015</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=57969 ELO error margin] by [[Fabio Gobbato]], [[CCC]], October 17, 2015
* [http://www.talkchess.com/forum/viewtopic.php?t=58067 testing multiple versions & elo calculation] by [[Folkert van Heusden]], [[CCC]], October 27, 2015
* [http://www.talkchess.com/forum/viewtopic.php?t=58543 A simple expression] by [[Kai Laskos]], [[CCC]], December 09, 2015
* [http://www.talkchess.com/forum/viewtopic.php?t=58600 Counting 1 win + 1 loss as 2 draws] by [[Kai Laskos]], [[CCC]], December 15, 2015
'''2016'''
* [https://rjlipton.wordpress.com/2016/01/21/a-chess-firewall-at-zero/ A Chess Firewall at Zero?] by [[Kenneth Wingate Regan|Ken Regan]], [https://rjlipton.wordpress.com/ Gödel's Lost Letter and P=NP], January 21, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=59038 Ordo 1.0.9 (new features for testers)] by [[Miguel A. Ballicora]], [[CCC]], January 25, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=59333 Why the errorbar is wrong ... simple example!] by [[Frank Quisinsky]], [[CCC]], February 23, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=59332 a direct comparison of FIDE and CCRL rating systems] by [[Erik Varend]], [[CCC]], February 22, 2016 » [[FIDE]], [[CCRL]]
* [http://www.talkchess.com/forum/viewtopic.php?t=59406 Some properties of the Type I error in p-value stopping rule] by [[Kai Laskos]], [[CCC]], March 01, 2016
* [https://blog.ebemunk.com/a-visual-look-at-2-million-chess-games/ A Visual Look at 2 Million Chess Games - Thinking Through the Party] by [[Buğra Fırat]], March 02, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=60508 Type I error for p-value stopping: Balanced and Unbalanced] by [[Kai Laskos]], [[CCC]], June 16, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=60791 Empirically Logistic ELO model better suited than Gaussian] by [[Kai Laskos]], [[CCC]], July 12, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=60960 Testing resolution and combining results] by [[Daniel José Queraltó]], [[CCC]], July 28, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=61105 Error margins via resampling (jackknifing)] by [[Kai Laskos]], [[CCC]], August 12, 2016 <ref>[https://en.wikipedia.org/wiki/Resampling_(statistics) Resampling (statistics) from Wikipedia]</ref> <ref>[https://en.wikipedia.org/wiki/Jackknife_resampling Jackknife resampling from Wikipedia]</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=61245 Properties of unbalanced openings using Bayeselo model] by [[Kai Laskos]], [[CCC]], August 27, 2016 » [[Opening Book]]
* [http://www.talkchess.com/forum/viewtopic.php?t=61444 ELO inflation ha ha ha] by [[Henk van den Belt]], [[CCC]], September 16, 2016 » [[Delphil]], [[Stockfish]], [[Playing Strength]], [[TCEC Season 9]] <ref>[http://tcec.chessdom.com/archive.php?se=9&rapid&ga=163 Delphil 3.3b2 (2334) - Stockfish 030916 (3228), TCEC Season 9 - Rapid, Round 11], September 16, 2016</ref>
: [http://www.talkchess.com/forum/viewtopic.php?t=61444&start=17 About expected scores and draw ratios] by [[Jesús Muñoz]], [[CCC]], September 17, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=61506 The scaling with time of opening books] by [[Kai Laskos]], [[CCC]], September 23, 2016 » [[Opening Book]]
* [http://www.talkchess.com/forum/viewtopic.php?t=61548 Perfect play] by Patrik Karlsson, [[CCC]], September 28, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=61601 Stockfish underpromotes much more often than Komodo] by [[Kai Laskos]], [[CCC]], October 05, 2016 » [[Komodo]], [[Promotions]], [[Stockfish]]
* [http://www.talkchess.com/forum/viewtopic.php?t=61636 Differences between top engines related to "style"] by [[Kai Laskos]], [[CCC]], October 07, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=61781 SPRT when not used for self testing] by [[Andrew Grant]], [[CCC]], October 21, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=61784 Doubling of time control] by [[Andreas Strangmüller]], [[CCC]], October 21, 2016 » [[Match Statistics#DoublingTC|Doubling TC]], [[Depth#DiminishingReturns|Diminishing Returns]], [[Playing Strength]], [[Komodo]]
* [http://www.talkchess.com/forum/viewtopic.php?t=62146 Stockfish 8 - Double time control vs. 2 threads] by [[Andreas Strangmüller]], [[CCC]], November 15, 2016 » [[Match Statistics#DoublingTC|Doubling TC]], [[Depth#DiminishingReturns|Diminishing Returns]], [[Playing Strength]], [[Stockfish]]
* [https://rjlipton.wordpress.com/2016/11/30/when-data-serves-turkey/ When Data Serves Turkey] by [[Kenneth Wingate Regan|Ken Regan]], [https://rjlipton.wordpress.com/ Gödel's Lost Letter and P=NP], November 30, 2016
* [https://rjlipton.wordpress.com/2016/12/08/magnus-and-the-turkey-grinder/ Magnus and the Turkey Grinder] by [[Kenneth Wingate Regan|Ken Regan]], [https://rjlipton.wordpress.com/ Gödel's Lost Letter and P=NP], December 08, 2016 » [[Pawn Advantage, Win Percentage, and Elo]] <ref>[https://en.wikipedia.org/wiki/World_Chess_Championship_2016 World Chess Championship 2016 from Wikipedia]</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=62435 Regan's conundrum] by [[Carl Lumma]], [[CCC]], December 09, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=62438 Statistical Interpretation] by [[Dennis Sceviour]], [[CCC]], December 10, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=62510 Absolute ELO scale] by [[Nicu Ionita]], [[CCC]], December 17, 2016
* [http://www.talkchess.com/forum/viewtopic.php?t=62598 A question about SPRT] by [[Andrew Grant]], [[CCC]], December 25, 2016 » [[Match Statistics#SPRT|SPRT]]
* [http://www.talkchess.com/forum/viewtopic.php?t=62622 Diminishing returns and hyperthreading] by [[Kai Laskos]], [[CCC]], December 27, 2016 » [[Depth#DiminishingReturns|Diminishing Returns]], [[Playing Strength]], [[Thread]]
'''2017'''
* [http://www.talkchess.com/forum/viewtopic.php?t=62868 Progress in 30 years by four intervals of 7-8 years] by [[Kai Laskos]], [[CCC]], January 19, 2017 » [[Playing Strength]]
* [http://www.talkchess.com/forum/viewtopic.php?t=62922 sprt tourney manager] by [[Richard Delorme]], [[CCC]], January 24, 2017 » [[Amoeba#TournamentManager|Amoeba Tournament Manager]], [[Match Statistics#SPRT|SPRT]]
* [http://www.talkchess.com/forum/viewtopic.php?t=63327 Binomial distribution for chess statistics] by [[Lyudmil Antonov]], [[CCC]], March 03, 2017
* [http://www.talkchess.com/forum/viewtopic.php?t=63355 Higher than expected by me efficiency of Ponder ON] by [[Kai Laskos]], [[CCC]], March 06, 2017 » [[Pondering]]
* [http://www.talkchess.com/forum/viewtopic.php?t=63572 What can be said about 1 - 0 score?] by [[Kai Laskos]], [[CCC]], March 28, 2017
* [http://www.talkchess.com/forum/viewtopic.php?t=63652 6-men Syzygy from HDD and USB 3.0] by [[Kai Laskos]], [[CCC]], April 04, 2017 » [[Komodo]], [[Playing Strength]], [[Syzygy Bases]], [[Memory#USB3|USB 3.0]]
* [http://www.talkchess.com/forum/viewtopic.php?t=63687 Scaling of engines from FGRL rating list] by [[Kai Laskos]], [[CCC]], April 07, 2017 » [[FGRL]]
* [http://www.talkchess.com/forum/viewtopic.php?t=63763 Low impact of opening phase in engine play?] by [[Kai Laskos]], [[CCC]], April 18, 2017 » [[Opening]]
* [http://www.talkchess.com/forum/viewtopic.php?t=63813 How to simulate a game outcome given Elo difference?] by [[Nicu Ionita]], [[CCC]], April 25, 2017
* [http://www.talkchess.com/forum/viewtopic.php?t=63875 Wilo rating properties from FGRL rating lists] by [[Kai Laskos]], [[CCC]], May 01, 2017 » [[FGRL]]
* [http://www.talkchess.com/forum/viewtopic.php?t=63888 MATCH sanity] by [[Ed Schroder|Ed Schröder]], [[CCC]], May 03, 2017 » [[Portable Game Notation]]
* [http://www.talkchess.com/forum/viewtopic.php?t=63903 Symmetric multiprocessing (SMP) scaling - SF8 and K10.4] by [[Andreas Strangmüller]], [[CCC]], May 05, 2017 » [[Lazy SMP]], [[Komodo]], [[Stockfish]]
* [http://www.talkchess.com/forum/viewtopic.php?t=63955 Symmetric multiprocessing (SMP) scaling - K10.4 Contempt=0] by [[Andreas Strangmüller]], [[CCC]], May 11, 2017 » [[SMP]], [[Komodo]], [[Contempt Factor]]
* [http://www.talkchess.com/forum/viewtopic.php?t=63967 Symmetric multiprocessing (SMP) scaling - SF8 Contempt=10] by [[Andreas Strangmüller]], [[CCC]], May 13, 2017 » [[SMP]], [[Stockfish]], [[Contempt Factor]]
* [http://www.talkchess.com/forum/viewtopic.php?t=64084 Likelihood Of Success (LOS) in the real world] by [[Kai Laskos]], [[CCC]], May 26, 2017
* [http://www.talkchess.com/forum/viewtopic.php?t=64358 Opening testing suites efficiency] by [[Kai Laskos]], [[CCC]], June 21, 2017 » [[Engine Testing]], [[Opening]]
* [http://www.talkchess.com/forum/viewtopic.php?t=64394 Testing A against B by playing a pool of others] by [[Andrew Grant]], [[CCC]], June 24, 2017
* [http://www.talkchess.com/forum/viewtopic.php?t=64519 Engine testing & error margin ?] by [[Mahmoud Uthman]], [[CCC]], July 05, 2017
* [http://www.talkchess.com/forum/viewtopic.php?t=64683 Invariance with time control of rating schemes] by [[Kai Laskos]], [[CCC]], July 22, 2017 <ref>[http://hardy.uhasselt.be/Toga/normalized_elo.pdf Normalized Elo] (pdf) by [[Michel Van den Bergh]]</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=64719 Ways to avoid "Draw Death" in Computer Chess] by [[Kai Laskos]], [[CCC]], July 25, 2017
* [http://www.talkchess.com/forum/viewtopic.php?t=64824&start=2 SMP NPS measurements] by [[Peter Österlund]], [[CCC]], August 06, 2017 » [[Lazy SMP]], [[Parallel Search]], [[Nodes per second]]
: [http://www.talkchess.com/forum/viewtopic.php?t=64824&start=3 ELO measurements] by [[Peter Österlund]], [[CCC]], August 06, 2017 » [[Playing Strength]]
* [http://www.talkchess.com/forum/viewtopic.php?t=65061 What is a Match?] by [[Henk van den Belt]], [[CCC]], September 01, 2017
* [http://www.talkchess.com/forum/viewtopic.php?t=65288 Scaling from FGRL results with top 3 engines] by [[Kai Laskos]], [[CCC]], September 26, 2017 » [[FGRL]], [[Houdini]], [[Komodo]], [[Stockfish]]
* [http://www.talkchess.com/forum/viewtopic.php?t=65764 Statistical interpretation of search and eval scores] by [[J. Wesley Cleveland]], [[CCC]], November 18, 2017 » [[Pawn Advantage, Win Percentage, and Elo]], [[Score]]
* [http://www.talkchess.com/forum/viewtopic.php?t=65772 "Intrinsic Chess Ratings" by Regan, Haworth -- seq] by Kai Middleton, [[CCC]], November 19, 2017
: [http://www.talkchess.com/forum/viewtopic.php?t=65772&start=2 Re: "Intrinsic Chess Ratings" by Regan, Haworth --] by [[Kenneth Wingate Regan|Kenneth Regan]], [[CCC]], November 20, 2017 » [[Jean-Marc Alliot#WhoistheMaster|Who is the Master?]]
* [http://www.talkchess.com/forum/viewtopic.php?t=66000 ELO progression measured by year] by [[Ed Schroder|Ed Schröder]], [[CCC]], December 13, 2017
'''2018'''
* [https://groups.google.com/d/msg/fishcooking/nqgLNUfjkok/gfMr7amXCAAJ Wrong use of SPRT] by [[Uri Blass]], [[Computer Chess Forums|FishCooking]], February 09, 2018 » [[Contempt Factor]], [[Match Statistics#SPRT|SPRT]]
* [http://www.talkchess.com/forum/viewtopic.php?t=66775 Feed bayeselo with pure game results without PGN] by [[Sergei Markoff|Sergei S. Markoff]], [[CCC]], March 08, 2018
* [http://www.talkchess.com/forum/viewtopic.php?t=66793 Elo measurement of contempt in SF in self-play] by [[Michel Van den Bergh]], [[CCC]], March 10, 2018 » [[Contempt Factor|Contempt]], [[Playing Strength]], [[Stockfish]]
* [http://www.talkchess.com/forum/viewtopic.php?t=66821 Time control envelope in top engines could be improved?] by [[Kai Laskos]], [[CCC]], March 13, 2018 » [[Time Management]]
* [http://www.talkchess.com/forum/viewtopic.php?t=66945 LCZero: Progress and Scaling. Relation to CCRL Elo] by [[Kai Laskos]], [[CCC]], March 28, 2018 » [[LCZero]]
* [http://www.talkchess.com/forum/viewtopic.php?t=66969 Elostat Question] by [[Michael Sherwin]], [[CCC]], March 30, 2018

=External Links=
* [http://www.top-5000.nl/tuning.htm Testing a chess engine from the ground up] from [http://www.top-5000.nl/ Home of the Dutch Rebel] by [[Ed Schroder|Ed Schröder]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=46759 A poor man's testing environment] by [[Ed Schroder|Ed Schröder]], [[CCC]], January 04, 2013</ref> » [[Engine Testing]]
* [http://www.top-5000.nl/match.htm MATCH - eng-eng utility] by [[Ed Schroder|Ed Schröder]]
* [http://walkofmind.com/programming/chess/mat_stats.html Statistics of material imbalances in chess games] by [[Alessandro Scotti]] » [[Material]]
==Rating Systems==
* [https://en.wikipedia.org/wiki/Chessmetrics Chessmetrics from Wikipedia]
* [https://en.wikipedia.org/wiki/Chess_rating_systems Chess rating system from Wikipedia]
* [https://en.wikipedia.org/wiki/Elo_rating_system Elo rating system from Wikipedia]
* [http://en.chessbase.com/post/arpad-elo-and-the-elo-rating-system Arpad Elo and the Elo Rating System] by [[Dan Ross]], [[ChessBase|ChessBase News]], December 16, 2007
* [http://wismuth.com/elo/calculator.html Elo Win Probability Calculator] by [[Mathematician#FLabelle|François Labelle]]
* [http://www.husvankempen.de/nunn/rating/tablejoseph.htm LOS Table] by [[Joseph Ciarrochi]] from [[CEGT]] <ref>[https://www.stmintz.com/ccc/index.php?id=484357 table for detecting significant difference between two engines] by [[Joseph Ciarrochi]], [[CCC]], February 03, 2006</ref>
* [http://kirill-kryukov.com/chess/kcec/draw_rate.html Kirr's Chess Engine Comparison KCEC - Draw rate] » [[KCEC]]
* [http://www.chessanalysis.ee/chessanalysiseng.htm Chessanalysis homepage] by [[Erik Varend]] <ref>[http://www.hiarcs.net/forums/viewtopic.php?t=8526 an interesting study from Erik Varend] by scandien, [[Computer Chess Forums|Hiarcs Forum]], August 13, 2017</ref>
* [http://www.alliot.fr/CHESS/ficga.html.en Who is the Master?] by [[Jean-Marc Alliot]] » [[Jean-Marc Alliot#WhoistheMaster|Who is the Master?]]
* [https://news.cnrs.fr/articles/how-should-chess-players-be-rated How Should Chess Players Be Rated?] by [https://news.cnrs.fr/authors/martin-koppe Martin Koppe], [https://news.cnrs.fr/ CNRS News], April 25, 2017
* [https://en.chessbase.com/post/ranking-chess-players-according-to-the-quality-of-their-moves Ranking chess players according to the quality of their moves] by [[Frederic Friedel]], [[ChessBase|ChessBase News]], April 27, 2017
* [http://rebel13.nl/misc/stats.html Rating List Stats] by [[Ed Schroder|Ed Schröder]] » [[CCRL]]
==Tools==
* [http://remi.coulom.free.fr/Bayesian-Elo/ BayesElo] by [[Rémi Coulom]] builds tournament statistics from a [[Portable Game Notation|PGN file]]
* [http://wbec-ridderkerk.nl/html/download.htm Elostat] by [[Frank Schubert]] builds tournament statistics from a [[Portable Game Notation|PGN file]]
* [http://www.dirtychess.com/tools.php Online Elo-Calculator] by [[Pradu Kannan]]
* [https://sites.google.com/site/gaviotachessengine/ordo Ordo] by [[Miguel A. Ballicora]]
==Statistics==
* [https://en.wikipedia.org/wiki/Statistics Statistics from Wikipedia]
* [https://en.wikipedia.org/wiki/Statistical_assumptions Statistical assumption from Wikipedia]
* [https://en.wikipedia.org/wiki/Statistical_inference Statistical inference from Wikipedia]
* [https://en.wikipedia.org/wiki/Probability_theory Probability theory from Wikipedia]
* [https://en.wikipedia.org/wiki/Probability Probability from Wikipedia]
* [https://en.wikipedia.org/wiki/Probability_density_function Probability density function from Wikipedia]
* [https://en.wikipedia.org/wiki/Likelihood_function Likelihood function from Wikipedia]
* [https://en.wikipedia.org/wiki/P-value p-value from Wikipedia]
* [https://en.wikipedia.org/wiki/Misunderstandings_of_p-values Misunderstandings of p-values from Wikipedia]
* [https://en.wikipedia.org/wiki/Probability_distribution Probability distribution from Wikipedia]
* [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution from Wikipedia]
* [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function from Wikipedia]
* [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution from Wikipedia]
* [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution from Wikipedia]
* [https://en.wikipedia.org/wiki/Normal_distribution Normal distribution from Wikipedia]
* [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation from Wikipedia]
* [https://en.wikipedia.org/wiki/Standard_error_%28statistics%29 Standard error from Wikipedia]
* [https://en.wikipedia.org/wiki/Error_bar Error bar from Wikipedia]
* [https://en.wikipedia.org/wiki/Margin_of_error Margin of error from Wikipedia]
* [https://en.wikipedia.org/wiki/Confidence_interval Confidence interval from Wikipedia]
* [https://en.wikipedia.org/wiki/Statistical_hypothesis_testing Statistical hypothesis testing from Wikipedia]
* [https://en.wikipedia.org/wiki/Sequential_probability_ratio_test Sequential probability ratio test from Wikipedia]
* [https://en.wikipedia.org/wiki/Null_hypothesis Null hypothesis from Wikipedia]
* [https://en.wikipedia.org/wiki/Alternative_hypothesis Alternative hypothesis from Wikipedia]
* [https://en.wikipedia.org/wiki/Two-tailed_test Two-tailed test from Wikipedia]
* [https://en.wikipedia.org/wiki/Type_I_and_type_II_errors Type I and type II errors from Wikipedia]
* [https://www.khanacademy.org/math/probability/statistics-inferential/hypothesis-testing/v/type-1-errors Type 1 Errors | Hypothesis testing with one sample] | [https://en.wikipedia.org/wiki/Khan_Academy Khan Academy]
==Data Visualization==
* [http://www.top-5000.nl/clone.htm A Pairwise Comparison of Chess Engine Move Selections] by [[Adam Hair]], hosted by [[Ed Schroder|Ed Schröder]] <ref>[https://en.wikipedia.org/wiki/UPGMA UPGMA from Wikipedia]</ref> <ref>[http://www.southampton.ac.uk/~re1u06/teaching/upgma/ UPGMA Worked Example] by [http://www.southampton.ac.uk/biosci/about/staff/re1u06.page? Richard Edwards]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=56426 Adam Hair's article on Pairwise comparison of engines] by [[Charles Roberson]], [[CCC]], May 19, 2015</ref> <ref>[[Don Dailey]], [[Adam Hair]], [[Mark Watkins]] ('''2014'''). ''[http://www.sciencedirect.com/science/article/pii/S1875952113000177 Move Similarity Analysis in Chess Programs]''. [http://www.journals.elsevier.com/entertainment-computing/ Entertainment Computing], Vol. 5, No. 3, [http://magma.maths.usyd.edu.au/~watkins/papers/DHW.pdf preprint as pdf]</ref>
* [https://blog.ebemunk.com/a-visual-look-at-2-million-chess-games/ A Visual Look at 2 Million Chess Games - Thinking Through the Party] by [[Buğra Fırat]], March 02, 2016 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=65610 A Visual Look at 2 Million Chess Games] by Brahim Hamadicharef, [[CCC]], November 02, 2017</ref>
* [https://github.com/ebemunk/chess-dataviz GitHub - ebemunk/chess-dataviz: chess visualization library written for d3.js] by [[Buğra Fırat]] » [[JavaScript]]
* [https://github.com/ebemunk/pgnstats GitHub - ebemunk/pgnstats: parses PGN files and extracts statistics for chess games] by [[Buğra Fırat]] » [[Go (Programming Language)]], [[Portable Game Notation]]
==Misc==
* [[Videos#JeffBeck|Jeff Beck]], [[Videos#JanHammer|Jan Hammer]], [https://en.wikipedia.org/wiki/Fernando_Saunders Fernando Saunders], [[Videos#SimonPhillips|Simon Phillips]] - [https://en.wikipedia.org/wiki/Definitely_Maybe_%28disambiguation%29 Definitely Maybe], [https://en.wikipedia.org/wiki/YouTube YouTube] Video
: [https://en.wikipedia.org/wiki/ARMS_Charity_Concerts ARMS Charity Concert], [https://en.wikipedia.org/wiki/Madison_Square_Garden Madison Square Garden], [http://forums.ledzeppelin.com/index.php?/topic/394-arms-benefit-concerts-in-nyc-dec-1983/ December 08, 1983]
: {{#evu:https://www.youtube.com/watch?v=IdVJw-b3HHE|alignment=left|valignment=top}}

=References=
<references />

'''[[Engine Testing|Up one level]]'''
