Endgame Tablebases

From Chessprogramming wiki
Jump to: navigation, search

Home * Knowledge * Endgame Tablebases

Samuel Bak - Persistence [1]

Endgame Tablebases, (EGTBs or EGTs)
precalculated endgame tables generated by an exhaustive retrograde analysis. During its game play and/or search, recognizing a specific material composition, a chess program can probe, or in principle compute these tables to determine the outcome of positions definitively and act as an oracle providing the optimal moves. The tables are usually divided into separate files for each material configuration and side to move. What all different formats of tablebases have in common is that every possible piece placement has its own, unique index number which represents the position where the information about the position is located in the file. Positions with non-null castling rights could be represented in an EGTB but typically are not.

The term Tablebase opposed to endgame database was coined by Steven Edwards by means of a data vector and a single file access for both sides to move, when he introduced his Edwards' Tablebases, addressing several shortcomings of Ken Thompson's earlier database of positions with the weaker and hence potentially losing side to move only [2] .

Purposes

When a chess game goes to the endgame phase, it may complete soon with the result as a draw or a win/loss. Both human/chess engines can finish the endgame by continuing searching, applying some endgame rules. Even the game can finish in this way but it is not in a perfect way: it takes time to compute moves, may use more moves than necessary. Sometimes the winning side may lose the change since the rule 50 moves and/or making blunders. Both human and chess engines may miss badly the chances if they don’t know or not good enough about the endgame they are playing.

An EGTB can help to solve this period. It is a kind of a dictionary of all endgame positions which can answer immediately for a given position:

  • If it is a draw or a win/loss
  • How far/how many move from matting/being mated
  • What is the best move

The main advantages of using EGTBs:

  • Engines don’t have to search but retrieves data thus they can move instantly
  • Moves are the perfect ones, leading the shortest line to win
  • Engines don’t need to have the knowledge and code about those endgames

The main disadvantages of using EGTBs:

  • Their data typically are too large for downloading and storing
  • They may slow down the search
  • Require complicated codes for probing

Data

An EGTB is a set of endgames. Each endgame is a set of records about positions’ information. All involving positions in an endgame must have the same material. Typically each position's record associates with an integer number which informs how far that position is from mating/being mated or converting (depends on the type of its metrics).

Based on values of those integer numbers it can answer directly two questions of the EGTB: 1) for a given position it is a draw or a win/loss position and 2) how far it is from mating/being mated or converting. To answer the 3rd question, the best move of a position can be indirectly calculated: generate all legal moves, make them, probe all new positions, compare probing scores for the highest one and the associated move is the best one.

Theoretically, we can add more information to each position’s record. However, due to large numbers of involving positions, any additional information may make the whole EGTB becomes significantly larger. Thus so far none of popular EGTBs store any extra information and their records actually contain those integers only.

All records of an endgame are simply organized as two arrays (one array for one side/color). In other words, each item in those arrays is an integer and its index on an array is mapped to a unique position. For a given chess position, we will calculate its index then access its data record/integer via that index without searching.

Endgame files

Those arrays of an endgame can be stored in files. Typically each array is stored in one file. However, sometimes they can be combined into one larger file or splitted into more files to easier copy or manage. Those files may be in compressed or uncompressed forms. If they are compressed, they are typically divided and compressed by small blocks thus they don't need to load and decompress all data to extract just a position.

In memory

Some chess engines create themselves EGTBs on-flight and keep all data in memory without saving/loading to files. Due to limit by initial time they can generate and use very small EGTBs only.

Sizes

The size of an endgame depends on:

  • The number of involving pieces: more pieces, the bigger size
  • Type of pieces: for standard chess, more Pawns is the smaller size
  • Type of metric
  • The largest value: the integer number should be large enough to cover the largest value. Different endgames will have different largest values. Thus some endgames need only one byte per item but some need 2 bytes
  • Indexing: The algorithms to map between an array item and a position
  • Compress ratio

So far the size of an EGTB is the most important factor of success.

Metrics

DTM

Depth to Mate 'Mate' is the ultimate goal and ends the game. For each position that is represented, 'DTM' indicates the theoretical value, and the number of winner's moves to 'Mate' if won/lost - assuming that the winner is minimizing and the loser is maximizing DTM.

DTC

Depth to Conversion 'Conversion' is the changing of the force on the board, by capture and/or Pawn-conversion. For each position that is represented, 'DTC' indicates the theoretical value, and the number of winner's moves to the goal of 'Won Conversion' if won/lost - assuming that the winner is minimizing and the loser is maximizing DTC. DTC £ DTM.

DTZ

Depth to Zeroing (of the ply count) The ply-count is zeroed if and only if a Pawn is pushed or a capture is made. For each position that is represented, 'DTZ' indicates the theoretical value, and the number of winner's moves to the goal of 'Win and ply-count zeroed' if won/lost - assuming that the winner is minimizing and the loser is maximizing DTZ. DTZ £ DTC £ DTM.

DTZ50

Depth to Zeroing (of the ply count) in the context of the 50-Move Rule The DTC, DTM and DTZ metrics do not consider the current FIDE 50-move rule whereby a game in which no irreversible moves have been played for 100 plies can be claimed as a draw by the player 'on move'. In this metric, in addition to the DTC/DTM/DTZ draws, positions requiring more than 50 winner's moves to zero the ply-count are defined as draws. DTZ £ DTZ50 if DTZ50 ¹ draw. 'DTZ50' is one of a set of 'DTZk' rules: a k-move rule effectively applies if there are only k moves left before a draw-claim might be made. k1 £ k2 implies DTZk2 £ DTZk1 if DTZk1 ¹ draw.

Note that, because depth is measured in winner's moves rather than plies, this metric does not define as draw a position in which 50 winner's moves and 51 loser's moves are required to zero the ply-count, despite the fact that the loser would claim a draw after 100 plies had been played in that endgame.

DTR

Depth by the (k-move) Rule 'The Rule' referred to here is a notional set of k-move rules. For each position that is represented, 'DTR' indicates the theoretical value and, for won/lost positions, the least k for which DTZk is not draw - assuming that the winner is minimising and the loser is maximising DTR. DTZ £ DTR. No DTR EGTs have been created to date [3] .

DTZR

Depth to Zeroing move, give that DTR is being minimaxed If DTR = draw, then DTZR = draw. Otherwise, Otherwise DTRZR indicates the theoretical value, and the number of winner's moves to the zeroing of the ply count, given that the two sides are minimaxing first DTR and then DTZR. DTZR would be used in conjunction with DTR to progress a deep win in the context of a k-move rule because DTR can remain unchanged over a sequence of moves. DTZR £ DTR but it is not always the case that DTZ £ DTZR as may seem intuitively obvious: the loser may zero the move-count to maximise DTR.

Summary

To summarize the different metrics of tablebases, each metric has its strengths and weaknesses. Nalimov's DTM EGTs are widely and commercially available, and used by many chess engines or chess GUIs. DTC EGTs are more easily computed and stored as they involve smaller depth-ranges and are more compressible. For endgames with less or equal than 5 pieces on the board, KNNKP comes closest to requiring more than one Byte per position to store DTM: maxDTM = 115 (1-0) and 73 (0-1) necessitating 190 values in the EGT. For KPPKP, maxDTM = 127 (1-0) but only 42 (0-1). But for certain 6 piece endgames, one Byte is not enough: maxDTM = 262 for a KRNKNN position. The DTC and DTZ metrics increasingly postpone the need for two Bytes per position in an uncompressed EGT, but KRBNKQN has maxDTC = maxDTZ = 517 (0-1), both wtm and btm.

In extremis, he unmoderated use of DTM, DTC or DTZ EGTs will let slip a winnable position in the context of the 50-move rule ... hence the marginal need for the DTR and DTZR metrics, coupled with the ability to recompute DTR/DTZR EGTs in the context of a limited move-budget.

Bitbases

see main article Endgame Bitbases

Inside the search much denser tables with value {won, draw, loss, invalid} or boolean {win, not_win} information are sufficient with respect to bounds. Denser tables are more efficiently used, given that memory and cache are limited resources. Boolean Win/Not_Win Bitbases require only one bit per position (and if necessary, further enquiry if draw is to be distinguished from loss with respect to bounds). Value Bitbases require two bits per position.

Indexing

Every position is assigned to a unique index to specify the location of the information stored about it. The main purpose of indexing is to locate and read information (from databases) such as distance to mate for a given position based on its index.

There are two main aims when generating the index number. The easier the algorithm to generate the index from any given position, the faster will be the probing of the tablebase as well as the generation of it. But the second aim is to get a relatively small range of index-numbers, to keep the size of the file as small as possible.

The most straight forward way would be to for every piece on the board use 6 bit (1-64 squares)

index = position_white_King + position_black_King * 64

for each piece {
    index = index * 64 + position
}

Almost all work on this field is to reduce index spaces and reduce data sizes (e.g., by compressing). The above indexing can be optimised in several ways.

Remove redundant symmetrical indexes:

  • A chess board can be rotated and mirrored (if pawns are on the board only a vertical reflection is possible).
  • For pieces of the same type, different arrangements can be left away.

Remove illegal indexes:

  • Pieces may not stand on the same square.
  • The side to move may not give check.
  • The distance between the two Kings must be greater than one.

Retrieving

Probe

When a chess program (e.g., a chess engine or a chess GUI) works with an EGTB, it needs to retrieve data (integer numbers) for some given chess positions. The process named probe and has below steps:

  • convert a given chess position into an index via indexing algorithms
  • access data (typically organised as arrays of integers) using the index to retrieve the integer of that chess position

If data of that endgame is not ready in memory, usually the probe process has to do some extra work:

  • From data index calculate the block index
  • Read data of that block from storage into memory
  • Uncompress the block data (if it is compressed)
  • Convert the endgame index into block index and retrieve the needed data

Search

After probing the chess engine may have the result as an integer number for the queried position. Depending on the metric, the number may be a distance to mate or distance to conversion, the position is a draw or a win or loss for the querying side. Sometimes that value (state of draw/win/loss) may be enough for searching. However, sometimes that state is not enough and it needs to know which is the best move to make. The solution is to generate all legal moves from that position, make one by one, probe that EGTB for values of each new position. After all, compare the values of all children's positions to find which one is the best to make.

Speed

Typically the probe process requires some computing and it may access storage and decompress which is usually so slow. An engine that probes EGTB when searching may make the whole search be slower. Sometimes the benefit from probing EGTBs maybe not enough to cover the loss of slower search. That’s why the probing should be planned and implemented carefully. EGTB files may have to store in fast storage and/or using some huge systems’ caches. The good point is that the engine can stop searching on the branch probed the EGTB.

Brief Info

of some popular Endgame tablebases:

Format Metric 1st published 5 men 6 men 7 men
Thompson DTC 1991 2.5 GiB (not completed) - -
Edwards DTM 1994 56 GiB (estimated) - -
Nalimov DTM 1998 7.1 GiB 1.2 TiB -
Scorpio WDL 2005 214 MiB 48.1 GiB -
Gaviota DTM 2008 6.5 GiB - -
Lomonosov DTM 2012 - - 140 TiB
Syzygy WDL + DTZ50 2013 / 2018 939 MiB 150.2 GB 17 TB

Formats

Generators

7-man EGTBs

2005

In 2005, Yakov Konoval started to collaborate with Marc Bourzutschky on constructing 7-man EGTBs. Yakov wrote the generator and Marc a verification program and some utilities for extracting the data. In October 2005, Yakov Konoval and Marc Bourzutschky announced that a position in the ending of a KRRNkrr requires 290 moves to convert to a simpler winning endgame [4] . The old record was 243 moves from a position in a rook and knight versus two knights endgame, discovered by Lewis Stiller in 1991 [5] .

2006

In March 2006, the wizards of 7-men endgames, Marc Bourzutschky and Yakov Konoval found a 330 moves win in KQBNkqb [6] and in May 2006 a 517 moves win in KQNkrbn [7] .

2012

In July 2012, Vladimir Makhnychev and Victor Zakharov from Moscow State University completed 4+3 DTM-tablebases (525 endings including KPPPKPP). The next set of 5+2 DTM-tablebases was completed during August 2012 [8] . The tablebases are named Lomonosov Tablebases, using the Lomonosov Supercomputer.

2013

Convekta Ltd. announced the release of Lomonosov Tablebases. Until December 31, 2013, the tablebases will be available online through ChessOK products [9].

2018

Starting in March 2018, seven piece Syzygy Bases, originally created for up to six pieces by Ronald de Man in 2013, were generated by Bojun Guo, finished in August 2018 [10].

See also

Selected Publications

1965 ...

1970 ...

1975 ...

1985 ...

1986

1987

1988

1989

1990 ...

1991

1992

1993

1994

1995 ...

1996

1997

1999

2000 ...

2001

Guy Haworth (2001). Discarding Like Pieces. 6th Computer Olympiad Workshop
Guy Haworth (2001). Depth by The Rule. 6th Computer Olympiad Workshop
Guy Haworth (2001). 3-5-Man Chess Data. 6th Computer Olympiad Workshop
Guy Haworth (2001). 3-5-Man Mutual Zugzwangs in Chess. 6th Computer Olympiad Workshop

2002

2003

2004

2005 ...

2006

2008

2009

2010 ...

2011

2013

2014

2015 ...

2016

2017

2018

2019

2020 ...

Forum Posts

1995 ...

overlapping tablebase lookup by Jay Scott, CCC, February 09, 1999

2000 ...

Wu/Beal predates Koistinen by Guy Haworth, CCC, December 04, 2001

2005 ...

Re: Endgame Metrics by John Kominek, CCRL Discussion Board, March 12, 2008

2010 ...

2011

2012

2013

2014

2015 ...

2016

Re: Nalimov EGTB problem related to DTM? by Ronald de Man, CCC, February 14, 2016
EPD to test EGTB probe code by Sergei S. Markoff, CCC, May 24, 2016

2017

2018

2019

2020 ...

2021

Re: knbk DTM by Harm Geert Muller, CCC, May 05, 2021 » Marcel van Kervinck's pretty fast kbnk generator

2022

External Links

Compression Schemes for Gaviota Tablebases

GitHub

ChessBase

ChessOK

Online Lookup

Download

Statistics

References

  1. Chess in the Art of Samuel Bak, Center for Holocaust & Genocide Studies, University of Minnesota
  2. Ernst A. Heinz (1999). Endgame Databases and Efficient Index Schemes for Chess. ICCA Journal, Vol. 22, No. 1, ps
  3. DTR is no good! by Harm Geert Muller, CCRL Forum, September 24, 2008
  4. Subject: KRRNKRR win in 290: a new record by Marc Bourzutschky from CCC, October 16, 2005
  5. Chess endgame from Wikipedia
  6. 311. 31 March 2006: White plays and wins in 330 moves by Tim Krabbé
  7. OPEN CHESS DIARY 316. 26 May 2006: A 517-move win by Tim Krabbé
  8. Endgame tablebase from Wikipedia - Background
  9. Lomonosov Endgame Tablebases - ChessOK
  10. Re: 7-men Syzygy attempt by Bojun Guo, CCC, August 19, 2018
  11. Subject: brute-force vs. intuition in math & chess by Bill Dubuque, August 1996
  12. Jesper Torp Kristensen (2005). Generation and compression of endgame tables in chess with fast random access using OBDDs. Master thesis, supervisor Peter Bro Miltersen, Aarhus University
  13. Knowledge discovery from Wikipedia
  14. Chris Wirth's endgame tablebases by Denis P. Mendoza, CCRL Endgame Tablebases, December 13, 2007 » Chessterfield
  15. Kasparov versus the World from Wikipedia
  16. Generating egtbs ICGAJ by Tony Werten, CCC, December 04, 2001, with reference to Computing endgames with few men by Urban Koistinen
  17. Wu / Beal retrograde analisys algorithm by Alvaro Jose Povoa Cardoso, Winboard Forum, March 10, 2007
  18. OBDD - Ordered Binary Decision Diagram from Wikipedia
  19. Plakoto from Wikipedia
  20. Endgame tablebases with the fifty-move rule by Galen Huntington
  21. tablebase compression / academic integrity by Ronald de Man, CCC, May 19, 2016
  22. Computing endgames with few men by Urban Koistinen
  23. Ren Wu, Don Beal (2001). Fast, Memory-efficient Retrograde Algorithms. ICGA Journal, Vol. 24, No. 3
  24. Computing endgames with few men by Urban Koistinen
  25. Easy mate by Frank Phillips, CCC, September 13, 2003 » Checkmate
  26. A018214 - OEIS | Alkane (or paraffin) numbers
  27. Lomonosov Endgame Tablebases - ChessOK
  28. kervinck/pfkpk · GitHub
  29. io_uring.pdf
  30. Gaviota tablebases, probing code v4 (UPDATE) by Miguel A. Ballicora, CCC, March 11, 2011
  31. EGTB: Better algorithm by Urban Koistinen, CCC, April 07, 2001
  32. Generating egtbs ICGAJ by Tony Werten, CCC, December 04, 2001
  33. Ren Wu, Don Beal (2001). Fast, Memory-efficient Retrograde Algorithms. ICGA Journal, Vol. 24, No. 3
  34. New 6-piece tablebases by Ronald de Man, CCC, April 01, 2013
  35. Scorpio 6men EGBB Now available by Joshua Shriver, CCC, January 14, 2014
  36. Yet another KPK endgame table generator: pfkpk by Marcel van Kervinck, CCC, September 05, 2015

Up one level