Zobrist Hashing




Zobrist Hashing is a technique for transforming a board position of arbitrary size into a number of fixed length, with an approximately uniform distribution over all possible numbers. It was invented by Albert Zobrist, who described it in a 1970 technical report. In an early Usenet post in 1982, Tom Truscott mentioned Jim Gillogly's n-bit hashing technique; Gillogly apparently read Zobrist's paper early and credits Zobrist in a 1997 rgcc post. Zobrist Hashing is an instance of tabulation hashing, a method for constructing universal families of hash functions by combining table lookup with exclusive-or operations. The method was rediscovered by J. Lawrence Carter and Mark N. Wegman in 1977 and studied in more detail by Mihai Pătrașcu and Mikkel Thorup in 2011.

The main purpose of Zobrist hash codes in chess programming is to get an almost unique index number for any chess position, with the very important requirement that two similar positions generate entirely different indices. These index numbers are used for faster and more space-efficient hash tables or databases, e.g. transposition tables and opening books.

=Metamorphosis= M. C. Escher, Metamorphosis III, 1967-1968

=Initialization=
At program initialization, we generate an array of pseudorandom numbers:
 * One number for each piece at each square
 * One number to indicate the side to move is black
 * Four numbers to indicate the castling rights, though usually 16 (2^4) are used for speed
 * Eight numbers to indicate the file of a valid en passant square, if any

This leaves us with an array of 781 (12*64 + 1 + 4 + 8) random numbers. Since pawns never stand on the first or eighth rank, the 32 pawn entries for those ranks are never used, so one might fit all keys into 12*64 numbers by reusing them. There are even proposals and implementations that use overlapping keys read by unaligned access, down to an array of only 12 numbers, one for every piece, with that number rotated by the square.

Programs usually implement their own pseudorandom number generator (PRNG), both to get better quality random numbers than standard library functions provide and for reproducibility. This means that whatever platform the program is run on, it will use the exact same set of Zobrist keys. This is also useful for things like opening books, where the positions in the book can be stored by hash key and used portably across machines, provided endianness is taken into account.
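The initialization described above can be sketched as follows. This is a minimal illustration rather than any particular engine's code: the generator is SplitMix64, chosen here only as one example of a small self-contained PRNG, and the names `splitmix64`, `make_zobrist_keys`, the dictionary layout and the seed value are all assumptions made for this example.

```python
MASK64 = (1 << 64) - 1

def splitmix64(seed):
    """Yield an endless stream of 64-bit pseudorandom integers (SplitMix64)."""
    state = seed & MASK64
    while True:
        state = (state + 0x9E3779B97F4A7C15) & MASK64
        z = state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        yield z ^ (z >> 31)

def make_zobrist_keys(seed=0x123456789ABCDEF0):  # seed is an arbitrary assumption
    """Build the 781 keys: 12*64 piece-square, 1 side, 4 castling, 8 en passant."""
    rng = splitmix64(seed)
    return {
        "piece": [[next(rng) for _ in range(64)] for _ in range(12)],
        "black_to_move": next(rng),
        "castling": [next(rng) for _ in range(4)],
        "ep_file": [next(rng) for _ in range(8)],
    }

keys = make_zobrist_keys()
```

Because the generator and seed are fixed in the program itself, the same table is produced on every platform, which is the reproducibility property the text asks for.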

=Runtime=
If we now want to get the Zobrist hash code of a certain position, we initialize the hash key by xoring all random numbers linked to the features present in that position, e.g. for the initial position:
 [Hash for White Rook on a1] xor [Hash for White Knight on b1] xor [Hash for White Bishop on c1] xor ... ( all pieces ) ... xor [Hash for White kingside castling] xor [Hash for White queenside castling] xor ... ( all castling rights )
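As a rough sketch of computing a key from scratch, the following assumes a board given as a dictionary from square index (0 = a1 .. 63 = h8) to a piece letter, and uses Python's `random.Random` with a fixed seed purely for brevity. The names `PIECE_KEYS`, `zobrist_hash` and `start_position` are hypothetical and not taken from any engine.

```python
import random

PIECES = "PNBRQKpnbrqk"          # white uppercase, black lowercase (assumed encoding)
rng = random.Random(2024)        # fixed seed, chosen arbitrarily for this sketch
PIECE_KEYS = {p: [rng.getrandbits(64) for _ in range(64)] for p in PIECES}
BLACK_TO_MOVE = rng.getrandbits(64)
CASTLING_KEYS = {c: rng.getrandbits(64) for c in "KQkq"}
EP_FILE_KEYS = [rng.getrandbits(64) for _ in range(8)]

def zobrist_hash(board, white_to_move, castling, ep_file=None):
    """Xor together the keys of every feature present in the position."""
    h = 0
    for sq, piece in board.items():
        h ^= PIECE_KEYS[piece][sq]
    if not white_to_move:
        h ^= BLACK_TO_MOVE
    for right in castling:           # e.g. "KQkq"
        h ^= CASTLING_KEYS[right]
    if ep_file is not None:
        h ^= EP_FILE_KEYS[ep_file]
    return h

def start_position():
    """Return the initial chess position as {square: piece}."""
    board = {}
    back_rank = "RNBQKBNR"
    for f in range(8):
        board[f] = back_rank[f]               # white pieces on rank 1
        board[8 + f] = "P"                    # white pawns on rank 2
        board[48 + f] = "p"                   # black pawns on rank 7
        board[56 + f] = back_rank[f].lower()  # black pieces on rank 8
    return board

START_HASH = zobrist_hash(start_position(), True, "KQkq")
```

Since xor is commutative and associative, the order in which the features are visited does not affect the result.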

The fact that the xor operation is its own inverse, and can be undone by applying the same xor again, is often exploited by chess engines. It allows a fast incremental update of the hash key when making or unmaking moves. E.g., for a White Knight that jumps from b1 to c3 capturing a Black Bishop, these operations are performed:
 [Original Hash of position] xor [Hash for White Knight on b1] ( removing the knight from b1 ) ... xor [Hash for Black Bishop on c3] ( removing the captured bishop from c3 ) ... xor [Hash for White Knight on c3] ( placing the knight on the new square ) ... xor [Hash for Black to move] ( change sides )
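The knight capture above can be sketched as four xors. The key table, the square constants `B1` and `C3`, and the helper names are assumptions for illustration; the tiny position (two kings, the knight and the bishop) is made up so that the incremental result can be compared against a full recomputation.

```python
import random

rng = random.Random(7)  # arbitrary fixed seed for this sketch
KEY = {(p, sq): rng.getrandbits(64) for p in "PNBRQKpnbrqk" for sq in range(64)}
BLACK_TO_MOVE = rng.getrandbits(64)

B1, C3 = 1, 18  # square indices with a1 = 0, h8 = 63

def full_hash(board, black_to_move):
    """Recompute the key from scratch (slow path, used here as a reference)."""
    h = BLACK_TO_MOVE if black_to_move else 0
    for sq, piece in board.items():
        h ^= KEY[(piece, sq)]
    return h

def knight_takes_bishop(h):
    """Incremental update for Nb1xc3; applying it twice restores the key."""
    h ^= KEY[("N", B1)]   # remove the knight from b1
    h ^= KEY[("b", C3)]   # remove the captured bishop from c3
    h ^= KEY[("N", C3)]   # place the knight on c3
    h ^= BLACK_TO_MOVE    # change sides
    return h

before = {B1: "N", C3: "b", 4: "K", 60: "k"}
after = {C3: "N", 4: "K", 60: "k"}
h0 = full_hash(before, black_to_move=False)
h1 = knight_takes_bishop(h0)   # make the move
h2 = knight_takes_bishop(h1)   # unmake it with the very same xors
```

Here `h1` matches the key recomputed for the position after the capture, and `h2` equals `h0` again, which is why unmake can reuse the make code.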

=Collisions=
Key collisions, or type-1 errors, are inherent in using Zobrist keys with far fewer bits than required to encode all reachable chess positions.

Theory
An important issue is the question of what size the hash keys should have. Smaller hash keys are faster and more space efficient, while larger ones reduce the risk of a hash collision. A collision occurs if two different positions map to the same key. The dangers of collisions were assessed by Robert Hyatt and Anthony Cozzie in their paper The Effect of Hash Signature Collisions in a Chess Program. Usually 64 bits are used as the standard key size in modern chess programs.

Hash collisions demonstrate the birthday "paradox": contrary to some people's expectations, a collision becomes likely once the number of stored keys reaches about the square root of the number of possible keys (at exactly the square root, the probability of at least one collision is roughly 39%). You can expect to encounter a collision in a 32 bit hash when you have evaluated about sqrt(2 ^ 32) == 2 ^ 16 or around 65 thousand positions. With a 64 bit hash, you can expect a collision after about 2 ^ 32 or 4 billion positions.
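The figures above follow from the standard birthday approximation P ≈ 1 - exp(-n(n-1) / (2N)) for n keys drawn uniformly from N possible values. The short sketch below evaluates it under the idealized assumption that keys are uniform and independent; the function names are made up for this example.

```python
import math

def collision_probability(n_positions, key_bits):
    """Approximate chance that at least two of n positions share a key."""
    n_keys = 2.0 ** key_bits
    exponent = n_positions * (n_positions - 1) / (2.0 * n_keys)
    return -math.expm1(-exponent)  # 1 - exp(-exponent), accurate for small values

def positions_for_probability(p, key_bits):
    """Number of positions at which the approximation reaches probability p."""
    n_keys = 2.0 ** key_bits
    return math.sqrt(2.0 * n_keys * math.log(1.0 / (1.0 - p)))

p32 = collision_probability(2 ** 16, 32)     # about 0.39 at sqrt(2^32) positions
n32 = positions_for_probability(0.5, 32)     # about 77 thousand positions
n64 = positions_for_probability(0.5, 64)     # about 5 billion positions
```

So the square root of the key space is the right order of magnitude, with an even chance of a collision reached slightly later, at about 1.18 times the square root.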

Praxis
Post by Jonathan Schaeffer : ... I can speak from experience here. In the early versions of my chess program Phoenix, I generated my Zobrist hash numbers using my student id number as a seed, naively thinking the random numbers generated by this seed would be good enough. A few years later I put code in to detect when my 32-bit hash key matched the wrong position. To my surprise, there were lots of errors. I changed my seed to another number and the error rate dropped dramatically. With this better seed, it became very, very rare to see a hash error. All randomly generated numbers are not the same!

Lack of a True Integer Type
Some languages (such as JavaScript and Lua) only have a 64-bit floating point "Number" type. In JavaScript, operands are converted to 32-bit integers when bitwise operators are used. One way to get a 64-bit hash is to use two 32-bit numbers in parallel, as Garbochess-JS does. Another, which p4wn used at one stage, is to use 47- or 48-bit additive hashes. 64-bit floating point numbers represent integers exactly up to 53 bits, so it is possible to sum at least 32 (and on average close to 64) random 48-bit numbers without loss of precision, which was enough for p4wn's purposes. For additive Zobrist hashing, you add the number when placing a piece and subtract it when removing it, rather than using xor both ways. There is no difference in accuracy or speed, and 48-bit hashes give you collisions at around the 2 ^ 24 or 16 million point.
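The additive scheme can be imitated in Python, whose `float` is the same 64-bit floating point format. The sketch below is not p4wn's code; the key table, seed and the helper names `place` and `remove` are assumptions, and all pieces are pawns only to keep the example short.

```python
import random

rng = random.Random(48)  # arbitrary fixed seed for this sketch
# 48-bit keys held as floats, as a floating-point-only "Number" type would hold them.
KEY = {(p, sq): float(rng.getrandbits(48))
       for p in "PNBRQKpnbrqk" for sq in range(64)}

def place(h, piece, sq):
    """Additive hashing: add the key when a piece appears on a square."""
    return h + KEY[(piece, sq)]

def remove(h, piece, sq):
    """Subtract the same key when the piece leaves the square."""
    return h - KEY[(piece, sq)]

# Worst case for 32 pieces: every key is below 2**48, so the sum stays below 2**53.
assert 32 * (2 ** 48 - 1) < 2 ** 53

h = 0.0
for sq in range(32):                          # 32 pieces on 32 squares
    h = place(h, "P", sq)
moved = place(remove(h, "P", 0), "P", 40)         # move one piece
restored = place(remove(moved, "P", 40), "P", 0)  # and move it back
```

Every intermediate sum is an integer below 2^53, so the floating point arithmetic is exact and `restored` equals `h` bit for bit.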

Linear Independence
The minimum and average Hamming distance over all Zobrist keys was often considered a "quality" measure of the keys. However, maximizing the minimal Hamming distance leads to very poor Zobrist keys. As long as the minimum Hamming distance is greater than zero, linear independence (that is, no small subset of the keys xors to zero) is much more important than Hamming distance, as explained by Sven Reichard:

Assume we associate a bitstring to every piece-square combination. That is what's usually done in chess programs; some codes are added for the side to move, castling rights, e.p. squares, etc. We obtain the code of a position by xor-ing the codes of all the pieces contained in it.

What we want to avoid is collisions at nodes close to the root. For nodes close to the leaves the cost of recomputing the score is smaller. Hence we want to avoid that:
 x1^x2^...^xm = y1^y2^...^yn
for codes xi, yi and small number m and n, and xi not equal to yj.

To translate that to a language that is more familiar - at least for people of a mathematical background - we consider the field F2 of two elements. The elements are 0 and 1, and we can add and multiply them as usual, with the additional rule that 1 + 1 = 0. This is really a field, just like the real or complex numbers, and we can do calculations as usual. Note that addition is just the exclusive or.

Now the codes or bitstrings become vectors over the field F2, and the bitwise exclusive or becomes componentwise addition, i.e., usual addition of vectors. All these vectors form the vector space F2^k, where k is the length of the vectors. Typically, k = 64.

So, what we want to avoid is an equation
 x1 + x2 + ... + xm = y1 + y2 + ... + yn
or
 x1 + x2 + ... + xm + y1 + y2 + ... + yn = 0
since in F2, subtraction is the same as addition. Remembering some linear algebra, this just means that we want the set x1,...,xm,y1,...,yn to be linearly independent.

This leads to the following criterion for picking a set of hashcodes: A set of vectors in F2^k is a good set of hash codes if each small subset of non-zero vectors is linearly independent. What is not clear here is the meaning of "small", but we want small to be as big as possible. In other words, we consider sets of size up to a certain size as small, and if we can make that size bigger, it is better, since this leads to unique codes deeper in the tree.

However what is clear is that this quality criterion does not depend on the base of the vector space. I.e., if we have a good set and multiply each vector by an invertible matrix (in other words, if we rotate the vectors), the obtained set will be just as good, since the rotation does not change the linear independence. The Hamming distance, on the other hand, is highly dependent on the vector space base. Take for example the vectors (1,0) and (0,1) in F2^2; they have Hamming distance 2. If we multiply both of them by
 (1 1)
 (0 1)
we get (1,1) and (0,1), which have Hamming distance 1. Actually we can change any distance to anything else (except for 0) by an appropriate matrix. Thus we try to approximate something that is independent from the base (the quality of our hash codes) by something that depends on it (the Hamming distance). Simple logic tells you that this approximation has to be real bad.

An example where it doesn't work: It has been said that the Hamming distance shouldn't be too small or too big. So, vectors at a distance which is half the length should be ok, right? Let the length be 8 (I don't want to type too many 0's and 1's), and consider the vectors
 11110000
 11001100
 00111100
They all have weight 4, their pairwise distance is 4, and yet they add up to 0. Just by looking at Hamming distances, you have no chance of detecting that.

Summarizing I can say that I see no connection between the quality of hash codes and their Hamming distance. Using a good RNG like the one provided in GNU's stdlib will yield good hash codes ( you can actually prove that), and so I will take the codes as they are supplied by rand or random without messing with them and thereby most likely make them worse.
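Reichard's three-vector example can be checked mechanically. The sketch below tests linear independence over F2 by Gaussian elimination on integers used as bit vectors; the function names are hypothetical, and the routine is only meant for small subsets, not as a complete quality test for a key table.

```python
def is_linearly_independent(vectors):
    """Return True if no non-empty subset of the bit vectors xors to zero."""
    basis = {}  # position of highest set bit -> reduced vector with that top bit
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v    # v contributes a new direction
                break
            v ^= basis[top]       # eliminate the top bit and keep reducing
        else:
            return False          # v reduced to zero: it depends on earlier vectors
    return True

def hamming_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

codes = [0b11110000, 0b11001100, 0b00111100]
pairwise = [hamming_distance(codes[i], codes[j])
            for i in range(3) for j in range(i + 1, 3)]
independent = is_linearly_independent(codes)
```

For these codes every pairwise distance is 4, yet `independent` is `False`, because the three vectors xor to zero. The distance statistics alone give no hint of that.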

=See also=
 * CPW-Engine_transposition
 * BCH Hashing

=Publications=
 * Albert Zobrist (1970). A New Hashing Method with Application for Game Playing. Technical Report #88, Computer Science Department, The University of Wisconsin, Madison, WI, USA. Reprinted (1990) in ICCA Journal, Vol. 13, No. 2, pdf
 * J. Lawrence Carter, Mark N. Wegman (1977). Universal classes of hash functions. STOC '77
 * Robert Hyatt, Anthony Cozzie (2005). The Effect of Hash Signature Collisions in a Chess Program. ICGA Journal, Vol. 28., No. 3
 * Borko Bošković, Sašo Greiner, Janez Brest, Viljem Žumer (2005). The Representation of Chess Game. Proceedings of the 27th International Conference on Information Technology Interfaces
 * Mihai Pătrașcu, Mikkel Thorup (2011). The Power of Simple Tabulation Hashing. arXiv:1011.5200v2

=Forum Posts=

1982 ...

 * compact representation of chess positions by Tom Truscott, net.chess, January 7, 1982

1990 ...

 * Hash tables - Clash!!! What happens next? by Valavan Manohararajah, rgc, March 15, 1994
 * Re: Hash tables - Clash!!! What happens next? by Jonathan Schaeffer, March 17, 1994


 * Collision probability by Dennis Breuker, rgcc, April 15, 1996
 * Re: Berliner vs. Botvinnik Some interesting points by Bradley C. Kuszmaul, rgcc, November 6, 1996
 * Re: Hashing function for board positions by Jim Gillogly, rgcc, May 12, 1997
 * Fast hash algorithm by John Scalo, CCC, January 08, 1998
 * Fast hash key method - Revisited! by John Scalo, CCC, January 14, 1998
 * How to create a set of random integers for hashing? by Ed Schröder, CCC, October 18, 1998

2000 ...

 * Why Random Number Needed In HashFunction[piece[position]] by Cheok Yan Cheng, rgcc, June 12, 2001
 * About random numbers and hashing by Severi Salminen, CCC, December 04, 2001
 * Random keys and hamming distance by James Swafford, CCC, August 16, 2002
 * Hamming distance and lower hash table indexing by Tom Likens, CCC, September 02, 2003
 * 64-Bit random numbers by Martin Schreiber, CCC, October 28, 2003
 * Is it necessary to include empty fields in the hash key of a position? by Frank Hablizel, rgcc, December 25, 2003
 * Hashkey collisions (typical numbers) by Renze Steenhuisen, CCC, April 07, 2004

2005 ...

 * Zobrist key random numbers by Robert Hyatt, CCC, January 21, 2009
 * Incremental Zobrist - slow? by Vlad Stamate, CCC, June 20, 2009 » Incremental Updates
 * On Zobrist keys by Lasse Hansen, CCC, June 21, 2009
 * Overlapped Zobrist keys array by Stefano Gemma, CCC, October 06, 2009

2010 ...

 * Transposition table random numbers by Justin Madru, CCC, July 13, 2010
 * TT Key Collisions, Workarounds? by Clemens Pruell, CCC, August 16, 2011
 * Key collision handling by Jonatan Pettersson, CCC, October 21, 2011
 * Using a Transposition Table with Zobrist Keys by Miyagi403, OpenChess Forum, February 21, 2012
 * MT or KISS ? by Dan Honeycutt, CCC, June 02, 2012
 * Zobrist alternative? by Harm Geert Muller, CCC, June 12, 2012
 * Zobrist Number Statistics and WHat to Look For by Andrew Templeton, CCC, October 16, 2012
 * Question about Zobrist code by Hamfer, OpenChess Forum, December 19, 2012

2015 ...

 * Zobrist keys - measure of quality? by Martin Sedlak, CCC, February 24, 2015
 * On-the fly hash key generation? by Evert Glebbeek, CCC, January 12, 2016
 * Re: On-the fly hash key generation? by Aleks Peshkov, CCC, January 13, 2016


 * Rotated hash by J. Wesley Cleveland, CCC, September 13, 2016
 * No Zobrist key by Henk van den Belt, CCC, September 26, 2016
 * Enpass + Castling for Zorbist hashes by Andrew Grant, CCC, January 06, 2017 » Castling Rights, En passant
 * Zobrist hashing for text by Alvaro Cardoso, CCC, January 20, 2018

=External Links=
 * Zobrist hashing from Wikipedia
 * Tabulation hashing from Wikipedia
 * Zobrist keys from Bruce Moreland's Programming Topics
 * Zobrist keys from Mediocre Chess by Jonatan Pettersson
 * Gödel numbering from Wikipedia
 * John Cage - Music of Changes, Book 1 (1951), performed by Vicky Chow, DiMenna Center, NYC, June 09, 2012, YouTube Video
