Persistent Hash Table

Persistent Hash Table (persistent transposition table),
a form of long-term memory used to remember "important" nodes from earlier analyzed positions or played games, along with their exact score and depth, in order to avoid repeating unsuccessful book lines. So-called orthogonal or transparent persistent hash tables preserve their contents, with a focus on PV-nodes, between moves while playing a game, between games, and even during interactive analysis while playing variations forward and backward [2]. Non-orthogonal persistence, the primary topic of this page, requires data to be explicitly written to or read from storage devices by specific program instructions, and has to provide mappings between the program's native data structures and those of the storage device.

Inspired by the rote learning used in Arthur Samuel's Checkers program from 1959 [3] [4], David Slate first described a persistent transposition table in computer chess, for accumulating selected information from many games and utilizing it in subsequent games via the transposition table [5]. In his article, Slate mentions personal communication with Tony Scherzer and Tony Warnock regarding learning in Bebe and Lachex.

Learning in Mouse

Slate's simple brute-force program Mouse, a depth-first, full-width iterated alpha-beta searcher with an evaluation based purely on material, was used as a learning testbed, remembering only positions where a significant score drop occurred at the root.

Transposition Table

A relatively small transposition table of 4096 buckets (8192 entries) was used for the examples in his paper:

 6   bytes Hash code
 2x2 bytes Score window
 2   bytes Best Move, if any
 10  bits  Game ply
 6   bits  Ply from root node
 1   byte  Search height
 1   byte  Origin indication

Score window holds the current lower and upper bounds. Although some programs use only a single value plus a flag indicating an exact score, a lower bound, or an upper bound, there are occasions and algorithms [6] that fix the score between unequal bounds, neither of which is infinite.

Origin indication records where the entry came from. There are values for current search, human advice, learned in a previous session, etc.

Algorithm

The basic learning algorithm stores root entries to disk if the final score of the chosen move is significantly worse than the best score of any previous iteration. Between searches during the playing session, relevant portions of the retained entries are loaded back into their slots in the transposition table, their bounds adjusted by a fuzz term and their origin flagged to secure them from being indiscriminately overwritten, as in the sketch below.
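
Under stated assumptions, the scheme might look like the following C sketch. The struct mirrors the field widths of the layout above but uses natural alignment rather than the packed 16-byte slot; the drop threshold, the fuzz term, the direction of the bound adjustment, and all names are illustrative assumptions, not values from Slate's paper.

 #include <stdint.h>
 #include <stdio.h>

 enum Origin { CURRENT_SEARCH, HUMAN_ADVICE, PREVIOUS_SESSION };

 typedef struct {
     uint64_t hash;          /* 6 bytes in the original layout */
     int16_t  lower, upper;  /* score window: 2x2 bytes */
     uint16_t move;          /* best move, if any */
     uint16_t game_ply;      /* 10 bits in the original */
     uint8_t  root_ply;      /* 6 bits: ply from root node */
     uint8_t  height;        /* search height */
     uint8_t  origin;        /* origin indication */
 } Entry;

 #define DROP 50             /* assumed "significant" drop, in centipawns */
 #define FUZZ 10             /* assumed fuzz term applied on reload */

 /* After a search: persist the root entry if the chosen move's score
    dropped significantly below the best score of earlier iterations. */
 void persist_root(FILE *f, const Entry *root, int final_score, int best_prior)
 {
     if (best_prior - final_score >= DROP)
         fwrite(root, sizeof *root, 1, f);
 }

 /* Before the next search: stuff the retained entries back into their
    slots, widening the bounds by the fuzz term and flagging the origin
    so the replacement scheme will not indiscriminately overwrite them. */
 void reload(FILE *f, Entry *table, size_t slots)
 {
     Entry e;
     while (fread(&e, sizeof e, 1, f) == 1) {
         e.lower -= FUZZ;
         e.upper += FUZZ;
         e.origin = PREVIOUS_SESSION;
         table[e.hash % slots] = e;
     }
 }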

Learning in Bebe

Tony and Linda Scherzer, and Dean Tjaden further elaborate on the persistent hash table in their award-winning paper [7] concerning Learning in Bebe:

Short Term Memory

The short term memory (STM) or transposition table slot consists of 16 bytes, of which 12 are stored and 4 are implicit as a memory address. Upper and lower limits of the score are needed for the easiest implementation of the learning algorithm (a C sketch follows the layout):

 4 bytes Hash code used as STM memory address
 4 bytes Hash code used for match verification
 2 bytes Search height
 2 bytes Position-score lower limit
 2 bytes Position-score upper limit
 2 bytes The move
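
Rendered as a hypothetical C struct (Bebe was not written in C; the field names here are invented):

 #include <stdint.h>

 /* Sketch of Bebe's 16-byte STM slot: 12 bytes are stored, while the
    4-byte addressing hash is implicit in the slot's memory address. */
 typedef struct {
     /* uint32_t addr_hash;   implicit: selects the slot in the table */
     uint32_t verify_hash;    /* hash code used for match verification */
     int16_t  height;         /* search height */
     int16_t  lower;          /* position-score lower limit */
     int16_t  upper;          /* position-score upper limit */
     uint16_t move;           /* the move */
 } BebeSTM;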

Long Term Memory

The long term memory (LTM) entries are stored on disk and therefore retained between games. The structure is similar to the STM slot; however, all 16 bytes are stored (sketched below):

 4 bytes Hash code used as STM memory address
 4 bytes Hash code used for match verification
 2 bytes Depth of search
 2 bytes Move number
 2 bytes Position-score
 2 bytes The move

One LTM entry is created for each root node during the game.
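
The corresponding hypothetical C struct (again, the names are invented; all 16 bytes, including the addressing hash, go to disk):

 #include <stdint.h>

 /* Sketch of Bebe's 16-byte LTM entry, written to disk in full. */
 typedef struct {
     uint32_t addr_hash;      /* hash code used as STM memory address */
     uint32_t verify_hash;    /* hash code used for match verification */
     int16_t  depth;          /* depth of search */
     int16_t  move_number;    /* move number */
     int16_t  score;          /* position-score */
     uint16_t move;           /* the move */
 } BebeLTM;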

Algorithm

The algorithm consists of two phases. The first creates (or overwrites) the LTM entries at the end of each search, taking a contempt factor into account, while the second transforms and copies LTM entries to STM at the start of each search, widening the exact score into a window (see the sketch after the formulas):

 Position-score lower limit = Position-score - fuzzy tolerance (up to 0.2 pawn units for scores that are neither draw nor mate scores)
 Position-score upper limit = Position-score + fuzzy tolerance
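
Phase two might then look as follows, reusing the BebeSTM and BebeLTM sketches above; DRAW_SCORE and MATE_BOUND are hypothetical constants standing in for Bebe's actual score encoding:

 #include <stdlib.h>          /* abs() */

 #define DRAW_SCORE 0         /* hypothetical encoding of a draw score */
 #define MATE_BOUND 30000     /* hypothetical lower bound of mate scores */

 /* Copy an LTM entry into its STM slot, widening the exact score into
    a window by the fuzzy tolerance; draw and mate scores stay exact. */
 void ltm_to_stm(const BebeLTM *ltm, BebeSTM *stm_table, size_t slots)
 {
     int fuzz = 20;           /* up to 0.2 pawn units, in centipawns */
     if (ltm->score == DRAW_SCORE || abs(ltm->score) >= MATE_BOUND)
         fuzz = 0;

     BebeSTM *slot = &stm_table[ltm->addr_hash % slots];
     slot->verify_hash = ltm->verify_hash;
     slot->height = ltm->depth;
     slot->lower  = ltm->score - fuzz;
     slot->upper  = ltm->score + fuzz;
     slot->move   = ltm->move;
 }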

Position Learning in Crafty

Quote from Crafty Command Documentation (version 18) by Robert Hyatt [8]:

What is this new Position Learning I've heard about?

Crafty now has a "permanent" hash table that is kept from game to game. A position gets into this "hash file" when Crafty executes a search and the search value is significantly lower than the last search value.

When this happens, Crafty stores the current information for this position in the permanent hash file, which can hold up to 65536 positions. Once it fills up, the positions are replaced on a FIFO basis, always keeping the most recent 64K entries.

Each time Crafty starts a search, the positions/scores from this file are stuffed into the normal transposition table, and used during the search just like any other table entry...
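
A fixed-size FIFO file of that kind might be sketched in C as follows. This illustrates the described behaviour only; it is not Crafty's actual code, and the entry layout and names are hypothetical:

 #include <stdint.h>
 #include <stdio.h>

 #define MAX_POSITIONS 65536  /* capacity quoted in the documentation */

 /* Hypothetical on-disk record; Crafty's real layout differs. */
 typedef struct {
     uint64_t hash;
     int32_t  score;
     int32_t  depth;
 } LearnEntry;

 /* Append an entry, wrapping around FIFO-style once the file is full,
    so the most recent 64K positions are always the ones retained.
    'count' persists across calls, e.g. in a small file header. */
 void learn_store(FILE *f, const LearnEntry *e, uint32_t *count)
 {
     long slot = (long)(*count % MAX_POSITIONS);
     fseek(f, slot * (long)sizeof *e, SEEK_SET);
     fwrite(e, sizeof *e, 1, f);
     (*count)++;
 }

At search start, the same file would be read sequentially and its entries stuffed into the transposition table, much as in the Mouse sketch above.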

Forum Posts

2010 ...

Re: My "official" request to top engine programmers by Harm Geert Muller, CCC, July 05, 2017

References

  1. Salvador Dalí, The Disintegration of the Persistence of Memory from Wikipedia, 1952-1954
  2. Re: My "official" request to top engine programmers by Harm Geert Muller, CCC, July 05, 2017
  3. Arthur Samuel (1959). Some Studies in Machine Learning Using the Game of Checkers. IBM Journal of Research and Development, Vol. 3, No. 3
  4. Arthur Samuel (1967). Some Studies in Machine Learning Using the Game of Checkers. II - Recent Progress.
  5. David Slate (1987). A Chess Program that uses its Transposition Table to Learn from Experience. ICCA Journal, Vol. 10, No. 2
  6. David Slate (1984). Interior-node Score Bounds in a Brute-force Chess Program. ICCA Journal, Vol. 7, No. 4
  7. Tony Scherzer, Linda Scherzer, Dean Tjaden (1990). Learning in Bebe. Computers, Chess, and Cognition (Mephisto Best-Publication Award)
  8. Crafty Command Documentation (version 18) - What is this new Position Learning I've heard about?
