Home * Board Representation * Move Generation * Table-driven Move Generation

Table-driven Move Generation, is a technique to speed up the common traversal of pieces with their origin (from square) from a piece-list and their potential move target squares by traversing linked lists with pre-calculated moves, which head is indexed by piece-type and origin. There are three control structures to cover the disjoint move abilities for king or knights, pawns, and sliding pieces, while we focus on the latter for the most general case.

Prologue

The classical technique in sliding piece move generation with square centric board arrays, is to loop over all ray-directions, and per ray-direction from the next closest square (if any) to the farthest square of that ray until either the target square is occupied by an own piece, or occupied by opponent piece which requires a capture generation. The control structure, its code size, number of local and register variables, and conditional branches on ray-termination, that is on piece-code combined with farthest square or off-the board tests, were the driving force in designing efficient board representations like 0x88 and mailbox in conjunction with appropriate piece coding. One example is the piece coding in Gambiet to conditionally branch on several Z80 processor flags, set by only one shift instruction. For a 8x8 board there was already a proposal by Alex Bell as used in Atlas ^[1], to lookup ray-increments per ray-direction and terminal squares per ray-direction and origin - sample code in C++ pseudo code:

for (all directions[piece_kind] -> dir) {
  delta  = delta_table[dir];
  lastsq = lastsq_table[dir][fromsq];
  if ( fromsq != lastsq ) {
    for (tosq = fromsq + delta; board[tosq].isNotPieceOf(side2move); tosq += delta) {
      generateMove( fromsq, tosq );
      if ( board[tosq].isPiece() || tosq == lastsq )
        break; /* next direction after capture or last square */
    }
  }
}

GNU Chess

Inspired by Hardware

Table-driven move generation was pioneered by Hans Eric Sandström in 1989 ^[2] as introduced in GNU Chess Version 1.55 ^[3] :

New move Generation algorithm: Revision: 1989-09-06 Author: Hans Eric Sandstroem. This algorithm is the result of an attempt to make an hardware move generator, but since I newer had the time and resources to build the hardware I wrote a software version and incorporated that one into gnu chess. This was the best way I could think of sharing this algorithm with the computer chess community. If there is anybody out there with the time and resources to build a hardware move generator I will be glad to assist. The general idea behind this algorithm is to pre calculate a lot of data. The data that is pre calculated is every possible move for every piece from every square disregarding any other pieces on the board. This pre calculated data is stored in an array ...

Butterfly Tables

The GNU Chess implementation used the natural branches on empty square, and if occupied and the next direction is taken, on own or opponent piece. Piece kind, origin and target square index an array of eight Butterfly boards with 64 Kib in total with 16-bit words while bytes might suffice as next direction or next position squares as well. This is very register but less memory friendly due to the Butterfly layout. Sentinel nodes use the origin square to terminate the do-while loop ^[4]:

struct sqdata {
  short nextpos;
  short nextdir;
};
struct sqdata posdata[8][64][64];

...
struct sqdata *p = posdata[piece_kind][fromsq];
tosq = p[fromsq].nextpos;
do {
  if (color[tosq] == neutral) {
    linkMove(ply, fromsq, tosq, xside);
    tosq = p[tosq].nextpos;
  }
  else {
    if (color[tosq] == xside) 
       linkMove(ply, fromsq, tosq, xside);
    tosq = p[tosq].nextdir;
  }
} while (tosq  != fromsq);

Further Implementations

There are zillions of implementation nuances concerning disjoint move-lists for quiet moves and captures, and space-time tradeoff.

Ferret

The GNU Chess approach was refined by Bruce Moreland, as used in Ferret, and elaborated in his programming topics, for instance for a bishop on d5 ^[5]:

typedef struct tagMT {
  int    square;
  struct tagMT * pmtBlocked;
} MT;

MT argmtBD5[] = {
    e6,    &argmtBD5[3], // Element 0, 1st ray.
    d7,    &argmtBD5[3],
    f8,    &argmtBD5[3],
    c6,    &argmtBD5[3], // Element 3, 2nd ray.
    b7,    &argmtBD5[6],
    a8,    &argmtBD5[6],
    e4,    &argmtBD5[6], // Element 6, 3rd ray.
    f3,    &argmtBD5[6],
    g2,    &argmtBD5[6],
    h1,    NULL,         // Element 9, 4th ray.
    c4,    NULL,
    b3,    NULL,
    a2,    NULL,
    -1,
};

void genMoves(MT * pmt)
{
  for (;;) {
    if (IsEmpty(pmt->square)) {
      GenMove(pmt->square);
      if ((++pmt)->square < 0)
        break;
    } else {
      if (IsEnemy(pmt->square))
        GenMove(pmt->square);
      if ((pmt = pmt->pmtBlocked) == NULL)
        break;
    }
  }
}

Diep

Vincent Diepeveen used a similar technique in Diep. He published a move generation skeleton under the GPL ^[6] in CCC ^[7] ^[8] ^[9] ^[10]. Vincent's approach uses the default ones-increment of the list pointer, and adds a further skip-increment to the next direction in case of a block, which is done branch-less by intersection of the increment with a mask for target square empty (0) or not (-1), and was inspiration for the conditional linked list approach, which stresses memory versus a simplified control structure.

Conditional Linked List

To minimize code and to avoid branches on todays super pipelined processors, following recursive data structure of a pre-initialized skip list with two alternative next references might be applied for a prototype of table-driven move generation.

Data Structure

One sliding move node consists of the target square, the bit redundant move itself (might already contain a score based on dedicated piece-square tables concerning move ordering), and an array of two pointers to the next square if any. During traversal, the next array is indexed by the condition whether the target square is empty (0, false) or occupied (1, true), either to arrive the next square on the current ray, or the first square of the next ray-direction if any.

struct SlidingMoveNode
{
  TSquare tosq;
  SMove   move;
  SlidingMoveNode* pNext[2];
};

The number of sliding move node structures is determined by the influence quantity of pieces, where bishops have a cardinality of 560, rooks of 896, and queens the sum of 1456, and they are typically declared as array with consecutive nodes and almost always pNext[0] = this + 1. However, despite avoiding conditional branches, this redundancy a little bit pays off since this approach does not require sentinel nodes with one extra indirection to check for an invalid target square for termination, due to storing sentinel values, that is null pointers aka nil.

SlidingMoveNode  slidingMoveTable[1456+896+560]; /* queen, rook, bishop ->  2912 * 16 = 46592 bytes 45.5 Kib */
SlidingMoveNode* slidingMoveHead[3][64];

With following initialization, for instance for a rook on e4:

slidingMoveHead[rook][e4]
     ▼ north                                   east           south          west
 ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐  ┌────────┐─┐-┐ ┌────────┐─┐-┐ ┌────────┐─-┬-┌────────┐
 │   e5   │ │   e6   │ │   e7   │ │   e8   │  │   f4   │     │   e3   │     │   d4   │    │   a4   │
 ├────────┤ ├────────┤ ├────────┤ ├────────┤  ├────────┤g4   ├────────┤e2   ├────────┤c4  ├────────┤
 │ Re4e5  │ │ Re4e6  │ │ Re4e7  │ │ Re4e8  │  │ Re4f4  │  h4 │ Re4e3  │  e1 │ Re4d4  │  b4│ Re4a4  │
 ├────────┤ ├────────┤ ├────────┤ ├────────┤  ├────────┤     ├────────┤     ├────────┤    ├────────┤
 │pNext[0]│►│pNext[0]│►│pNext[0]│►│pNext[0]│►─│pNext[0]│►─...│pNext[0]│►─...│pNext[0]│►─..│   NIL  │
 ├────────┤ ├────────┤ ├────────┤ ├────────┤  ├────────┤     ├────────┤     ├────────┤    ├────────┤
 │pNext[1]│ │pNext[1]│ │pNext[1]│ │pNext[1]│►─│pNext[1]│     │pNext[1]│     │   NIL  │    │   NIL  │
 └────────┘ └────────┘ └────────┘ └────────┘  └────────┘─┘-┘ └────────┘─┘-┘ └────────┘--┴-└────────┘
       ▼          ▼          ▼                  ▲   ▼          ▲   ▼          ▲
       │          │          └───►──────────►───┘   │          │   │          │
       │          └───►──────────►──────────►───┘   │     ... ─┘   │     ... ─┘
       └───►──────────►──────────►──────────►───┘   └───►─...──┘   └───►─...──┘

Control Structure

This data structure simplifies the control structure to generate moves of one sliding piece with only one conditional branch concerning the while loop. To avoid the branch on blocked by own piece, a conditional write taking advantage of write-combining ^[11] is implemented, storing always, but post-increment the move list pointer by a boolean aka {0,1} condition:

pMoveNode = slidingMoveHead[slidingPiece_kind_012][fromSq];
do {
  tosq = pMoveNode->tosq;
  *pMoveList = pMoveNode->move;
  pMoveList += board[tosq].isNotPieceOf( side2move );
  pMoveNode  = pMoveNode->pNext[ board[tosq].isPiece() ];
} while ( pMoveNode );

Captures vs Quiet

To generate disjoint sets of captures and quiet moves in different move generation stages, the same conditional linked lists could be traversed, stressing conditional writes in using different conditions for the move-list pointer increment.

/* Captures  */
pMoveNode = slidingMoveHead[slidingPiece_kind_012][fromSq];
do {
  tosq = pMoveNode->tosq;
  *pMoveList = capture( pMoveNode->move, board[tosq]);
  pMoveList += board[tosq].isPieceOf( side2move ^ 1 );
  pMoveNode  = pMoveNode->pNext[ board[tosq].isPiece() ];
} while ( pMoveNode );

/* none Captures */
pMoveNode = slidingMoveHead[slidingPiece_kind_012][fromSq];
do {
  tosq = pMoveNode->tosq;
  *pMoveList = pMoveNode->move;
  pMoveList += board[tosq].isPiece() ^ 1; /* isEmpty */
  pMoveNode  = pMoveNode->pNext[ board[tosq].isPiece() ];
} while ( pMoveNode );

However, for capture generation specially in the quiescence search, if one imagines a queen in the late ending, the overhead to visit up to 27 empty squares with conditional stores seems not justified, and one may better rely on branches as demonstrated in Fritz Reul's blocking Loop, and to scan potential winning capture targets with a vector attack lookup whether they have lines in common with a sliding piece able to move along that line, to only test whether the squares between are empty - not to mention bitboard techniques.

Dense Index List

As proposed by Matthew R. Brades with 64-bit nodes and indices instead of pointer ^[12], an even denser alternative is to use nodes with two 12-bit indices and target square packed in a 32-bit integer. With appropriate piece-type coding, with bits 2 to 4 masked containing a right shift amount, either 8 for empty squares and 20 for pieces, one may "index" the appropriate next index. Bit 0 and 1 are exclusively set for either white and black pieces.

// U32 as struct SlidingMoveNode {int next1 : 12; int next0 : 12; int tosq  :  8; /* low byte */ };
U32 slidingMoveTable[1456+896+560]; /* queen, rook, bishop ->  2912 * 4 = 11648 bytes  */
U32 slidingMoveHead[3][64];

/* five least significant piece code bits from board: 
   01000 empty
   10101 white piece
   10110 black piece */

node = slidingMoveHead[slidingPiece_kind_012][fromSq];
do {
  node  = slidingMoveTable[node];
  tosq  = node & 0x3f;
  piece = board[tosq];
  *pMoveList = fromaspects | tosq;
  pMoveList += (piece & side2moveBit) == 0;
  node = (node >> (piece & 0x1c)) & 0x0fff;
} while ( node );