# Sherwin Bitboards


The Sherwin bitboards were created by Michael Sherwin with some help from other people in CCC. The idea is to generate the attacks of a sliding piece by collecting the occupied squares on its rays row by row. A first lookup table maps the occupancy of each row to part of an index, and the combined index selects the attack bitboard in a second table. Most of the hard work is done in the initialization of the tables. Care must be taken with the order and enumeration of the table indices. The code below is taken from the chess engine Elephant and uses the same big-endian file mapping as the exploding bitboards.

# Bishops

These are the relevant squares (marked 1) for a bishop on d4. Squares on the border of the board (marked -) can be ignored, because a ray ends there whether or not the square is occupied.

```
. . . . . . . -
- . . . . . 1 .  giving 1 bit  or 2 different values from row 7
. 1 . . . 1 . .  giving 2 bits or 4 different values from row 6
. . 1 . 1 . . .  giving 2 bits or 4 different values from row 5
. . . B . . . .  giving 0 bits, so nothing to index from row 4
. . 1 . 1 . . .  giving 2 bits or 4 different values from row 3
. 1 . . . 1 . .  giving 2 bits or 4 different values from row 2
- . . . . . - .
```

We need a helper array and an initializing function for the rays. The final mask keeps the inner 6x6 board only.

```
Bitboard bishopBits[64];
void initBishopBits()
{
    int sq;
    for ( sq = 0; sq < 64; ++sq )
    {
        bishopBits[sq] = 0;
        int i;
        // Walk the four diagonals; the modulo tests stop a ray that
        // would wrap around the edge of the board.
        for ( i = sq - 9; i >= 0 && i % 8 != 7; i -= 9 )
            bishopBits[sq] |= C64(1) << i;
        for ( i = sq - 7; i >= 0 && i % 8 != 0; i -= 7 )
            bishopBits[sq] |= C64(1) << i;
        for ( i = sq + 9; i < 64 && i % 8 != 0; i += 9 )
            bishopBits[sq] |= C64(1) << i;
        for ( i = sq + 7; i < 64 && i % 8 != 7; i += 7 )
            bishopBits[sq] |= C64(1) << i;
        // Keep the inner 6x6 board only.
        bishopBits[sq] &= C64(0x007e7e7e7e7e7e00);
    }
}
```
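As a cross-check of the mask construction, the sketch below computes the same relevant squares from file and row coordinates instead of square offsets. It is only an illustration, not the Elephant code: `U64`, `bishopMask` and `popCount` are names chosen here, a plain 64-bit integer stands in for the engine's `Bitboard` class, and squares are simply numbered 0 to 63 row by row.

```cpp
#include <cassert>
#include <cstdint>

using U64 = std::uint64_t;  // assumption: plain integer instead of the Bitboard class

// Relevant squares of a bishop: its diagonals, restricted to the inner 6x6 board.
U64 bishopMask(int sq)
{
    assert(sq >= 0 && sq < 64);
    const int dFile[4] = { 1, -1, 1, -1 };
    const int dRow[4]  = { 1, 1, -1, -1 };
    U64 mask = 0;
    for (int d = 0; d < 4; ++d)
    {
        int f = (sq & 7) + dFile[d];
        int r = (sq >> 3) + dRow[d];
        // Stop at the first border square: files and rows 0 and 7 never count.
        for (; f >= 1 && f <= 6 && r >= 1 && r <= 6; f += dFile[d], r += dRow[d])
            mask |= U64(1) << (r * 8 + f);
    }
    return mask;
}

// Number of set bits, cleared one at a time.
int popCount(U64 b)
{
    int n = 0;
    for (; b != 0; b &= b - 1)
        ++n;
    return n;
}
```

For square 27, one of the four central squares, `bishopMask(27)` has 9 bits set, which matches the nine marked squares in the diagram above.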

With the help of a first table we condense the scattered bits to a compact index number. For bits named a to i the value for Bd4 will be:

```
. . . . . . . -
- . . . . . i .  giving 1 bit  or 2 different values from row 7
. g . . . h . .  giving 2 bits or 4 different values from row 6
. . e . f . . .  giving 2 bits or 4 different values from row 5
. . . B . . . .  giving 0 bits, so nothing to index from row 4
. . c . d . . .  giving 2 bits or 4 different values from row 3
. a . . . b . .  giving 2 bits or 4 different values from row 2
- . . . . . - .

-> binary 0000000ihgfedcba as index, with 2^9 = 512 index values.
```

There are 4 squares that need 9 bits, like Bd4. The other squares need fewer bits.

```
const byte squareBitsB[64] =
{
    6, 5, 5, 5, 5, 5, 5, 6,
    5, 5, 5, 5, 5, 5, 5, 5,
    5, 5, 7, 7, 7, 7, 5, 5,
    5, 5, 7, 9, 9, 7, 5, 5,
    5, 5, 7, 9, 9, 7, 5, 5,
    5, 5, 7, 7, 7, 7, 5, 5,
    5, 5, 5, 5, 5, 5, 5, 5,
    6, 5, 5, 5, 5, 5, 5, 6,
};
```
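The values in this table do not have to be typed in by hand: each one is the number of relevant squares of the bishop on that square. The following sketch recounts them and compares the result with the literal table. The names `bishopBitCount`, `kSquareBitsB` and `countMismatches` are made up for this example, and squares are assumed to be numbered 0 to 63 row by row.

```cpp
#include <cassert>

// Counts the relevant squares of a bishop on sq: diagonal squares
// on the inner 6x6 board.
int bishopBitCount(int sq)
{
    assert(sq >= 0 && sq < 64);
    int count = 0;
    for (int df = -1; df <= 1; df += 2)
        for (int dr = -1; dr <= 1; dr += 2)
            for (int f = (sq & 7) + df, r = (sq >> 3) + dr;
                 f >= 1 && f <= 6 && r >= 1 && r <= 6; f += df, r += dr)
                ++count;
    return count;
}

// Copy of the table from the article.
const int kSquareBitsB[64] =
{
    6, 5, 5, 5, 5, 5, 5, 6,
    5, 5, 5, 5, 5, 5, 5, 5,
    5, 5, 7, 7, 7, 7, 5, 5,
    5, 5, 7, 9, 9, 7, 5, 5,
    5, 5, 7, 9, 9, 7, 5, 5,
    5, 5, 7, 7, 7, 7, 5, 5,
    5, 5, 5, 5, 5, 5, 5, 5,
    6, 5, 5, 5, 5, 5, 5, 6,
};

// Number of squares where the computed count differs from the table.
int countMismatches()
{
    int mismatches = 0;
    for (int sq = 0; sq < 64; ++sq)
        if (bishopBitCount(sq) != kSquareBitsB[sq])
            ++mismatches;
    return mismatches;
}
```

Because the table is symmetric in both directions, the result does not depend on which corner is numbered 0.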

This is how many index values the first table has to produce. It is also the size of the second table.

```
 4 squares with 9 bits for 512 indices are 2048 indices.
12 squares with 7 bits for 128 indices are 1536 indices.
44 squares with 5 bits for  32 indices are 1408 indices.
 4 squares with 6 bits for  64 indices are  256 indices.
Total 5248 indices.
```
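The sum can be reproduced with a few lines of code. This is only a worked calculation; the function names `entriesFor` and `bishopTableEntries` are invented for the example.

```cpp
#include <cassert>

// Entries contributed by `squares` squares that need `bits` index bits each.
int entriesFor(int squares, int bits)
{
    assert(squares >= 0 && bits >= 0 && bits < 20);
    return squares * (1 << bits);
}

// Total size of the second bishop table, grouped as in the list above.
int bishopTableEntries()
{
    return entriesFor(4, 9) + entriesFor(12, 7)
         + entriesFor(44, 5) + entriesFor(4, 6);
}
```

With 8 bytes per bitboard, 5248 entries take 41984 bytes, or 41 KiB.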

This is the layout of the first big table with index-bits for each square and all rows. The indices are in this order:

```
0000999999999 4 times 9 bit
0001999999999
0010999999999
0011999999999
0100007777777 12 times 7 bit
01....7777777
0110117777777
0111000666666 4 times 6 bit
0111001666666
0111010666666
0111011666666
0111100055555 44 times 5 bit
........55555
1010001155555
```
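The order matters because the lookup function combines the row values with a bitwise or. That only works if the first index of every square is a multiple of its block size, and handing out the blocks from the widest to the narrowest guarantees this. The sketch below assigns the start indices in that order and checks the alignment. `Layout`, `makeLayout` and `isAligned` are hypothetical names, and the bit counts are copied from the table above.

```cpp
#include <cassert>

// Bishop index bits per square, copied from the article.
const int kBits[64] =
{
    6, 5, 5, 5, 5, 5, 5, 6,
    5, 5, 5, 5, 5, 5, 5, 5,
    5, 5, 7, 7, 7, 7, 5, 5,
    5, 5, 7, 9, 9, 7, 5, 5,
    5, 5, 7, 9, 9, 7, 5, 5,
    5, 5, 7, 7, 7, 7, 5, 5,
    5, 5, 5, 5, 5, 5, 5, 5,
    6, 5, 5, 5, 5, 5, 5, 6,
};

struct Layout
{
    int base[64];  // first second-table index of each square
    int total;     // number of second-table entries
};

// Hands out index blocks, widest blocks first.
Layout makeLayout()
{
    Layout layout = {};
    for (int bits = 9; bits >= 5; --bits)
        for (int sq = 0; sq < 64; ++sq)
            if (kBits[sq] == bits)
            {
                layout.base[sq] = layout.total;
                layout.total += 1 << bits;
            }
    return layout;
}

// True if every base is a multiple of its block size,
// so that base | index equals base + index.
bool isAligned(const Layout& layout)
{
    for (int sq = 0; sq < 64; ++sq)
        if (layout.base[sq] % (1 << kBits[sq]) != 0)
            return false;
    return true;
}
```

The first 9-bit square starts at index 0, the first 7-bit square at 2048, the first 6-bit square at 3584 and the first 5-bit square at 3840, which corresponds to the prefixes in the layout above.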

And here is the initializing function.

```
short int bishopRows[64][6][64];
void initBishopRows()
{
    int baseIndex = 0;
    // Squares with more index bits come first, so that baseIndex
    // stays a multiple of (1 << bits).
    for ( int bits = 9; bits >= 5; --bits )
    {
        for ( int sq = 0; sq < 64; ++sq )
        {
            if ( squareBitsB[sq] != bits )
                continue;
            Bitboard bb = bishopBits[sq];
            bb >>= 9;
            int shift = 0; // index bits already used by lower rows
            for ( int row = 0; row < 6; ++row )
            {
                int p = (bb >> (row * 8)) & 0x3f; // relevant squares of this row
                for ( int pattern = 0; pattern < 64; ++pattern )
                {
                    int index = 0;
                    int s = shift;
                    for ( int i = 0; i < 6; ++i )
                    {
                        if ( p & (1 << i) )
                        {
                            index |= ( (pattern & (1 << i)) ? (1 << s) : 0 );
                            s++;
                            if ( pattern == 63 )
                                shift++;
                        }
                    }
                    bishopRows[sq][row][pattern] = baseIndex + index;
                }
            }
            baseIndex += (1 << bits);
        }
    }
}
```
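The innermost loops do one thing for every row and pattern: they pick the bits of the pattern that lie on relevant squares and pack them into consecutive index bits. The sketch below isolates that step in a helper. `compressRow` is a name invented here, and `shift` stands for the number of index bits already used by the lower rows. The packing is the same operation a parallel bit extract performs, followed by a shift.

```cpp
#include <cassert>

// Packs the bits of `pattern` selected by `relevant` into consecutive
// index bits, starting at bit position `shift`.
int compressRow(int relevant, int pattern, int shift)
{
    assert(shift >= 0 && shift < 16);
    int index = 0;
    for (int i = 0; i < 8; ++i)
    {
        if (relevant & (1 << i))
        {
            if (pattern & (1 << i))
                index |= 1 << shift;
            ++shift;  // every relevant square consumes one index bit
        }
    }
    return index;
}
```

With relevant squares `0x14` (bits 2 and 4), a blocker on bit 4 alone gives `compressRow(0x14, 0x10, 0) == 2`, and blockers on irrelevant squares do not change the result.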

A second table with 5248 entries can hold all bishop attack bitboards. This table must also be initialized.

```
Bitboard bishopAttackTable[5248];
void initBishopAttacks()
{
    int baseIndex = 0;
    // Same order as in initBishopRows(), so both tables agree on baseIndex.
    for ( int bits = 9; bits >= 5; --bits )
    {
        for ( int sq = 0; sq < 64; ++sq )
        {
            if ( squareBitsB[sq] != bits )
                continue;
            Bitboard bb = bishopBits[sq];
            for ( int index = 0; index < (1 << bits); ++ index )
            {
                // Spread the index bits over the relevant squares.
                Bitboard occ = 0;
                int i = index;
                for ( int rsq = 0; rsq < 64; ++rsq )
                {
                    if ( bb.test_bit( rsq ) )
                    {
                        if ( i & 1 )
                            occ.set_bit( rsq );
                        i >>= 1;
                    }
                }
                // Walk each ray up to and including the first blocker.
                Bitboard att = 0;
                int j;
                for ( j = sq + 9; j < 64 && (j & 7) != 0; j += 9 )
                {
                    att.set_bit( j );
                    if ( occ.test_bit( j ) )
                        break;
                }
                for ( j = sq + 7; j < 64 && (j & 7) != 7; j += 7 )
                {
                    att.set_bit( j );
                    if ( occ.test_bit( j ) )
                        break;
                }
                for ( j = sq - 9; j >= 0 && (j & 7) != 7; j -= 9 )
                {
                    att.set_bit( j );
                    if ( occ.test_bit( j ) )
                        break;
                }
                for ( j = sq - 7; j >= 0 && (j & 7) != 0; j -= 7 )
                {
                    att.set_bit( j );
                    if ( occ.test_bit( j ) )
                        break;
                }
                bishopAttackTable[baseIndex + index] = att;
            }
            baseIndex += (1 << bits);
        }
    }
}
```
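The first step inside the index loop is the reverse of the compression used for the first table: the bits of the index are spread over the relevant squares, lowest square first. Both tables must use the same bit order, and going through the squares in ascending order (lower rows first, lower bits first within a row) ensures that. A small sketch of this step on plain integers follows. `U64` and `indexToOccupancy` are names assumed for the example, not part of the engine.

```cpp
#include <cassert>
#include <cstdint>

using U64 = std::uint64_t;  // assumption: plain integer instead of the Bitboard class

// Spreads the low bits of `index` over the set bits of `mask`,
// lowest square first.
U64 indexToOccupancy(int index, U64 mask)
{
    assert(index >= 0);
    U64 occ = 0;
    for (; mask != 0; mask &= mask - 1, index >>= 1)
    {
        if (index & 1)
            occ |= mask & (~mask + 1);  // lowest set bit of mask
    }
    return occ;
}
```

Running the index from 0 to `(1 << bits) - 1` visits every subset of the mask exactly once.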

Ok, here comes the function to get the bishop attack bitboard for a square and occupied bitboard.

```
Bitboard bishopAttacks( int sq, Bitboard occ )
{
    // The remaining blocking pieces in the X-rays
    occ  &= bishopBits[sq];
    occ >>= 9;
    // Since every square has its own set of row values, the six row lookups
    // simply map any blockers to specific bits that, when ored together,
    // give an offset into the bishop attack table.
    short int *bRows = &bishopRows[sq][0][0];
    int index = (bRows +   0)[(occ >>  0) & 0x3f]  // row 2
              | (bRows +  64)[(occ >>  8) & 0x3f]  // row 3
              | (bRows + 128)[(occ >> 16) & 0x3f]  // row 4
              | (bRows + 192)[(occ >> 24) & 0x3f]  // row 5
              | (bRows + 256)[(occ >> 32) & 0x3f]  // row 6
              | (bRows + 320)[(occ >> 40) & 0x3f]; // row 7
    return bishopAttackTable[index];
}
```

Perhaps you should look at this function first to understand the algorithm.

Once the lookup tables have been created, which happens only once, the frequently called bishopAttacks() function is simple and fast. It is also branchless.
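To see all parts working together, here is a condensed, self-contained variant of the bishop code that can be compiled on its own. It is a sketch under a few assumptions rather than the Elephant implementation: a plain `std::uint64_t` replaces the `Bitboard` class, squares are numbered 0 to 63 row by row, the bit counts are derived from the masks instead of a literal table, and `slowBishopAttacks`, `initBishopTables` and `selfTest` are helper names introduced here. The self test compares the table lookup with a simple ray walk.

```cpp
#include <cassert>
#include <cstdint>

using U64 = std::uint64_t;

static U64 bishopBits[64];
static std::uint16_t bishopRows[64][6][64];
static U64 bishopAttackTable[5248];

// Slow reference: walk the four diagonals up to the first blocker or the edge.
U64 slowBishopAttacks(int sq, U64 occ)
{
    U64 att = 0;
    for (int df = -1; df <= 1; df += 2)
        for (int dr = -1; dr <= 1; dr += 2)
            for (int f = (sq & 7) + df, r = (sq >> 3) + dr;
                 f >= 0 && f < 8 && r >= 0 && r < 8; f += df, r += dr)
            {
                const U64 bit = U64(1) << (r * 8 + f);
                att |= bit;
                if (occ & bit)
                    break;
            }
    return att;
}

int popCount(U64 b)
{
    int n = 0;
    for (; b != 0; b &= b - 1)
        ++n;
    return n;
}

void initBishopTables()
{
    int baseIndex = 0;
    for (int bits = 9; bits >= 5; --bits)  // widest index blocks first
        for (int sq = 0; sq < 64; ++sq)
        {
            const U64 mask = slowBishopAttacks(sq, 0) & 0x007e7e7e7e7e7e00ULL;
            if (popCount(mask) != bits)
                continue;
            bishopBits[sq] = mask;
            // First table: every row pattern becomes a partial index.
            int shift = 0;
            for (int row = 0; row < 6; ++row)
            {
                const int p = int((mask >> (9 + row * 8)) & 0x3f);
                for (int pattern = 0; pattern < 64; ++pattern)
                {
                    int index = 0, s = shift;
                    for (int i = 0; i < 6; ++i)
                        if (p & (1 << i))
                        {
                            if (pattern & (1 << i))
                                index |= 1 << s;
                            ++s;
                        }
                    bishopRows[sq][row][pattern] = std::uint16_t(baseIndex + index);
                }
                shift += popCount(U64(p));
            }
            // Second table: one attack set per subset of the mask.
            for (int index = 0; index < (1 << bits); ++index)
            {
                U64 occ = 0, m = mask;
                for (int i = index; m != 0; m &= m - 1, i >>= 1)
                    if (i & 1)
                        occ |= m & (~m + 1);
                bishopAttackTable[baseIndex + index] = slowBishopAttacks(sq, occ);
            }
            baseIndex += 1 << bits;
        }
}

U64 bishopAttacks(int sq, U64 occ)
{
    occ = (occ & bishopBits[sq]) >> 9;
    const auto& rows = bishopRows[sq];
    const int index = rows[0][occ & 0x3f]         | rows[1][(occ >> 8) & 0x3f]
                    | rows[2][(occ >> 16) & 0x3f] | rows[3][(occ >> 24) & 0x3f]
                    | rows[4][(occ >> 32) & 0x3f] | rows[5][(occ >> 40) & 0x3f];
    return bishopAttackTable[index];
}

// Compares the lookup with the slow reference on pseudo-random boards.
bool selfTest()
{
    initBishopTables();
    U64 occ = 0x9E3779B97F4A7C15ULL;
    for (int n = 0; n < 10000; ++n)
    {
        occ ^= occ << 13; occ ^= occ >> 7; occ ^= occ << 17;  // xorshift step
        for (int sq = 0; sq < 64; ++sq)
            if (bishopAttacks(sq, occ) != slowBishopAttacks(sq, occ))
                return false;
    }
    return true;
}
```

Calling `selfTest()` once at program start both fills the tables and checks them. A 16-bit table entry is sufficient here because the largest index is 5247.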

There is a possible modification of this function that uses unions rather than >> and &. Some people believe that it is faster, but the byte layout depends on whether the machine is big-endian or little-endian.

```
#if BIG_ENDIAN() && !LITTLE_ENDIAN()
// Big end first means row 8 is first byte with my square encoding.
struct BBBytes
{
    unsigned char row8;
    unsigned char row7;
    unsigned char row6;
    unsigned char row5;
    unsigned char row4;
    unsigned char row3;
    unsigned char row2;
    unsigned char row1;
};
#elif LITTLE_ENDIAN() && !BIG_ENDIAN()
// Little end first means row 1 is first byte with my square encoding.
struct BBBytes
{
    unsigned char row1;
    unsigned char row2;
    unsigned char row3;
    unsigned char row4;
    unsigned char row5;
    unsigned char row6;
    unsigned char row7;
    unsigned char row8;
};
#else
#error big little endian
#endif

union BBUnion
{
    Bits64   bb;    // typedef uint64 Bits64;
    BBBytes  bbb;
};

Bitboard bishopAttacks( int sq, Bitboard occ )
{
    // The remaining blocking pieces in the X-rays
    occ  &= bishopBits[sq];
    // Shift by one file only, so that every row stays in its own byte.
    occ >>= 1;
    BBUnion occu;
    occu.bb = occ;
    // Since every square has its own set of row values, the six row lookups
    // simply map any blockers to specific bits that, when ored together,
    // give an offset into the bishop attack table.
    short int *bRows = &bishopRows[sq][0][0];
    int index = (bRows +   0)[occu.bbb.row2]  // row 2
              | (bRows +  64)[occu.bbb.row3]  // row 3
              | (bRows + 128)[occu.bbb.row4]  // row 4
              | (bRows + 192)[occu.bbb.row5]  // row 5
              | (bRows + 256)[occu.bbb.row6]  // row 6
              | (bRows + 320)[occu.bbb.row7]; // row 7
    return bishopAttackTable[index];
}
```
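One caveat for C++ code: reading a union member other than the one written last is not covered by the C++ standard, even though many compilers accept it. A byte copy with `std::memcpy` gives the same access to the row bytes in a well-defined way. The sketch below shows both the shift variant and the byte variant for a single row. The names `rowByShift`, `rowByBytes` and `isLittleEndian` are assumptions made for this example, and the byte order is detected at run time instead of with the engine's macros.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

using U64 = std::uint64_t;  // assumption: plain integer instead of the Bitboard class

// True if the least significant byte is stored first.
bool isLittleEndian()
{
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Row byte via shift and mask; does not depend on the byte order.
unsigned rowByShift(U64 bb, int row)
{
    assert(row >= 0 && row < 8);
    return unsigned(bb >> (row * 8)) & 0xffu;
}

// Row byte via a copy of the object representation; the position of
// the row in memory depends on the byte order.
unsigned rowByBytes(U64 bb, int row)
{
    assert(row >= 0 && row < 8);
    unsigned char bytes[8];
    std::memcpy(bytes, &bb, sizeof bytes);
    return bytes[isLittleEndian() ? row : 7 - row];
}
```

Whether the byte access is actually faster than the shifts depends on compiler and machine, so it is worth measuring before switching.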

# Rooks

Rooks are mostly treated like the bishops. It starts with their rays.

```
Bitboard rookBits[64];
void initRookBits()
{
    int sq;
    for ( sq = 0; sq < 64; ++sq )
    {
        rookBits[sq] = 0;
        int i;
        for ( i = sq - 1; i >= 0 && i % 8 != 7; --i )
            rookBits[sq] |= C64(1) << i;
        for ( i = sq - 8; i >= 0; i -= 8 )
            rookBits[sq] |= C64(1) << i;
        for ( i = sq + 1; i < 64 && i % 8 != 0; ++i )
            rookBits[sq] |= C64(1) << i;
        for ( i = sq + 8; i < 64; i += 8 )
            rookBits[sq] |= C64(1) << i;
        // Remove a border file or row unless the rook stands on it.
        if ( (sq & 7) != 7 )
            rookBits[sq] &= C64(0x7f7f7f7f7f7f7f7f);
        if ( (sq & 7) != 0 )
            rookBits[sq] &= C64(0xfefefefefefefefe);
        if ( (sq / 8) != 7 )
            rookBits[sq] &= C64(0x00ffffffffffffff);
        if ( (sq / 8) != 0 )
            rookBits[sq] &= C64(0xffffffffffffff00);
    }
}
```
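The four conditional masks amount to one rule: a square on a rook ray is relevant only if the ray continues behind it. The sketch below builds the mask directly from that rule. As before this is an illustration with assumed names (`U64`, `onBoard`, `rookMask`, `popCount`), plain integers and squares numbered 0 to 63 row by row.

```cpp
#include <cassert>
#include <cstdint>

using U64 = std::uint64_t;  // assumption: plain integer instead of the Bitboard class

bool onBoard(int f, int r)
{
    return f >= 0 && f < 8 && r >= 0 && r < 8;
}

// Relevant squares of a rook: every ray square except the last one of each ray.
U64 rookMask(int sq)
{
    assert(sq >= 0 && sq < 64);
    const int dFile[4] = { 1, -1, 0, 0 };
    const int dRow[4]  = { 0, 0, 1, -1 };
    U64 mask = 0;
    for (int d = 0; d < 4; ++d)
    {
        int f = (sq & 7) + dFile[d];
        int r = (sq >> 3) + dRow[d];
        // A square counts only while the ray continues behind it.
        for (; onBoard(f + dFile[d], r + dRow[d]); f += dFile[d], r += dRow[d])
            mask |= U64(1) << (r * 8 + f);
    }
    return mask;
}

int popCount(U64 b)
{
    int n = 0;
    for (; b != 0; b &= b - 1)
        ++n;
    return n;
}
```

A rook in the corner keeps 12 relevant squares, a rook elsewhere on the border 11 and a rook on the inner board 10.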

There are 36 squares that need 10 bits, like Rd4. The other squares need more bits.

```
const byte squareBitsR[64] =
{
    12, 11, 11, 11, 11, 11, 11, 12,
    11, 10, 10, 10, 10, 10, 10, 11,
    11, 10, 10, 10, 10, 10, 10, 11,
    11, 10, 10, 10, 10, 10, 10, 11,
    11, 10, 10, 10, 10, 10, 10, 11,
    11, 10, 10, 10, 10, 10, 10, 11,
    11, 10, 10, 10, 10, 10, 10, 11,
    12, 11, 11, 11, 11, 11, 11, 12,
};
```

This is the layout of the first big table with index-bits for each square and all rows. The indices are in this order:

```
00000cccccccccccc 4 times 12 bit
00001cccccccccccc
00010cccccccccccc
00011cccccccccccc
001000bbbbbbbbbbb 24 times 11 bit
0.....bbbbbbbbbbb
011111bbbbbbbbbbb
1000000aaaaaaaaaa 36 times 10 bit
.......aaaaaaaaaa
1100011aaaaaaaaaa
```

For rooks we need a bigger second table with 102400 entries. This makes the first table bigger as well, because its values no longer fit into 2-byte short integers. In addition, the last dimension of the first table has to cover all 8 bits of a row, because rooks on file a or h have rays along the outer files. For the same reason all 8 rows are looked up, since rooks on row 1 or 8 have rays along those rows.
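The numbers can be checked with a short calculation. The sketch below derives the bit count of a rook from its position (6 relevant squares along a line when the rook stands at the end of it, otherwise 5) and sums up the table sizes. The function names are invented for the example, and the byte counts assume 4-byte table values and 8-byte bitboards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Index bits of a rook on sq.
int rookBitCount(int sq)
{
    assert(sq >= 0 && sq < 64);
    const int file = sq & 7;
    const int row  = sq >> 3;
    const int alongRow  = (file == 0 || file == 7) ? 6 : 5;
    const int alongFile = (row == 0 || row == 7) ? 6 : 5;
    return alongRow + alongFile;
}

// Number of entries in the second rook table.
long rookTableEntries()
{
    long total = 0;
    for (int sq = 0; sq < 64; ++sq)
        total += 1L << rookBitCount(sq);
    return total;
}

// Size of the first table: 64 squares, 8 rows, 256 patterns, 4-byte values.
std::size_t rookRowsBytes()
{
    return std::size_t(64) * 8 * 256 * sizeof(std::int32_t);
}

// Size of the second table with 8-byte bitboards.
std::size_t rookTableBytes()
{
    return std::size_t(rookTableEntries()) * sizeof(std::uint64_t);
}
```

The largest index is 102399, which exceeds the range of a 16-bit integer. The first table takes 512 KiB and the second 800 KiB under these assumptions.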

Here is the initializing function.

```
int rookRows[64][8][256];
void initRookRows()
{
    int baseIndex = 0;
    // Squares with more index bits come first, so that baseIndex
    // stays a multiple of (1 << bits).
    for ( int bits = 12; bits >= 10; --bits )
    {
        for ( int sq = 0; sq < 64; ++sq )
        {
            if ( squareBitsR[sq] != bits )
                continue;
            Bitboard bb = rookBits[sq];
            int shift = 0; // index bits already used by lower rows
            for ( int row = 0; row < 8; ++row )
            {
                int p = (bb >> (row * 8)) & 0xff; // relevant squares of this row
                for ( int pattern = 0; pattern < 256; ++pattern )
                {
                    int index = 0;
                    int s = shift;
                    for ( int i = 0; i < 8; ++i )
                    {
                        if ( p & (1 << i) )
                        {
                            index |= ( (pattern & (1 << i)) ? (1 << s) : 0 );
                            s++;
                            if ( pattern == 255 )
                                shift++;
                        }
                    }
                    rookRows[sq][row][pattern] = baseIndex + index;
                }
            }
            baseIndex += (1 << bits);
        }
    }
}
```

A second table with 102400 entries can hold all rook attack bitboards. This table must also be initialized.

```
Bitboard rookAttackTable[102400];
void initRookAttacks()
{
    int baseIndex = 0;
    // Same order as in initRookRows(), so both tables agree on baseIndex.
    for ( int bits = 12; bits >= 10; --bits )
    {
        for ( int sq = 0; sq < 64; ++sq )
        {
            if ( squareBitsR[sq] != bits )
                continue;
            Bitboard bb = rookBits[sq];
            for ( int index = 0; index < (1 << bits); ++ index )
            {
                // Spread the index bits over the relevant squares.
                Bitboard occ = 0;
                int i = index;
                for ( int rsq = 0; rsq < 64; ++rsq )
                {
                    if ( bb.test_bit( rsq ) )
                    {
                        if ( i & 1 )
                            occ.set_bit( rsq );
                        i >>= 1;
                    }
                }
                // Walk each ray up to and including the first blocker.
                Bitboard att = 0;
                int j;
                for ( j = sq + 1; j < 64 && (j & 7) != 0; ++j )
                {
                    att.set_bit( j );
                    if ( occ.test_bit( j ) )
                        break;
                }
                for ( j = sq + 8; j < 64; j += 8 )
                {
                    att.set_bit( j );
                    if ( occ.test_bit( j ) )
                        break;
                }
                for ( j = sq - 1; j >= 0 && (j & 7) != 7; --j )
                {
                    att.set_bit( j );
                    if ( occ.test_bit( j ) )
                        break;
                }
                for ( j = sq - 8; j >= 0; j -= 8 )
                {
                    att.set_bit( j );
                    if ( occ.test_bit( j ) )
                        break;
                }
                rookAttackTable[baseIndex + index] = att;
            }
            baseIndex += (1 << bits);
        }
    }
}
```

Ok, here comes the function to get the rook attack bitboard for a square and occupied bitboard.

```
Bitboard rookAttacks( int sq, Bitboard occ )
{
    // The remaining blocking pieces in the +-rays
    occ &= rookBits[sq];
    // Since every square has its own set of row values, the eight row lookups
    // simply map any blockers to specific bits that, when ored together,
    // give an offset into the rook attack table.
    int *rRows = &rookRows[sq][0][0];
    int index = (rRows +    0)[(occ >>  0) & 0xff]  // row 1
              | (rRows +  256)[(occ >>  8) & 0xff]  // row 2
              | (rRows +  512)[(occ >> 16) & 0xff]  // row 3
              | (rRows +  768)[(occ >> 24) & 0xff]  // row 4
              | (rRows + 1024)[(occ >> 32) & 0xff]  // row 5
              | (rRows + 1280)[(occ >> 40) & 0xff]  // row 6
              | (rRows + 1536)[(occ >> 48) & 0xff]  // row 7
              | (rRows + 1792)[(occ >> 56) & 0xff]; // row 8
    return rookAttackTable[index];
}
```

Perhaps you should look at this function first to understand the algorithm.

Once the lookup tables have been created, which happens only once, the frequently called rookAttacks() function is simple and fast. It is also branchless.

Again there is the option to use unions.

```
Bitboard rookAttacks( int sq, Bitboard occ )
{
    // The remaining blocking pieces in the +-rays
    occ &= rookBits[sq];
    BBUnion occu;
    occu.bb = occ;
    // Since every square has its own set of row values, the eight row lookups
    // simply map any blockers to specific bits that, when ored together,
    // give an offset into the rook attack table.
    int *rRows = &rookRows[sq][0][0];
    int index = (rRows +    0)[occu.bbb.row1]  // row 1
              | (rRows +  256)[occu.bbb.row2]  // row 2
              | (rRows +  512)[occu.bbb.row3]  // row 3
              | (rRows +  768)[occu.bbb.row4]  // row 4
              | (rRows + 1024)[occu.bbb.row5]  // row 5
              | (rRows + 1280)[occu.bbb.row6]  // row 6
              | (rRows + 1536)[occu.bbb.row7]  // row 7
              | (rRows + 1792)[occu.bbb.row8]; // row 8
    return rookAttackTable[index];
}
```

# Results

The results of the functions bishopAttacks() and rookAttacks() can be used in the same way as described in exploding bitboards.