Kindergarten Bitboards
Home * Board Representation * Bitboards * Sliding Piece Attacks * Kindergarten Bitboards
Kindergarten bitboards ^{[2]},
was a kind of interactive forum development ^{[3]} with a lot of meanders ^{[4]} . There were two issues involved  first to calculate the occupancy of any line from the occupied bitboard ^{[5]}  and second, compact and dense lookup tables. As a quintessence Gerd Isenberg came up with this nomination. It relies on fast 64bit multiplication, but is otherwise quite resource friendly and a compromise between calculation and tablesize.
Contents
Ranks and Diagonals
Ranks and diagonals  that is their appropriate linemask by squareindex  are first intersected by the occupancy of the whole board. Doesn't matter whether the slider itself is cleared or not  it is redundant anyway, considered by the precalculated lookuptable.
Since there is only up to one bit per file, the northfill multiplication by the Afile maps the diagonal to the 8th rank. Or  since we only need the inner six bits, we combine the required shift left one by multiplying with the Bfile. Shifting right the product by 58 (646) leaves the sixbit occupancyindex in the 0..63 range. For instance the diagonalattacks of a bishop on d4. 'A''H' represent the masked occupied bits along this diagonal, which are either zero or one.
Code samples and bitboard diagrams rely on Little endian file and rank mapping. 
We need 'B''G' as six bit number:
masked line * BFile = BG upper six occupancy 6 bit . . . . . . . H . 1 . . . . . . . A[B C . E F G] . . . . . . . . . . . . . . G . . 1 . . . . . . . A B C . E F G . . . . . . . . . . . . . F . . . 1 . . . . . . . A B C . E F . . . . . . . . . . . . . E . . . . 1 . . . . . . . A B C . E . . >> . . . . . . . . . . . . . . . . * . 1 . . . . . . = . A B C . . . . 58 . . . . . . . . . . C . . . . . . 1 . . . . . . . A B C . . . . . . . . . . . . . B . . . . . . . 1 . . . . . . . A B . . . . . . . . . . . . . A . . . . . . . . 1 . . . . . . . A . . . . . . [B C . E F G]. .
The precalculated lookuptable contains the attacks of the first rank  but eight copies in each rank or byte. It is indexed by the six bit occupiedstate ('B''G') and the file of the slider's square. It needs to be intersected with the same linemask as formerly the occupancy  to map the first rank attack bits to the appropriate line  that's all. Appropriate precalculated attack bits are represented by 'a''h':
8 copies of rank the attack set attacks & lmask > of this line a b c . e f g h . . . . . . . h a b c . e f g h . . . . . . g . a b c . e f g h . . . . . f . . a b c . e f g h . . . . e . . . a b c . e f g h . . . . . . . . a b c . e f g h . . c . . . . . a b c . e f g h . b . . . . . . a b c . e f g h a . . . . . . .
Since all ranks, diagonals and antidiagonals are properly filealigned, it works perfectly with some redundant occupied bits for shorter diagonals as well, like here the outer bit 'B':
masked line * BFile = BG upper six occupancy 6 bit . . . . . . . . . 1 . . . . . . H .[B C . E F G] . . . . . . . . . . . . . . . H . 1 . . . . . . . . B C . E F G . . . . . . . . . . . . . . G . . 1 . . . . . . . . B C . E F G . . . . . . . . . . . . . F . . . 1 . . . . . . . . B C . E F . >> . . . . . . . . . . . . E . . . * . 1 . . . . . . = . . B C . E . . 58 . . . . . . . . . . . . . . . . . 1 . . . . . . . . B C . . . . . . . . . . . . . . C . . . . . . 1 . . . . . . . . B C . . . . . . . . . . . . . B . . . . . . . 1 . . . . . . . . B . . . . . [B C . E F G]. .
Appropriate precalculated attack bits are represented by 'b''h' here:
8 copies of rank the attack set or the attack set attacks & lmask > of this line > of the shorter diagonal a b c . e f g h . . . . . . . h . . . . . . . . a b c . e f g h . . . . . . g . . . . . . . . h a b c . e f g h . . . . . f . . . . . . . . g . a b c . e f g h . . . . e . . . . . . . . f . . a b c . e f g h . . . . . . . . . . . . e . . . a b c . e f g h . . c . . . . . . . . . . . . . a b c . e f g h . b . . . . . . . . c . . . . . a b c . e f g h a . . . . . . . . b . . . . . .
Wasn't that simple? That is why it is called kindergarten bitboards!
The trick is to share one 4KByte table by three linedirections by reusing the mask for a final intersection. Of course one may use the calculated occupied state to index rotated bitboards like tables of 32KByte each. But dividing the table size by 3*8 on the cost of that additional 'and' (and keeping the mask inside a register) is tempting. Of course  like always with computation versus memory issues  it depends on the cache and memory using and footprint inside a particular chess program and the hardware architecture, which solution is preferable. So far L1 Cache is a rare resource, Translation Lookaside Buffer als well.
Like other none rotated approaches, namely magic bitboards or hyperbola quintessence, the nice thing is that one can hide the implementation details behind a stateless interface. In C /C++ one may use header files with exclusive, conditional compiled inlined routines, as combinations and variations of the mentioned approaches.
The three routines only differ by the linemask applied. As pointed out by Aleks Peshkov, it is smarter to index by file, occupancy, since fillUpAttacks[sq&7] may be shared by two (bishop) or three (queen) lineattack getters.
U64 fillUpAttacks[8][64]; // 4 KByte U64 diagonalAttacks(U64 occ, enumSquare sq) { const U64 bFile = C64(0x0202020202020202); occ = (diagonalMaskEx[sq] & occ) * bFile >> 58; return diagonalMaskEx[sq] & fillUpAttacks[sq&7][occ]; } U64 antiDiagAttacks(U64 occ, enumSquare sq) { const U64 bFile = C64(0x0202020202020202); occ = (antidiagMaskEx[sq] & occ) * bFile >> 58; return antidiagMaskEx[sq] & fillUpAttacks[sq&7][occ]; } U64 rankAttacks(U64 occ, enumSquare sq) { const U64 bFile = C64(0x0202020202020202); occ = (rankMaskEx[sq] & occ) * bFile >> 58; return rankMaskEx[sq] & fillUpAttacks[sq&7][occ]; }
One may use similar structs for the linemasks than the hyperbola quintessence.
FileAttacks
Files need tad more work. Shift the board left (arithmetical right!) to the Afile to mask it. To get the inner six bits, a flipmultiplication by the c2h7 diagonal is applied with further shift right 58. The lookuptable contains the Afile attacks, which are shifted "back" to the original file.
U64 aFileAttacks [8][64]; // 4 KByte U64 fileAttacks(U64 occ, enumSquare sq) { const U64 aFile = C64(0x0101010101010101); const U64 diac2h7 = C64(0x0080402010080400); occ = aFile & (occ >> (sq&7)); occ = (diac2h7 * occ ) >> 58; return aFileAttacks[sq>>3][occ] << (sq&7); }
This is how it works:
masked Afile * c2h7 Diagonal = occupancy H . . . . . . . . . . . . . . . . .[G F E D C B] . . . . . . . . G . . . . . . . . . . . . . . 1 . . F E D C B A . . . . . . . . F . . . . . . . . . . . . . 1 . . . E D C B A . . . . . . . . . E . . . . . . . . . . . . 1 . . . . D C B A . . >> . . . . . . . . D . . . . . . . * . . . . 1 . . . = . . C B A . . . 58 . . . . . . . . C . . . . . . . . . . 1 . . . . . . B A . . . . . . . . . . . . B . . . . . . . . . 1 . . . . . . . A . . . . . . . . . . . . . A . . . . . . . . . . . . . . . . . . . . . . . [G F E D C B]. .
Note that the six inner bit occupancy is reversed  considered in the precalculated aFileAttacks array. This reversed lookup was justified to share first rankattacks by all directions  with a dense lookup of 512 Byte. But the 4KByte tables outperform the additional multiplications and shift of the dense version  and one may alternatively multiply with the flipped diagonal, the c7h2 antidiagonal:
masked Afile * c7h2 AntiDiag = occupancy H . . . . . . . . . . . . . . . . .[B C D E F G] . . . . . . . . G . . . . . . . . . 1 . . . . . . . A B C D E F . . . . . . . . F . . . . . . . . . . 1 . . . . . . . A B C D E . . . . . . . . E . . . . . . . . . . . 1 . . . . . . . A B C D >> . . . . . . . . D . . . . . . . * . . . . . 1 . . = . . . . . A B C 58 . . . . . . . . C . . . . . . . . . . . . . 1 . . . . . . . A B . . . . . . . . B . . . . . . . . . . . . . . 1 . . . . . . . A . . . . . . . . A . . . . . . . . . . . . . . . . . . . . . . . [B C D E F G]. .
As often, computation versus memory size. One may share a 512Byte Lookup of the first rank by all lines with some trailing computation. Multiplying with the Afile (fill north) for ranks and diagonals, and multiplying with the diagonal for the file. Likely the additional multiplication don't pays off.
const BYTE firstRankAttacks[8][64]; U64 fileAttacks(U64 occ, enumSquare sq) { const U64 aFile = C64(0x0101010101010101); const U64 hFile = C64(0x8080808080808080); const U64 diaa1h8 = C64(0x8040201008040201); const U64 diac2h7 = C64(0x0080402010080400); unsigned int f = sq & 7; occ = aFile & (occ >> f); occ = ( diac2h7 * occ ) >> 58; occ = diaa1h8 * firstRankAttacks[(sq^56)>>3][occ]; return ( hFile & occ ) >> (f^7); } U64 diagonalAttacks(U64 occ, enumSquare sq) { const U64 aFile = C64(0x0101010101010101); const U64 bFile = C64(0x0202020202020202); unsigned int f = sq & 7; occ = diagonalMaskEx[sq] & occ; occ = (bFile * occ ) >> 58; occ = aFile * firstRankAttacks[f][occ]; return diagonalMaskEx[sq] & occ; }
32bit Versions
One other variation of the memory versus computation theme was encouraged by 32bit mode. 64bit multiplication is quite expensive in 32bit mode  a call using three imuls. Thus, it is more efficient to use shiftor plus 32bit multiplication, which might in fact be used in 64bit mode as well. Piotr Cichy proposed a multiplication less parallel prefix shift approach similar to Occupancy of any Line ^{[6]} , which is a good alternative for processors with slow multiplication.
An efficient and tricky fileapproach was introduced by Zach Wegner ^{[7]}, using a 32KByte, rotated like lookuptable: It is quite strange, yes, but it is an out of order mapping. There are only 5 bits because each bit in the factor maps more than one bit. The trick here is the odd shift 29, so that the multiply does not overflow individual bits. I have since found that 25 and 27 will work with the same magic:
occ . . . . . . . . a . . . . . . . b . . . . . . . occ  occ >> 29 * 0x01041041 with the index bracketed c . . . . . . . ...\ ...\ ...\ d . . . . . . . d . . . . . . . 1 . . . . . . . d a[f c e b d a] e . . . . . . . e . . a . . . . . . 1 . . . . . e b . a f c e b f . . . . . . . f . . b . . . . . . . . 1 . . . f c . b . . f c . . . . . . . . . . . c . . . . 1 . . . . . 1 . . . . c . . . .
The interesting thing is that this works for any masked file. In fact if it was shifted to the afile, you could get away with the 3bit factor 0x00041040 (but using a shift of 23).
U64 arrFileAttacks[64][64]; // [sq][occ64] 32KByte U64 fileAttacks(U64 occ, enumSquare sq) { occ &= fileMask[sq]; U32 fold = (U32)occ  (U32)(occ >> 29); U32 occ64 = fold * 0x01041041 >> 26; return arrFileAttacks[sq][occ64]; }
Ranks and diagonals are trivial, this version favors rotated like memory size for less computation and same operations than fileattacks. One may therefor generalize the routine by a linedirection parameter:
U64 arrDiagonalAttacks[64][64]; // [sq][occ64] 32KByte U64 diagonalAttacks(U64 occ, enumSquare sq) { occ &= diagonalMaskEx[sq]; U32 fold = (U32)occ  (U32)(occ >> 32); U32 occ64 = fold * 0x02020202 >> 26; return arrDiagonalAttacks[sq][occ64]; }
A similar approach was proposed by Andrew Fan in 2009, been active in his own engine for a few years (2006 earliest recorded file time) ^{[8]}.
Magic Compression
So far Kindergarten bitboards performs a perfect hashing of the up to six relevant and scattered occupied bits of any line to a sixbit index  which is a bijective mapping of 64 different occupancies per line to 64 indices for the precalculated attack sets.
If we have a closer look to the attack sets, say of a rook on the afile, we enumerate far less disjoint sets. A rook on a1 (a8) has seven different attacksets on that file, depending on the occupancy of a2a7. On a2 (a7) there is even one attack set less, on a3 (a6) 2 times 5 and on a4 (a5) 3 times 4 attacksets. Thus, there are {7, 6, 10, 12, 12, 10, 6, 7} disjoint attacksets per square on line, or 70 in total over all eight squares.
While kindergarten bitboards apply a minimal perfect mapping of scattered bits to a sixbit index, the mapping of the attacksets is surjective, since each of the 64 occupancies maps only up to 12 distinct sets. Of course that is because occupancies "behind" the first blocker are redundant and map the same attack.
Grant Osborne came up with the idea, derived from magic bitboards  to use different "magic" factors per square (rank), where multiplication may produce carries and enough so called constructive collisions to gain only five or even four bit indices and therefor denser tables. Since different squares may have different table sizes (16 or 32 entries), a Javalike array is used for the attacks, in C implemented as array of pointers to the arbitrary sized attack tables. The variable right shift by either 60 or 59 is encoded inside the otherwise redundant upper six bits of the magic factor, as mentioned in incorporating the shift of magic bitboards.
Grant's proposal, so far with {5,4,4,5,5,4,4,5} bit ranges for the lookups per square for vertical rook attacks, results in a 1.5 KByte array instead the 4KByte of the initial Kindergarten file attack getter ^{[9]} . Whether the effort of the rankindexed magicfactor plus additional pointer indirection pays off the memory saving is another question, and should be tried inside a concrete chess program with its individual cache and memory footprint.
U64 aFileAttacks[4*32+4*16]; // 1.5KByte U64 aPtrFileAttacks[8]; // points to appropriate aFileAttacks U64 fileMagic[8] = { 0xEFFFA39DB01B23A3, // 5bit 0xF024691A3227FF42, // 4bit 0xF2808817CAD6FF0C, // 4bit see below 0xED6EDFBE467977D5, // 5bit 0xEC87CB0D961EC43A, // 5bit 0xF2FF594E14D8801C, // 4bit 0xF2FF5D69D4E3E7D6, // 4bit 0xEE404B349599FF88 // 5bit }; U64 fileAttacks(U64 occ, enumSquare sq) { unsigned int file = sq & 7; unsigned int rank = sq >> 3; occ = 0x0001010101010100 & (occ >> file); occ = (fileMagic[rank] * occ) >> (fileMagic[rank] >> 58); // four&five bit index return *(aPtrFileAttacks[rank] + occ) << file; }
The table demonstrates how it works for fileattack of the a3 rook with a four bit range only five relevant occupied bits, since a3 is member of the inner six bits. The empirical determined factor is 0xF2808817CAD6FF0C, six upper bits contain the right shift for the product, for this square shift 60:
occupancy (AFile)  product  index 0..15  attack set 

o  outer squares don't care x  empty or any piece 
occupancy *

upper nibble 
1 attacked . not attacked 
o x 
1. attackset  . .  
0x0000010101000100

0x3A28F9D5E2FF0C00

3  0x0000000001000100

0x0001010101000100

0x3934F9D5E2FF0C00

3  0x0000000001000100

0x0000000101000100

0x6329EDD5E2FF0C00

6  0x0000000001000100

0x0000010001000100

0x6F51FAC9E2FF0C00

6  0x0000000001000100

0x0001000101000100

0x6235EDD5E2FF0C00

6  0x0000000001000100

0x0001010001000100

0x6E5DFAC9E2FF0C00

6  0x0000000001000100

0x0000000001000100

0x9852EEC9E2FF0C00

9  0x0000000001000100

0x0001000001000100

0x975EEEC9E2FF0C00

9  0x0000000001000100

o x 
2. attack set  . .  
0x0000000001000000

0x17CAD6FF0C000000

1  0x0000000001000101

0x0001000001000000

0x16D6D6FF0C000000

1  0x0000000001000101

0x0000010101000000

0xB9A0E20B0C000000

11  0x0000000001000101

0x0001010101000000

0xB8ACE20B0C000000

11  0x0000000001000101

0x0000000101000000

0xE2A1D60B0C000000

14  0x0000000001000101

0x0000010001000000

0xEEC9E2FF0C000000

14  0x0000000001000101

0x0001000101000000

0xE1ADD60B0C000000

14  0x0000000001000101

0x0001010001000000

0xEDD5E2FF0C000000

14  0x0000000001000101

o x 
3. attack set  . .  
0x0001010100000100

0x216A22D6D6FF0C00

2  0x0000000101000100

0x0000010100000100

0x225E22D6D6FF0C00

2  0x0000000101000100

0x0000000100000100

0x4B5F16D6D6FF0C00

4  0x0000000101000100

0x0001000100000100

0x4A6B16D6D6FF0C00

4  0x0000000101000100

o x 
4. attack set  . .  
0x0000010100000000

0xA1D60B0C00000000

10  0x0000000101000101

0x0001010100000000

0xA0E20B0C00000000

10  0x0000000101000101

0x0000000100000000

0xCAD6FF0C00000000

12  0x0000000101000101

0x0001000100000000

0xC9E2FF0C00000000

12  0x0000000101000101

o x 
5. attack set  . .  
0x0000010000000100

0x578723CAD6FF0C00

5  0x0000010101000100

0x0001010000000100

0x569323CAD6FF0C00

5  0x0000010101000100

o x 
6. attack set  . .  
0x0000010000000000

0xD6FF0C0000000000

13  0x0000010101000101

0x0001010000000000

0xD60B0C0000000000

13  0x0000010101000101

o b 
7. attack set  . 1  
0x0001000000000100

0x7F9417CAD6FF0C00

7  0x0001010101000100

o b 
8. attack set  . 1  
0x0001000000000000

0xFF0C000000000000

15  0x0001010101000101

o . 
9. attack set  1 1  
0x0000000000000100

0x808817CAD6FF0C00

8  0x0101010101000100

o . 
10. attack set,
no blocker 
1 1  
0x0000000000000000

0x0000000000000000

0  0x0101010101000101

See also
 Flipping, Mirroring and Rotating of Rank, File and Diagonal
 Magic Bitboards
 Occupancy of any Line
 Rotated Bitboards
 Sliding Piece Attacks in OliThink
 Sliding Piece Attacks in Winglet
Forum Posts
 Re: Rotated bitboards by Urban Koistinen, rgcc, October 31, 1997 » Collapsed files, Collapsed ranks
 rotated bitboards obsolete? by Gerd Isenberg, CCC, February 26, 2006
 Re: Some thoughts on Dann Corbit's rotated alternative by Steffan Westcott, CCC, March 03, 2006
 Compact Bitboard Attacks by Tom Likens, Winboard Forum, March 14, 2006
 Zach's tricky 32bit approach by Zach Wegner, Winboard Forum, August 22, 2006
 Magic Bitboards Explained! by Michael Sherwin, Winboard Forum, December 04, 2006
 Kindergarten Bitboard Approach by Gerd Isenberg by Edsel Apostol, CCC, November 05, 2008 » OliThink
 Kindergarten bitboards without multiplying by Piotr Cichy, CCC, August 07, 2009
 Kindergarten Bitboard help by Gregory Strong, CCC, November 22, 2009
 32bit Magic experiments by Andrew Fan, Winboard Forum, December 03, 2009 » Magic Bitboards
 Kindergarten bitboards and Xiangqi move genaration? by Han Chengye, CCC, December 30, 2009 » Chinese Chess
External Links
 Kindergarten from Wikipedia
 George Benson  This Masquerade, 1976, YouTube Video
 Pat Metheny and friends  This Masquerade, Jazz Baltica, July 6, 2003, YouTube Video
References
 ↑ Paul Klee  Group of Masks, from the Israel Museum
 ↑ Magic Bitboards Explained! by Michael Sherwin and reply by Gerd Isenberg to call it Kindergarten Bitboards, Winboard Forum, December 4, 2006
 ↑ Compact Bitboard Attacks by Tom Likens, Winboard Forum, March 14, 2006
 ↑ rotated bitboards obsolete? by Gerd Isenberg, CCC, February 26, 2006
 ↑ Re: Some thoughts on Dann Corbit's rotated alternative by Steffan Westcott, CCC, March 03, 2006
 ↑ Kindergarten bitboards without multiplying by Piotr Cichy, CCC, August 07, 2009
 ↑ Zach's tricky 32bit approach by Zach Wegner, Winboard Forum, August 22, 2006
 ↑ 32bit Magic experiments by Andrew Fan, Winboard Forum, December 03, 2009
 ↑ Re: How to reduce the "bits" used in a magic number by Grant Osborne, CCC, July 04, 2008