Changes

Jump to: navigation, search

Kogge-Stone Algorithm

13,068 bytes added, 21:05, 9 May 2018
Created page with "'''Home * Board Representation * Bitboards * Sliding Piece Attacks * Kogge-Stone Algorithm''' FILE:4 bit Kogge Stone Adder Example new.png|border|..."
'''[[Main Page|Home]] * [[Board Representation]] * [[Bitboards]] * [[Sliding Piece Attacks]] * Kogge-Stone Algorithm'''

[[FILE:4 bit Kogge Stone Adder Example new.png|border|right|thumb|4-bit Kogge-Stone adder <ref>[https://en.wikipedia.org/wiki/Kogge-Stone_adder Kogge-Stone adder from Wikipedia]</ref> ]]

The '''Kogge-Stone Algorithm''' for set-wise [[Sliding Pieces|sliding piece]] attack generation was first introduced by [[Steffan Westcott]] in [[CCC]] <ref>[https://www.stmintz.com/ccc/index.php?id=252289 Re: flood fill attack bitboards] by [[Steffan Westcott]], [[CCC]], September 15, 2002</ref> . It is a [[Parallel Prefix Algorithms|parallel prefix]] approach of a occluded [[Dumb7Fill|dumb7]] [https://en.wikipedia.org/wiki/Flood_fill flood-fill], propagating sliding piece attacks like [https://en.wikipedia.org/wiki/Carry_(arithmetic) carries] of a [https://en.wikipedia.org/wiki/Kogge-Stone_adder Kogge-Stone] [[Combinatorial Logic#Adder|hardware adder]] <ref>[http://www.aoki.ecei.tohoku.ac.jp/arith/mg/algorithm.html#fsa_pfx Hardware algorithms for arithmetic modules] from the ARITH research group, Aoki lab., [https://en.wikipedia.org/wiki/Tohoku_University Tohoku University]</ref> in [[Parallel Prefix Algorithms#KoggeStoneAdder|software]]. One needs to pass sliding pieces as generator set "g" and the set of empty squares as propagator set "p". For appropriate attacks we need to shift the occluded fill [[General Setwise Operations#OneStepOnly|one step]] further, considering wraps.
{{MappingHint}}
=Parallel Prefix=
The routine fillUpOccluded() smears the set bits of bitboard g upwards, but only along set bits of p; a reset bit in p is enough to halt a smear. Other routines have a similar effect.
<pre>
U64 fillUpOccluded(U64 g, U64 p) {
g |= p & (g << 8);
p &= (p << 8);
g |= p & (g << 16);
p &= (p << 16);
g |= p & (g << 32);
return g;
}
</pre>
The method chosen in FillUpOccluded() is based on a Kogge-Stone parallel prefix network, because it can be implemented very easily in software. The diagram below (trust me, it really _is_ supposed to look like that) is an illustration of how it works. The corresponding lines of program code are shown on the right. The inputs are fed into the network at the top, pass along the connecting lines, are combined by the # operator at various points, and the outputs appear at the bottom.
<pre>
x1 x2 x3 x4 x5 x6 x7 x8 Input : g, p
| | | | | | | |
V V V V V V V V
| | | | | | | |
| | | | | | | |
|\ |\ |\ |\ |\ |\ |\ |
| \| \| \| \| \| \| \|
| # # # # # # # g |= p & (g << 8);
| | | | | | | | p &= (p << 8);
|\ |\ |\ |\ |\ |\ | |
| \: \: \: \: \: \: |
| \ \ \ \ \ \ |
| :\ :\ :\ :\ :\ :\ |
| | \| \| \| \| \| \|
| | # # # # # # g |= p & (g << 16);
| | | | | | | | p &= (p << 16);
|\ |\ |\ |\ | | | |
| \: \: \: \: | | |
| \ \ \ \ | | |
| :\ :\ :\ :\ | | |
| | \: \: \: \: | |
| | \ \ \ \ | |
| | :\ :\ :\ :\ | |
| | | \: \: \: \: |
| | | \ \ \ \ |
| | | ;\ ;\ :\ :\ |
| | | | \| \| \| \|
| | | | # # # # g |= p & (g << 32);
| | | | | | | |
| | | | | | | |
V V V V V V V V
q1 q2 q3 q4 q5 q6 q7 q8 Output : g
</pre>

=Direction-wise Fill=
''We further rely on the [https://en.wikipedia.org/wiki/Compass_rose compass rose] to identify [[Direction|ray-directions]].''
<pre>
noWe nort noEa
+7 +8 +9
\ | /
west -1 <- 0 -> +1 east
/ | \
-9 -8 -7
soWe sout soEa
</pre>
Assuming [[Square Mapping Considerations#LittleEndianRankFileMapping|little-endian]] file mapping. Big-endian file mapping has to swap A and H and east and west. As a reminder - [[General Setwise Operations#ShiftingBitboards|shifting bitboards]] - the base of further stuff.

<span id="Fillonanemptyboard"></span>
==Fill on an empty Board==
We already used the south- and north- [[Parallel Prefix Algorithms|parallel prefix]] [[Pawn Fills|fill-routines]] while calculating [[Pawn Spans|pawn spans]]. We need one additional direction step for ray-attacks on the otherwise empty board to exclude sliders. Convenient to initialize [[On an empty Board#RayAttacks|ray-attacks]] [[Array|arrays]] of single sliders. We may conduct those routines from the general occluded fills by passing the universal propagator set. The vertical fills already look familiar.
<pre>
U64 soutFill(U64 gen) {
gen |= (gen >> 8);
gen |= (gen >> 16);
gen |= (gen >> 32);
return gen;
}

U64 nortFill(U64 gen) {
gen |= (gen << 8);
gen |= (gen << 16);
gen |= (gen << 32);
return gen;
}
</pre>
Using explicit propagators as compile time constants to manage the A-, H-file-wraps.
<pre>
U64 eastFill(U64 gen) {
const U64 pr0 = notAFile;
const U64 pr1 = pr0 & (pr0 << 1);
const U64 pr2 = pr1 & (pr1 << 2);
gen |= pr0 & (gen << 1);
gen |= pr1 & (gen << 2);
gen |= pr2 & (gen << 4);
return gen;
}

U64 noEaFill(U64 gen) {
const U64 pr0 = notAFile;
const U64 pr1 = pr0 & (pr0 << 9);
const U64 pr2 = pr1 & (pr1 << 18);
gen |= pr0 & (gen << 9);
gen |= pr1 & (gen << 18);
gen |= pr2 & (gen << 36);
return gen;
}

U64 soEaFill(U64 gen) {
const U64 pr0 = notAFile;
const U64 pr1 = pr0 & (pr0 >> 7);
const U64 pr2 = pr1 & (pr1 >> 14);
gen |= pr0 & (gen >> 7);
gen |= pr1 & (gen >> 14);
gen |= pr2 & (gen >> 28);
return gen;
}

U64 westFill(U64 gen) {
const U64 pr0 = notHFile;
const U64 pr1 = pr0 & (pr0 >> 1);
const U64 pr2 = pr1 & (pr1 >> 2);
gen |= pr0 & (gen >> 1);
gen |= pr1 & (gen >> 2);
gen |= pr2 & (gen >> 4);
return gen;
}

U64 soWeFill(U64 gen) {
const U64 pr0 = notHFile;
const U64 pr1 = pr0 & (pr0 >> 9);
const U64 pr2 = pr1 & (pr1 >> 18);
gen |= pr0 & (gen >> 9);
gen |= pr1 & (gen >> 18);
gen |= pr2 & (gen >> 36);
return gen;
}

U64 noWeFill(U64 gen) {
const U64 pr0 = notHFile;
const U64 pr1 = pr0 & (pr0 << 7);
const U64 pr2 = pr1 & (pr1 << 14);
gen |= pr0 & (gen << 7);
gen |= pr1 & (gen << 14);
gen |= pr2 & (gen << 28);
return gen;
}
</pre>
<span id="OccludedFill"></span>
==Occluded Fill==
Occluded fills include sliders, but exclude blockers <ref>[http://www.talkchess.com/forum/viewtopic.php?t=22038 Kogge Stone Algorithm mistake on chessprogramming Wiki] by [[Christopher Conkie]], [[CCC]], June 29, 2008</ref>.
<pre>
U64 soutOccl(U64 gen, U64 pro) {
gen |= pro & (gen >> 8);
pro &= (pro >> 8);
gen |= pro & (gen >> 16);
pro &= (pro >> 16);
gen |= pro & (gen >> 32);
return gen;
}

U64 nortOccl(U64 gen, U64 pro) {
gen |= pro & (gen << 8);
pro &= (pro << 8);
gen |= pro & (gen << 16);
pro &= (pro << 16);
gen |= pro & (gen << 32);
return gen;
}

U64 eastOccl(U64 gen, U64 pro) {
pro &= notAFile;
gen |= pro & (gen << 1);
pro &= (pro << 1);
gen |= pro & (gen << 2);
pro &= (pro << 2);
gen |= pro & (gen << 4);
return gen;
}

U64 noEaOccl(U64 gen, U64 pro) {
pro &= notAFile;
gen |= pro & (gen << 9);
pro &= (pro << 9);
gen |= pro & (gen << 18);
pro &= (pro << 18);
gen |= pro & (gen << 36);
return gen;
}

U64 soEaOccl(U64 gen, U64 pro) {
pro &= notAFile;
gen |= pro & (gen >> 7);
pro &= (pro >> 7);
gen |= pro & (gen >> 14);
pro &= (pro >> 14);
gen |= pro & (gen >> 28);
return gen;
}

U64 westOccl(U64 gen, U64 pro) {
pro &= notHFile;
gen |= pro & (gen >> 1);
pro &= (pro >> 1);
gen |= pro & (gen >> 2);
pro &= (pro >> 2);
gen |= pro & (gen >> 4);
return gen;
}

U64 soWeOccl(U64 gen, U64 pro) {
pro &= notHFile;
gen |= pro & (gen >> 9);
pro &= (pro >> 9);
gen |= pro & (gen >> 18);
pro &= (pro >> 18);
gen |= pro & (gen >> 36);
return gen;
}

U64 noWeOccl(U64 gen, U64 pro) {
pro &= notHFile;
gen |= pro & (gen << 7);
pro &= (pro << 7);
gen |= pro & (gen << 14);
pro &= (pro << 14);
gen |= pro & (gen << 28);
return gen;
}
</pre>

==Ray-wise Attacks==
Ray-wise attacks need the occluded fills [[General Setwise Operations#OneStepOnly|shift one]] further, considering wraps.
<pre>
U64 soutAttacks (U64 rooks, U64 empty) {return soutOne(soutOccl(rooks, empty));}
U64 nortAttacks (U64 rooks, U64 empty) {return nortOne(nortOccl(rooks, empty));}
U64 eastAttacks (U64 rooks, U64 empty) {return eastOne(eastOccl(rooks, empty));}
U64 noEaAttacks (U64 bishops, U64 empty) {return noEaOne(noEaOccl(bishops, empty));}
U64 soEaAttacks (U64 bishops, U64 empty) {return soEaOne(soEaOccl(bishops, empty));}
U64 westAttacks (U64 rooks, U64 empty) {return westOne(westOccl(rooks, empty));}
U64 soWeAttacks (U64 bishops, U64 empty) {return soWeOne(soWeOccl(bishops, empty));}
U64 noWeAttacks (U64 bishops, U64 empty) {return noWeOne(noWeOccl(bishops, empty));}
</pre>

<span id="GeneralizedRays"></span>
=Generalized Rays=
Since [[General Setwise Operations#Rotate|rotate 64]] works like a [[General Setwise Operations#GeneralizedShift|generalized shift]] with positive or negative shift amount, it might be applied to get pawn-attacks for both sides - or a Kogge-Stone fill with a direction parameter and small lookups for shift amount and wrap ands, instead of multiple code for eight directions. Of course generalized shift will be a bit slower due to lookups and using cl as the shift amount register.

<pre>
U64 occludedFill (U64 gen, U64 pro, int dir8)
{
int r = shift[dir8]; // {+-1,7,8,9}
pro &= avoidWrap[dir8];
gen |= pro & rotateLeft(gen, r);
pro &= rotateLeft(pro, r);
gen |= pro & rotateLeft(gen, 2*r);
pro &= rotateLeft(pro, 2*r);
gen |= pro & rotateLeft(gen, 4*r);
return gen;
}

U64 shiftOne (U64 b, int dir8)
{
int r = shift[dir8]; // {+-1,7,8,9}
return rotateLeft(b, r) & avoidWrap[dir8];
}

U64 slidingAttacks (U64 sliders, U64 empty, int dir8)
{
U64 fill = occludedFill(slider, empty, dir8)
return shiftOne(fill, dir8);
}

// positve left, negative right shifts
int shift[8] = {9, 1,-7,-8,-9,-1, 7, 8};

U64 avoidWrap[8] =
{
0xfefefefefefefe00,
0xfefefefefefefefe,
0x00fefefefefefefe,
0x00ffffffffffffff,
0x007f7f7f7f7f7f7f,
0x7f7f7f7f7f7f7f7f,
0x7f7f7f7f7f7f7f00,
0xffffffffffffff00,
};
</pre>

=See also=
* [[Steffan Westcott|Steffan Westcott's]] [[Parallel Prefix Algorithms#FurtherElaborationsOnKoggeStone|Elaboration on Kogge-Stone algorithm]]
* [[Parallel Prefix Algorithms#KoggeStoneAdder|Add/Sub versus Attacks]]
* [[Design Principles]] by [[Steffan Westcott]]
* [[Dumb7Fill]]
* [[Fill by Subtraction]]
* [[Fill Algorithms]]
* [[Pieces versus Directions]]
* [[SSE2#SSE2WrapperinCpp|SSE2-Wrapper]] in [[Cpp|C++]]

=Publications=
* [http://www.cse.nd.edu/%7Ekogge/ Peter M. Kogge], [[Mathematician#HSStone|Harold S. Stone]] ('''1973'''). ''A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence Equations''. IEEE Transactions on Computers, 1973, C-22, 783-791, [http://bwrc.eecs.berkeley.edu/Classes/ee225c/2000%20225c/Papers/KoggeStone.pdf pdf]

=Forum Posts=
* [https://www.stmintz.com/ccc/index.php?id=252289 Re: flood fill attack bitboards] by [[Steffan Westcott]], [[CCC]], September 15, 2002
* [http://www.talkchess.com/forum/viewtopic.php?t=22038 Kogge Stone Algorithm mistake on chessprogramming Wiki] by [[Christopher Conkie]], [[CCC]], June 29, 2008
* [http://www.talkchess.com/forum/viewtopic.php?start=0&t=25979&start=10 Re: Hyperbola Quiesscene: hardly any improvement] by [[Karlo Bala Jr.]], [[CCC]], January 14, 2009 » [[Hyperbola Quintessence]]
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[GPU]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref>
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[GPU]]

=External Links=
* [https://en.wikipedia.org/wiki/Kogge-Stone_adder Kogge-Stone adder from Wikipedia]
* [http://terra.fendrich.se/Terra%20Help-1763.htm Kogge-Stone algorithm] from [http://terra.fendrich.se/Terra%20Help-1647.htm Terra Help and Information - Theories] by [[Peter Fendrich]]
* [[Videos#Gong|Gong]] - [https://en.wikipedia.org/wiki/Shapeshifter_%28Gong_album%29 Shapeshifter] <ref>[https://en.wikipedia.org/wiki/Shapeshifting Shapeshifting from Wikipedia]</ref>, 1992, [https://en.wikipedia.org/wiki/YouTube YouTube] Video
: {{#evu:https://www.youtube.com/watch?v=0AZVy2coi6c|alignment=left|valignment=top}}

=References=
<references />

'''[[Sliding Piece Attacks|Up one Level]]'''

Navigation menu