Difference between revisions of "General Setwise Operations"

From Chessprogramming wiki
Jump to: navigation, search
(Created page with "'''Home * Board Representation * Bitboards * General Setwise Operations''' FILE:Upward by Vasily Kandinsky, 1929.jpg|border|right|thumb| |[[Arts#Kandi...")
 
 
(14 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
'''[[Main Page|Home]] * [[Board Representation]] * [[Bitboards]] * General Setwise Operations'''
 
'''[[Main Page|Home]] * [[Board Representation]] * [[Bitboards]] * General Setwise Operations'''
  
[[FILE:Upward by Vasily Kandinsky, 1929.jpg|border|right|thumb|
+
[[FILE:Upward by Vasily Kandinsky, 1929.jpg|border|right|thumb|[[:Category:Wassily Kandinsky|Wassily Kandinsky]] - Upward, 1929 <ref>[[:Category:Wassily Kandinsky|Wassily Kandinsky]] - Upward, 1929, [https://en.wikipedia.org/wiki/Peggy_Guggenheim_Collection Peggy Guggenheim Collection], [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia COmmons]</ref> ]]
|[[Arts#Kandinsky|Wassily Kandinsky]] - Upward, 1929 <ref>[[Arts#Kandinsky|Wassily Kandinsky]] - Upward, 1929, [https://en.wikipedia.org/wiki/Peggy_Guggenheim_Collection Peggy Guggenheim Collection], [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia COmmons]</ref> ]]
 
 
   
 
   
 
'''General Setwise Operations''',<br/>
 
'''General Setwise Operations''',<br/>
Line 19: Line 18:
 
if (a != b) -> both sets are not equal
 
if (a != b) -> both sets are not equal
 
</pre>
 
</pre>
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
[[x86]] has a cmp-instruction, which internally performs a subtraction to set its internal processor flags (carry, zero, overflow) accordantly, for instance the zero-flag if both sets are equal. Those flags are then used by conditional jump or move instructions.
 
[[x86]] has a cmp-instruction, which internally performs a subtraction to set its internal processor flags (carry, zero, overflow) accordantly, for instance the zero-flag if both sets are equal. Those flags are then used by conditional jump or move instructions.
 
<pre>
 
<pre>
Line 42: Line 41:
 
  set-wise        = {a1, b1, c1, d1, ....., e8, f8, g8, h8}
 
  set-wise        = {a1, b1, c1, d1, ....., e8, f8, g8, h8}
 
</pre>
 
</pre>
as [https://en.wikipedia.org/wiki/Venn_diagram Venn diagram]
+
as bitboard diagrams and [https://en.wikipedia.org/wiki/Venn_diagram Venn diagrams]  
{|
+
[[FILE:Venn0000.svg|border|right|thumb|240px|Empty]]  
|-
+
<pre>              
| [[FILE:Venn0000.svg|none|border|text-bottom|240px|link=https://en.wikipedia.org/wiki/Venn_diagram]]
+
       Empty                 Universe
| [[FILE:Venn1111.svg|none|border|text-bottom|240px|link=https://en.wikipedia.org/wiki/Venn_diagram]]
+
  . . . . . . . .       1 1 1 1 1 1 1 1
|}
+
  . . . . . . . .       1 1 1 1 1 1 1 1
or bitboard diagrams
+
  . . . . . . . .       1 1 1 1 1 1 1 1
<pre>
+
  . . . . . . . .       1 1 1 1 1 1 1 1
       Empty               Universe
+
  . . . . . . . .       1 1 1 1 1 1 1 1
  . . . . . . . .     1 1 1 1 1 1 1 1
+
  . . . . . . . .       1 1 1 1 1 1 1 1
  . . . . . . . .     1 1 1 1 1 1 1 1
+
  . . . . . . . .       1 1 1 1 1 1 1 1
  . . . . . . . .     1 1 1 1 1 1 1 1
+
  . . . . . . . .       1 1 1 1 1 1 1 1
  . . . . . . . .     1 1 1 1 1 1 1 1
 
  . . . . . . . .     1 1 1 1 1 1 1 1
 
  . . . . . . . .     1 1 1 1 1 1 1 1
 
  . . . . . . . .     1 1 1 1 1 1 1 1
 
  . . . . . . . .     1 1 1 1 1 1 1 1
 
 
</pre>
 
</pre>
 
+
[[FILE:Venn1111.svg|border|right|thumb|240px|Universe]]
 
Programmers often wonder to use -1 in [[C]], [[Cpp|C++]] as unsigned constant. See [[General Setwise Operations#TheTwosComplement|The Two's Complement]] - alternately one may use ~0 to define the universal set. Since in [[C]] or [[Cpp|C++]], decimal numbers without ULL suffix are treated as 32-bit integers, constants outside the integer range need some care concerning sign or zero extension. Const declarations or using the [[Bitboards#DefiningBitboards|C64 Macro]] is recommended:
 
Programmers often wonder to use -1 in [[C]], [[Cpp|C++]] as unsigned constant. See [[General Setwise Operations#TheTwosComplement|The Two's Complement]] - alternately one may use ~0 to define the universal set. Since in [[C]] or [[Cpp|C++]], decimal numbers without ULL suffix are treated as 32-bit integers, constants outside the integer range need some care concerning sign or zero extension. Const declarations or using the [[Bitboards#DefiningBitboards|C64 Macro]] is recommended:
 
<pre>
 
<pre>
Line 85: Line 79:
 
<span id="Intersection"></span>
 
<span id="Intersection"></span>
 
==Intersection==  
 
==Intersection==  
[[FILE:Venn0001.svg|none|border|text-bottom|240px|link=https://en.wikipedia.org/wiki/Venn_diagram]]
+
[[FILE:Venn0001.svg|border|right|thumb|240px|Intersection]]
  
 
In [https://en.wikipedia.org/wiki/Set_theory set theory] [https://en.wikipedia.org/wiki/Intersection_%28set_theory%29 intersection] is denoted as:
 
In [https://en.wikipedia.org/wiki/Set_theory set theory] [https://en.wikipedia.org/wiki/Intersection_%28set_theory%29 intersection] is denoted as:
  A &#8745; B
+
  <span style="font-size: 140%;">A &#8745; B</span>
  
 
In [https://en.wikipedia.org/wiki/Boolean_algebra_%28logic%29 boolean algebra] [https://en.wikipedia.org/wiki/Logical_conjunction conjunction] is denoted as:
 
In [https://en.wikipedia.org/wiki/Boolean_algebra_%28logic%29 boolean algebra] [https://en.wikipedia.org/wiki/Logical_conjunction conjunction] is denoted as:
  a &#8743; b
+
  <span style="font-size: 140%;">a &#8743; b</span>
  
 
Bitboard intersection or conjunction is performed by [https://en.wikipedia.org/wiki/Bitwise_operation#AND bitwise and] (binary operator & in [[C]], [[Cpp|C++]] or [[Java]], and the keyword "AND" in [[Pascal]]).
 
Bitboard intersection or conjunction is performed by [https://en.wikipedia.org/wiki/Bitwise_operation#AND bitwise and] (binary operator & in [[C]], [[Cpp|C++]] or [[Java]], and the keyword "AND" in [[Pascal]]).
 
  intersection = a & b
 
  intersection = a & b
  
'''Truth Table'''
+
'''Truth Table'''<br/>
 
Truth table of [https://en.wikipedia.org/wiki/AND_gate and] for one bit, for a '1' result both inputs need to be '1':
 
Truth table of [https://en.wikipedia.org/wiki/AND_gate and] for one bit, for a '1' result both inputs need to be '1':
 
{| class="wikitable"
 
{| class="wikitable"
Line 122: Line 116:
 
Conjunction acts like a bitwise minimum, min(a, b) or as bitwise multiplication (a * b).
 
Conjunction acts like a bitwise minimum, min(a, b) or as bitwise multiplication (a * b).
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
[[x86]] has general purpose instruction as well as [[SIMD and SWAR Techniques|SIMD-instructions]] for bitwise and:
 
[[x86]] has general purpose instruction as well as [[SIMD and SWAR Techniques|SIMD-instructions]] for bitwise and:
 
<pre>
 
<pre>
Line 132: Line 126:
 
vpand ymm0, ymm1, ymm2 ; AVX2 ymm0 = ymm1 & ymm2
 
vpand ymm0, ymm1, ymm2 ; AVX2 ymm0 = ymm1 & ymm2
 
</pre>
 
</pre>
[[SSE2]]-intrinsic [http://msdn2.microsoft.com/en-us/library/6d1txsa8%28VS.80%29.aspx _mm_and_si128].
+
[[SSE2]]-intrinsic [[SSE2#_mm_and_si128|_mm_and_si128]]<br/>
[[AVX2]]-intrinsic [https://software.intel.com/en-us/node/695037 _mm256_and_si256]
+
[[AVX2]]-intrinsic [[AVX2#_mm256_and_si256|_mm256_and_si256]]<br/>
 
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
 
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
  
'''Idempotent'''  
+
'''Idempotent''' <br/>
 
Conjunction is [https://en.wikipedia.org/wiki/Idempotence idempotent].
 
Conjunction is [https://en.wikipedia.org/wiki/Idempotence idempotent].
 
  a & a == a
 
  a & a == a
'''Commutative'''  
+
'''Commutative''' <br/>
 
Conjunction is [https://en.wikipedia.org/wiki/Commutative commutative]
 
Conjunction is [https://en.wikipedia.org/wiki/Commutative commutative]
 
  a & b == b & a
 
  a & b == b & a
'''Associative'''  
+
'''Associative'''<br/>
 
Conjunction is [https://en.wikipedia.org/wiki/Associative associative].
 
Conjunction is [https://en.wikipedia.org/wiki/Associative associative].
 
  (a & b) & c == a & (b & c)
 
  (a & b) & c == a & (b & c)
'''Subset'''  
+
'''Subset''' <br/>
 
The intersection of two sets is [https://en.wikipedia.org/wiki/Subset subset] of both.
 
The intersection of two sets is [https://en.wikipedia.org/wiki/Subset subset] of both.
  
Line 165: Line 159:
 
</pre>
 
</pre>
 
<span id="DisjointSets"></span>
 
<span id="DisjointSets"></span>
'''Disjoint Sets'''  
+
'''Disjoint Sets''' <br/>
 
To test whether two sets are [https://en.wikipedia.org/wiki/Disjoint disjoint] - that is their intersection is empty - compiler emit the [[x86]] test-instruction instead of and. That saves the content of a register, if the intersection is not otherwise needed:
 
To test whether two sets are [https://en.wikipedia.org/wiki/Disjoint disjoint] - that is their intersection is empty - compiler emit the [[x86]] test-instruction instead of and. That saves the content of a register, if the intersection is not otherwise needed:
 
<pre>
 
<pre>
Line 173: Line 167:
 
<span id="Union"></span>
 
<span id="Union"></span>
 
==Union==  
 
==Union==  
[[FILE:Venn0111.svg|none|border|text-bottom|240px|link=https://en.wikipedia.org/wiki/Venn_diagram]]
+
[[FILE:Venn0111.svg|border|right|thumb|240px|Union]]
 
 
 
In [https://en.wikipedia.org/wiki/Set_theory set theory] [https://en.wikipedia.org/wiki/Union_%28set_theory%29 union] is denoted as:
 
In [https://en.wikipedia.org/wiki/Set_theory set theory] [https://en.wikipedia.org/wiki/Union_%28set_theory%29 union] is denoted as:
  A &#8746; B
+
  <span style="font-size: 140%;">A &#8746; B</span>
 
In [https://en.wikipedia.org/wiki/Boolean_algebra_%28logic%29 boolean algebra] [https://en.wikipedia.org/wiki/Logical_disjunction disjunction] is denoted as:
 
In [https://en.wikipedia.org/wiki/Boolean_algebra_%28logic%29 boolean algebra] [https://en.wikipedia.org/wiki/Logical_disjunction disjunction] is denoted as:
  a &#8744; b
+
  <span style="font-size: 140%;">a &#8744; b</span>
 
The union or disjunction of two bitboards is applied by [https://en.wikipedia.org/wiki/Bitwise_operation#OR bitwise or] (binary operator | in [[C]], [[Cpp|C++]] or [[Java]], or the keyword "OR" in [[Pascal]]). The union is superset of the [[General Setwise Operations#Intersection|intersection]], while the [[General Setwise Operations#Intersection|intersection]] is [https://en.wikipedia.org/wiki/Subset subset] of the union.
 
The union or disjunction of two bitboards is applied by [https://en.wikipedia.org/wiki/Bitwise_operation#OR bitwise or] (binary operator | in [[C]], [[Cpp|C++]] or [[Java]], or the keyword "OR" in [[Pascal]]). The union is superset of the [[General Setwise Operations#Intersection|intersection]], while the [[General Setwise Operations#Intersection|intersection]] is [https://en.wikipedia.org/wiki/Subset subset] of the union.
 
  union = a | b
 
  union = a | b
'''Truth Table'''  
+
'''Truth Table''' <br/>
 
Truth table of [https://en.wikipedia.org/wiki/OR_gate or] for one bit, one set input bits is sufficient to set the output:
 
Truth table of [https://en.wikipedia.org/wiki/OR_gate or] for one bit, one set input bits is sufficient to set the output:
 
{| class="wikitable"
 
{| class="wikitable"
Line 207: Line 200:
 
Disjunction acts like bitwise maximum, max(a, b) or as addition with saturation, min(a + b, 1). It can also be interpreted as sum minus product, a + b - a*b, with possible temporary overflow of one binary digit to two - or with modulo 2 arithmetic.
 
Disjunction acts like bitwise maximum, max(a, b) or as addition with saturation, min(a + b, 1). It can also be interpreted as sum minus product, a + b - a*b, with possible temporary overflow of one binary digit to two - or with modulo 2 arithmetic.
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
[[x86]] has general purpose instruction as well as [[SIMD and SWAR Techniques|SIMD-instructions]] for bitwise or:
 
[[x86]] has general purpose instruction as well as [[SIMD and SWAR Techniques|SIMD-instructions]] for bitwise or:
 
<pre>
 
<pre>
Line 216: Line 209:
 
vpor ymm0, ymm1, ymm2 ; AVX2 ymm0  = ymm1 | ymm2
 
vpor ymm0, ymm1, ymm2 ; AVX2 ymm0  = ymm1 | ymm2
 
</pre>
 
</pre>
[[SSE2]]-intrinsic [http://msdn2.microsoft.com/en-us/library/ew8ty0db%28VS.80%29.aspx _mm_or_si128].
+
[[SSE2]]-intrinsic [[SSE2#_mm_or_si128|_mm_or_si128]]<br/>
[[AVX2]]-intrinsic [https://software.intel.com/en-us/node/523912 _mm256_or_si256]
+
[[AVX2]]-intrinsic [[AVX2#_mm256_or_si256|_mm256_or_si256]]<br/>
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
+
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]<br/>
  
'''Idempotent'''  
+
'''Idempotent''' <br/>
 
Disjunction is [https://en.wikipedia.org/wiki/Idempotence idempotent].
 
Disjunction is [https://en.wikipedia.org/wiki/Idempotence idempotent].
 
  a | a == a
 
  a | a == a
'''Commutative'''  
+
'''Commutative''' <br/>
 
Disjunction is [https://en.wikipedia.org/wiki/Commutative commutative]
 
Disjunction is [https://en.wikipedia.org/wiki/Commutative commutative]
 
  a | b == b | a
 
  a | b == b | a
'''Associative'''  
+
'''Associative''' <br/>
 
Disjunction is [https://en.wikipedia.org/wiki/Associative associative].
 
Disjunction is [https://en.wikipedia.org/wiki/Associative associative].
 
  (a | b) | c == a | (b | c)
 
  (a | b) | c == a | (b | c)
 
<span id="DistributiveAndOr"></span>
 
<span id="DistributiveAndOr"></span>
'''Distributive'''  
+
'''Distributive''' <br/>
 
Disjunction is [https://en.wikipedia.org/wiki/Distributivity distributive] over [[General Setwise Operations#Intersection|conjunction]] and vice versa:
 
Disjunction is [https://en.wikipedia.org/wiki/Distributivity distributive] over [[General Setwise Operations#Intersection|conjunction]] and vice versa:
 
  x | (y & z) == (x | y) & (x | z)
 
  x | (y & z) == (x | y) & (x | z)
 
  x & (y | z) == (x & y) | (x & z)
 
  x & (y | z) == (x & y) | (x & z)
'''Superset'''  
+
'''Superset''' <br/>
 
The union of two sets is superset of both. For instance the union of all white and black pieces are the set of all occupied squares:
 
The union of two sets is superset of both. For instance the union of all white and black pieces are the set of all occupied squares:
 
<pre>
 
<pre>
Line 250: Line 243:
 
<span id="ComplementSet"></span>
 
<span id="ComplementSet"></span>
 
==Complement Set==  
 
==Complement Set==  
[[FILE:Venn1010.svg|none|border|text-bottom|240px|link=https://en.wikipedia.org/wiki/Venn_diagram]]
+
[[FILE:Venn1010.svg|border|right|thumb|240px|Complement]]
 
In [https://en.wikipedia.org/wiki/Set_theory set theory] [https://en.wikipedia.org/wiki/Complement_%28set_theory%29 complement set] is denoted as:
 
In [https://en.wikipedia.org/wiki/Set_theory set theory] [https://en.wikipedia.org/wiki/Complement_%28set_theory%29 complement set] is denoted as:
  A<span style="vertical-align: super;">&#8705;</span>
+
  <span style="font-size: 140%;">A</span><span style="vertical-align: super;">&#8705;</span>
 
In [https://en.wikipedia.org/wiki/Boolean_algebra_%28logic%29 boolean algebra] [https://en.wikipedia.org/wiki/Negation negation] is denoted as:
 
In [https://en.wikipedia.org/wiki/Boolean_algebra_%28logic%29 boolean algebra] [https://en.wikipedia.org/wiki/Negation negation] is denoted as:
  &#172;a
+
  <span style="font-size: 140%;">&#172;a</span>
 
The complement set (absolute complement set), negation or [https://en.wikipedia.org/wiki/Ones%27_complement#Ones.27_complement ones' complement] has it's equivalent in [https://en.wikipedia.org/wiki/Bitwise_operation#NOT bitwise not] (unary operator '~' in [[C]], [[Cpp|C++]] or [[Java]], or the keyword "NOT" in [[Pascal]]).
 
The complement set (absolute complement set), negation or [https://en.wikipedia.org/wiki/Ones%27_complement#Ones.27_complement ones' complement] has it's equivalent in [https://en.wikipedia.org/wiki/Bitwise_operation#NOT bitwise not] (unary operator '~' in [[C]], [[Cpp|C++]] or [[Java]], or the keyword "NOT" in [[Pascal]]).
  
'''Truth Table'''  
+
'''Truth Table''' <br/>
 
Truth table of [https://en.wikipedia.org/wiki/NOT_gate not] for one bit:
 
Truth table of [https://en.wikipedia.org/wiki/NOT_gate not] for one bit:
 
{| class="wikitable"
 
{| class="wikitable"
Line 272: Line 265:
 
The complement can be interpreted as bitwise subtraction (1 - a).
 
The complement can be interpreted as bitwise subtraction (1 - a).
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
Available as general purpose instruction.
 
Available as general purpose instruction.
<pre>
+
not  rax ; rax = ~rax
not  rax ; rax = ~rax
 
</pre>
 
 
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
 
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
  
'''Empty Squares'''  
+
'''Empty Squares'''<br/>
 
The set of empty squares for instance is the complement-set of all occupied squares and vice versa:
 
The set of empty squares for instance is the complement-set of all occupied squares and vice versa:
 
<pre>
 
<pre>
Line 299: Line 290:
 
  !-1 == 0
 
  !-1 == 0
 
<span id="Complementlaws"></span>
 
<span id="Complementlaws"></span>
'''Complement laws'''  
+
'''Complement laws'''<br/>
 
* The [[General Setwise Operations#Union|union]] of a set with it's complement is the universal set -1.
 
* The [[General Setwise Operations#Union|union]] of a set with it's complement is the universal set -1.
 
* The [[General Setwise Operations#Intersection|intersection]] of a set with it's complement is the empty set 0 - both are [https://en.wikipedia.org/wiki/Disjoint disjoint].
 
* The [[General Setwise Operations#Intersection|intersection]] of a set with it's complement is the empty set 0 - both are [https://en.wikipedia.org/wiki/Disjoint disjoint].
Line 308: Line 299:
 
  ~(-1)  ==  0
 
  ~(-1)  ==  0
 
<span id="DeMorganslaws"></span>
 
<span id="DeMorganslaws"></span>
'''De Morgan's laws'''  
+
'''De Morgan's laws''' <br/>
 
* Complement of [[General Setwise Operations#Union|union]] ([https://en.wikipedia.org/wiki/NOR_gate NOR] ) is the [[General Setwise Operations#Intersection|intersection]] of the complements <ref>[[Mathematician#ADeMorgan|Augustus De Morgan]] ('''1860'''). ''[http://books.google.com/books?id=Od3jgF5rZtgC Syllabus of a Proposed System of Logic]''. Walton & Malbery</ref>.
 
* Complement of [[General Setwise Operations#Union|union]] ([https://en.wikipedia.org/wiki/NOR_gate NOR] ) is the [[General Setwise Operations#Intersection|intersection]] of the complements <ref>[[Mathematician#ADeMorgan|Augustus De Morgan]] ('''1860'''). ''[http://books.google.com/books?id=Od3jgF5rZtgC Syllabus of a Proposed System of Logic]''. Walton & Malbery</ref>.
 
* Complement of [[General Setwise Operations#Intersection|intersection]] ([https://en.wikipedia.org/wiki/NAND_logic NAND] or [https://en.wikipedia.org/wiki/Sheffer_stroke Sheffer stroke] ) is the [[General Setwise Operations#Union|union]] of the complements.
 
* Complement of [[General Setwise Operations#Intersection|intersection]] ([https://en.wikipedia.org/wiki/NAND_logic NAND] or [https://en.wikipedia.org/wiki/Sheffer_stroke Sheffer stroke] ) is the [[General Setwise Operations#Union|union]] of the complements.
Line 316: Line 307:
 
<span id="RelativeComplement"></span>
 
<span id="RelativeComplement"></span>
 
==Relative Complement==
 
==Relative Complement==
[[FILE:Venn0010.svg|none|border|text-bottom|240px|link=https://en.wikipedia.org/wiki/Venn_diagram]]
+
[[FILE:Venn0010.svg|border|right|thumb|240px|Relative Complement]]
 
In [https://en.wikipedia.org/wiki/Set_theory set theory] [https://en.wikipedia.org/wiki/Complement_%28set_theory%29#Relative_complement relative complement] is denoted as:
 
In [https://en.wikipedia.org/wiki/Set_theory set theory] [https://en.wikipedia.org/wiki/Complement_%28set_theory%29#Relative_complement relative complement] is denoted as:
  A<span style="vertical-align: super;">&#8705;</span>&B = B\A
+
  <span style="font-size: 140%;">A</span><span style="vertical-align: super;">&#8705;</span><span style="font-size: 140%;"> &#8745; B = B \ A</span>
 
The relative complement is the [[General Setwise Operations#ComplementSet|absolute complement]] restricted to some other set. The relative complement of 'a' inside 'b' is also known as the '''set theoretic difference''' of 'b' minus 'a'. It is the set of all elements that belong to 'b' but '''not''' to 'a'. Also called 'b' without 'a'. It is the [[General Setwise Operations#Intersection|intersection]] of 'b' with the absolute complement of 'a'.
 
The relative complement is the [[General Setwise Operations#ComplementSet|absolute complement]] restricted to some other set. The relative complement of 'a' inside 'b' is also known as the '''set theoretic difference''' of 'b' minus 'a'. It is the set of all elements that belong to 'b' but '''not''' to 'a'. Also called 'b' without 'a'. It is the [[General Setwise Operations#Intersection|intersection]] of 'b' with the absolute complement of 'a'.
 
  not_a_in_b  = ~a &  b
 
  not_a_in_b  = ~a &  b
 
  b_without_a =  b & ~a
 
  b_without_a =  b & ~a
'''Truth Table'''  
+
'''Truth Table'''<br/>
 
Truth table of relative complement for one bit:
 
Truth table of relative complement for one bit:
 
{| class="wikitable"
 
{| class="wikitable"
Line 348: Line 339:
 
The relative complement of 'a' in 'b' may be interpreted as a bitwise (a < b) relation.
 
The relative complement of 'a' in 'b' may be interpreted as a bitwise (a < b) relation.
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
[[x86]] don't has an own general purpose instruction for relative complement, but [[x86-64]] expansion [[BMI1]], and [[SIMD and SWAR Techniques|SIMD-instructions]]:
 
[[x86]] don't has an own general purpose instruction for relative complement, but [[x86-64]] expansion [[BMI1]], and [[SIMD and SWAR Techniques|SIMD-instructions]]:
 
<pre>
 
<pre>
Line 357: Line 348:
 
vpandn ymm0, ymm1, ymm2 ; AVX  xmm0 = ~xmm1 & xmm2
 
vpandn ymm0, ymm1, ymm2 ; AVX  xmm0 = ~xmm1 & xmm2
 
</pre>
 
</pre>
[[SSE2]]-intrinsic [http://msdn2.microsoft.com/en-us/library/1beaceh8%28VS.80%29.aspx _mm_andnot_si128].
+
[[SSE2]]-intrinsic [[SSE2#_mm_andnot_si128|_mm_andnot_si128]]<br/>
[[AVX2]]-intrinsic [https://software.intel.com/en-us/node/523911 _mm256_andnot_si256]
+
[[AVX2]]-intrinsic [[AVX2#_mm256_andnot_si256|_mm256_andnot_si256]]<br/>
 
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
 
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
  
'''Super minus Sub'''  
+
'''Super minus Sub'''<br/> 
 
In presumption of [[General Setwise Operations#XorWithout|subtraction or exclusive or]] there are alternatives to calculate the relative complement - superset minus subset. We can take either the union without the complementing set - or the other set without the intersection
 
In presumption of [[General Setwise Operations#XorWithout|subtraction or exclusive or]] there are alternatives to calculate the relative complement - superset minus subset. We can take either the union without the complementing set - or the other set without the intersection
 
  ~a & b == ( a | b ) - a
 
  ~a & b == ( a | b ) - a
Line 367: Line 358:
  
 
==Implication==  
 
==Implication==  
[[FILE:Venn1011.svg|none|border|text-bottom|240px|link=https://en.wikipedia.org/wiki/Venn_diagram]]
+
[[FILE:Venn1011.svg|border|right|thumb|240px|Implication]]
 
Logical Implication or [https://en.wikipedia.org/wiki/Entailment Entailment] is denoted as:
 
Logical Implication or [https://en.wikipedia.org/wiki/Entailment Entailment] is denoted as:
  A &#8658; B
+
  <span style="font-size: 140%;">A &#8658; B</span>
 
The boolean [https://en.wikipedia.org/wiki/Material_conditional Material conditional] is denoted as:
 
The boolean [https://en.wikipedia.org/wiki/Material_conditional Material conditional] is denoted as:
  a &#8594; b
+
  <span style="font-size: 140%;">a &#8594; b</span>
 
Logical Implication or the boolean Material conditional 'a' implies 'b' (if 'a' then 'b') is an derived boolean operation, implemented as [[General Setwise Operations#Union|union]] of the [[General Setwise Operations#ComplementSet|absolute complement]] of 'a' with 'b':
 
Logical Implication or the boolean Material conditional 'a' implies 'b' (if 'a' then 'b') is an derived boolean operation, implemented as [[General Setwise Operations#Union|union]] of the [[General Setwise Operations#ComplementSet|absolute complement]] of 'a' with 'b':
 
  a_implies_b ==  ~a | b
 
  a_implies_b ==  ~a | b
Line 400: Line 391:
 
Implication may be interpreted as a bitwise (a <= b) relation.
 
Implication may be interpreted as a bitwise (a <= b) relation.
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
+
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]<br/>
 
<span id="ExclusiveOr"></span>
 
<span id="ExclusiveOr"></span>
 
==Exclusive Or==  
 
==Exclusive Or==  
[[FILE:Venn0110.svg|none|border|text-bottom|240px|link=https://en.wikipedia.org/wiki/Venn_diagram]]
+
[[FILE:Venn0110.svg|border|right|thumb|240px|Exclusive Or]]
 
In [https://en.wikipedia.org/wiki/Set_theory set theory] [https://en.wikipedia.org/wiki/Symmetric_difference symmetric difference] is denoted as:
 
In [https://en.wikipedia.org/wiki/Set_theory set theory] [https://en.wikipedia.org/wiki/Symmetric_difference symmetric difference] is denoted as:
  A &#8710; B
+
  <span style="font-size: 140%;">A &#8710; B</span>
 
In [https://en.wikipedia.org/wiki/Boolean_algebra_%28logic%29 boolean algebra] [https://en.wikipedia.org/wiki/Exclusive_or Exclusive or] is denoted as:
 
In [https://en.wikipedia.org/wiki/Boolean_algebra_%28logic%29 boolean algebra] [https://en.wikipedia.org/wiki/Exclusive_or Exclusive or] is denoted as:
  a &#8853; b
+
  <span style="font-size: 140%;">a &#8853; b</span>
Exclusive or, also exclusive disjunction (xor, binary operator '^' in [[C]], [[Cpp|C++]] or [[Java]], or the keyword "XOR" in [[Pascal]]), also called symmetric difference, leaves all elements which are exclusively set in one of the two sets. Xor is really a multi purpose operation with a lot of applications not only bitboards of course.
+
Exclusive or, also exclusive disjunction (xor, binary operator '^' in [[C]], [[Cpp|C++]] or [[Java]], or the keyword "XOR" in [[Pascal]]),
 +
xor = a ^ b
 +
also called symmetric difference, leaves all elements which are exclusively set in one of the two sets. Xor is really a multi purpose operation with a lot of applications not only bitboards of course.
 
<pre>
 
<pre>
 
1 . . . . . . 1    . . . . . . . .    1 . . . . . . 1
 
1 . . . . . . 1    . . . . . . . .    1 . . . . . . 1
Line 420: Line 413:
 
1 . . . . . . 1    . . . . . . . .    1 . . . . . . 1
 
1 . . . . . . 1    . . . . . . . .    1 . . . . . . 1
 
</pre>
 
</pre>
'''Truth Table'''  
+
'''Truth Table''' <br/>
 
Truth table of [https://en.wikipedia.org/wiki/XOR_gate exclusive or] for one bit:
 
Truth table of [https://en.wikipedia.org/wiki/XOR_gate exclusive or] for one bit:
 
{| class="wikitable"
 
{| class="wikitable"
Line 448: Line 441:
 
It also acts like a bitwise subtraction (modulo 2).
 
It also acts like a bitwise subtraction (modulo 2).
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
[[x86]] has general purpose instruction as well as [[SIMD and SWAR Techniques|SIMD-instructions]] for bitwise exclusive or:
 
[[x86]] has general purpose instruction as well as [[SIMD and SWAR Techniques|SIMD-instructions]] for bitwise exclusive or:
 
<pre>
 
<pre>
Line 457: Line 450:
 
vpxor ymm0, ymm1, ymm2 ; AVX2 ymm0  = ymm1 ^ ymm2
 
vpxor ymm0, ymm1, ymm2 ; AVX2 ymm0  = ymm1 ^ ymm2
 
</pre>
 
</pre>
[[SSE2]]-intrinsic [http://msdn2.microsoft.com/en-us/library/fzt08www%28VS.80%29.aspx _mm_xor_si128].
+
[[SSE2]]-intrinsic [[SSE2#_mm_xor_si128|_mm_xor_si128]]<br/>
[[AVX2]]-intrinsic [https://software.intel.com/en-us/node/683570 _mm256_xor_si256]
+
[[AVX2]]-intrinsic [[AVX2#_mm256_xor_si256|_mm256_xor_si256]]<br/>
 
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
 
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
  
'''Commutative'''  
+
'''Commutative''' <br/>
 
Exclusive disjunction is [https://en.wikipedia.org/wiki/Commutative commutative]
 
Exclusive disjunction is [https://en.wikipedia.org/wiki/Commutative commutative]
 
  a ^ b == b ^ a
 
  a ^ b == b ^ a
'''Associative'''  
+
'''Associative'''<br/>
 
Xor is [https://en.wikipedia.org/wiki/Associative associative] as well.
 
Xor is [https://en.wikipedia.org/wiki/Associative associative] as well.
 
  (a ^ b) ^ c == a ^ (b ^ c)
 
  (a ^ b) ^ c == a ^ (b ^ c)
'''Distributive'''  
+
 
 +
'''Distributive''' <br/>
 
[[General Setwise Operations#Intersection|Conjunction]] is [https://en.wikipedia.org/wiki/Distributivity distributive] over exclusive disjunction - but '''not''' vice versa, since conjunction acts like multiplication, while xor acts as addition in the [https://en.wikipedia.org/wiki/Finite_field Galois field] [https://en.wikipedia.org/wiki/GF%282%29 GF(2)] :
 
[[General Setwise Operations#Intersection|Conjunction]] is [https://en.wikipedia.org/wiki/Distributivity distributive] over exclusive disjunction - but '''not''' vice versa, since conjunction acts like multiplication, while xor acts as addition in the [https://en.wikipedia.org/wiki/Finite_field Galois field] [https://en.wikipedia.org/wiki/GF%282%29 GF(2)] :
 
  x & (y ^ z) == (x & y) ^ (x & z)
 
  x & (y ^ z) == (x & y) ^ (x & z)
'''Own Inverse'''  
+
 
 +
'''Own Inverse''' <br/>
 
If applied two (even) times with the same operand, xor restores the original result. It is own inverse or an [https://en.wikipedia.org/wiki/Involution involution] .
 
If applied two (even) times with the same operand, xor restores the original result. It is own inverse or an [https://en.wikipedia.org/wiki/Involution involution] .
'''Subset'''  
+
 
 +
'''Subset'''<br/>
 
If one operand is subset of the other, xor (or subtraction) implements the [[General Setwise Operations#RelativeComplement|relative complement]].
 
If one operand is subset of the other, xor (or subtraction) implements the [[General Setwise Operations#RelativeComplement|relative complement]].
 
<pre>
 
<pre>
Line 485: Line 481:
 
. . . . . . . .    . . . . . . . .    . . . . . . . .
 
. . . . . . . .    . . . . . . . .    . . . . . . . .
 
</pre>
 
</pre>
'''Subtraction'''  
+
'''Subtraction'''<br/>
 
While commutative, xor is a better replacement for subtracting from power of two minus one values, such as 63.
 
While commutative, xor is a better replacement for subtracting from power of two minus one values, such as 63.
  (2**n - 1) - a == a ^ (2**n - 1) with a subset of 2**n - 1
+
  (2<span style="vertical-align: super;">n</span> - 1) - a == a ^ (2<span style="vertical-align: super;">n</span> - 1) with a subset of 2<span style="vertical-align: super;">n</span> - 1
 
This is because it usually safes one [[x86]] load instruction and an additional register, but uses opcodes with immediate operands - for instance:
 
This is because it usually safes one [[x86]] load instruction and an additional register, but uses opcodes with immediate operands - for instance:
 
<pre>
 
<pre>
Line 499: Line 495:
 
-1 - a == a ^ -1
 
-1 - a == a ^ -1
 
</pre>
 
</pre>
'''Or without And'''  
+
'''Or without And''' <br/>
 
Xor is the same as a [[General Setwise Operations#Union|union]] without the [[General Setwise Operations#Intersection|intersection]] - all the bits different, 0,1 or 1,0. Since the [[General Setwise Operations#Intersection|intersection]] is subset of the [[General Setwise Operations#Union|union]], xor or subtraction can replace the "without" operation & ~:
 
Xor is the same as a [[General Setwise Operations#Union|union]] without the [[General Setwise Operations#Intersection|intersection]] - all the bits different, 0,1 or 1,0. Since the [[General Setwise Operations#Intersection|intersection]] is subset of the [[General Setwise Operations#Union|union]], xor or subtraction can replace the "without" operation & ~:
 
  a ^ b == (a | b) &~(a & b)
 
  a ^ b == (a | b) &~(a & b)
 
  a ^ b == (a | b) ^ (a & b)
 
  a ^ b == (a | b) ^ (a & b)
 
  a ^ b == (a | b) - (a & b)
 
  a ^ b == (a | b) - (a & b)
'''Disjoint Sets'''  
+
 
 +
'''Disjoint Sets''' <br/>
 
The symmetric difference of disjoint sets is equal to the [[General Setwise Operations#Union|union]] or arithmetical addition. Since [[General Setwise Operations#Intersection|intersection]] and symmetric difference are disjoint, the union might defined that way:
 
The symmetric difference of disjoint sets is equal to the [[General Setwise Operations#Union|union]] or arithmetical addition. Since [[General Setwise Operations#Intersection|intersection]] and symmetric difference are disjoint, the union might defined that way:
 
  a | b = ( a & b ) ^ ( a ^ b )
 
  a | b = ( a & b ) ^ ( a ^ b )
Line 511: Line 508:
 
  a | b = ( a & b ) + ( a ^ b )
 
  a | b = ( a & b ) + ( a ^ b )
 
Assume we have distinct attack sets of pawns in left or right [[Direction|direction]]. The set of all squares attacked by two pawns is the intersection, the set exclusively attacked by one pawn (either right or left) is the xor-sum, while all squares attacked by any pawn is the union, see [[Pawn Attacks (Bitboards)#PawnAttacks|pawn attacks]].
 
Assume we have distinct attack sets of pawns in left or right [[Direction|direction]]. The set of all squares attacked by two pawns is the intersection, the set exclusively attacked by one pawn (either right or left) is the xor-sum, while all squares attacked by any pawn is the union, see [[Pawn Attacks (Bitboards)#PawnAttacks|pawn attacks]].
'''Union of Complements'''  
+
 
 +
'''Union of Complements'''<br/>
 
The symmetric difference is equivalent to the [[General Setwise Operations#Union|union]] of both [[General Setwise Operations#RelativeComplement|relative complements]]. Since both [[General Setwise Operations#RelativeComplement|relative complements]] are [https://en.wikipedia.org/wiki/Disjoint disjoint], bitwise or or add can replaced by xor itself:
 
The symmetric difference is equivalent to the [[General Setwise Operations#Union|union]] of both [[General Setwise Operations#RelativeComplement|relative complements]]. Since both [[General Setwise Operations#RelativeComplement|relative complements]] are [https://en.wikipedia.org/wiki/Disjoint disjoint], bitwise or or add can replaced by xor itself:
 
  a ^ b == (a & ~b) | (b & ~a)
 
  a ^ b == (a & ~b) | (b & ~a)
 
  a ^ b == (a & ~b) ^ (b & ~a)
 
  a ^ b == (a & ~b) ^ (b & ~a)
 
  a ^ b == (a & ~b) + (b & ~a)
 
  a ^ b == (a & ~b) + (b & ~a)
'''Toggle'''  
+
 
 +
'''Toggle''' <br/>
 
Xor can be used to toggle or flip bits by a mask.
 
Xor can be used to toggle or flip bits by a mask.
 
  x ^= mask;
 
  x ^= mask;
'''Complement'''  
+
 
 +
'''Complement''' <br/>
 
xor with the universal set -1 flips each bit and results in the ones' complement.
 
xor with the universal set -1 flips each bit and results in the ones' complement.
 
  a ^ -1 == ~a
 
  a ^ -1 == ~a
 
<span id="XorWithout"></span>
 
<span id="XorWithout"></span>
'''Without'''  
+
'''Without''' <br/>
 
Due to distributive law and since symmetric difference of set and subset is the relative complement of subset in set, there are some equivalent ways to calculate the [[General Setwise Operations#RelativeComplement|relative complement]] by xor. Based on surrounding expressions or whether subexpressions such as union, intersection or symmetric difference may be reused one may prefer the one or other alternative.
 
Due to distributive law and since symmetric difference of set and subset is the relative complement of subset in set, there are some equivalent ways to calculate the [[General Setwise Operations#RelativeComplement|relative complement]] by xor. Based on surrounding expressions or whether subexpressions such as union, intersection or symmetric difference may be reused one may prefer the one or other alternative.
 
  a & ~b == a & (-1 ^ b )
 
  a & ~b == a & (-1 ^ b )
Line 529: Line 529:
 
  a & ~b == a ^ ( a & b ) == a - ( a & b )
 
  a & ~b == a ^ ( a & b ) == a - ( a & b )
 
  a & ~b == b ^ ( a | b ) == ( a | b ) - b
 
  a & ~b == b ^ ( a | b ) == ( a | b ) - b
Also note that
+
Also note that<br/>
 
  a & a == a & -1
 
  a & a == a & -1
 
<span id="XorClear"></span>
 
<span id="XorClear"></span>
'''Clear'''  
+
'''Clear'''<br/>
 
Since 'a' xor 'a' is zero, it is the shorter opcode to clear a register, since it takes no immediate operand. Applied by optimizing compilers. Same is true for subtraction by the way.
 
Since 'a' xor 'a' is zero, it is the shorter opcode to clear a register, since it takes no immediate operand. Applied by optimizing compilers. Same is true for subtraction by the way.
 
<pre>
 
<pre>
Line 539: Line 539:
 
pxor xmm0, xmm0 ; SSE2 - 128-bit xmm-register
 
pxor xmm0, xmm0 ; SSE2 - 128-bit xmm-register
 
</pre>
 
</pre>
'''Xor Swap'''  
+
'''Xor Swap'''<br/>
 
Three xors on the same registers swap their content: (Note: this only works when a and b are stored on distinct memory adresses!)
 
Three xors on the same registers swap their content: (Note: this only works when a and b are stored on distinct memory adresses!)
 
<pre>
 
<pre>
Line 554: Line 554:
 
... 'a' becomes 'b', but only a part of 'b', where mask is one, becomes 'a'.
 
... 'a' becomes 'b', but only a part of 'b', where mask is one, becomes 'a'.
 
<span id="BitsFrom2SourcesByMask"></span>
 
<span id="BitsFrom2SourcesByMask"></span>
'''Bits from two Sources'''  
+
'''Bits from two Sources'''<br/>
 
Getting arbitrary, [https://en.wikipedia.org/wiki/Disjoint disjoint] bits from two sources by a mask:
 
Getting arbitrary, [https://en.wikipedia.org/wiki/Disjoint disjoint] bits from two sources by a mask:
 
<pre>
 
<pre>
Line 568: Line 568:
 
== (a & ~mask) + (b & mask) because both sets of the union are disjoint
 
== (a & ~mask) + (b & mask) because both sets of the union are disjoint
 
</pre>
 
</pre>
'''XOR-applications and affairs'''  
+
'''XOR-applications and affairs'''<br/>
 
* Calculation of hash-keys based on [[Zobrist Hashing|Zobrist-keys]].
 
* Calculation of hash-keys based on [[Zobrist Hashing|Zobrist-keys]].
 
* [https://en.wikipedia.org/wiki/Cyclic_redundancy_check Cyclic redundancy check], [[Parallel Prefix Algorithms#ParityWords|Parity words]] or [[Parallel Prefix Algorithms#GrayCode|Gray Code]]
 
* [https://en.wikipedia.org/wiki/Cyclic_redundancy_check Cyclic redundancy check], [[Parallel Prefix Algorithms#ParityWords|Parity words]] or [[Parallel Prefix Algorithms#GrayCode|Gray Code]]
 
* [https://en.wikipedia.org/wiki/Fredkin_gate Fredkin gate] by [[Edward Fredkin]]
 
* [https://en.wikipedia.org/wiki/Fredkin_gate Fredkin gate] by [[Edward Fredkin]]
 
* [[Hyperbola Quintessence]].
 
* [[Hyperbola Quintessence]].
* [[Subtracting a rook from a blocking piece|o^(o-2r)]].
+
* [[Subtracting a Rook from a Blocking Piece|o^(o-2r)]].
 
* [[Robert Hyatt|Robert Hyatt's]] approach of a [[Shared Hash Table#Lockless|lockless transposition table]]
 
* [[Robert Hyatt|Robert Hyatt's]] approach of a [[Shared Hash Table#Lockless|lockless transposition table]]
 
* [[General Setwise Operations#SwappingBits|Swapping Bits]].
 
* [[General Setwise Operations#SwappingBits|Swapping Bits]].
Line 579: Line 579:
  
 
==Equivalence==  
 
==Equivalence==  
[[FILE:Venn1001.svg|none|border|text-bottom|240px|link=https://en.wikipedia.org/wiki/Venn_diagram]]
+
[[FILE:Venn1001.svg|border|right|thumb|240px|Equivalence]]
 
+
[https://en.wikipedia.org/wiki/If_and_only_if If and only if] (Iff) is denoted as:
[https://en.wikipedia.org/wiki/If_and_only_if If and only if] is denoted as:
+
  <span style="font-size: 140%;">A &#8660; B</span>
  A &#8660; B
 
 
[https://en.wikipedia.org/wiki/Logical_equivalence Logical equivalence] is denoted as:
 
[https://en.wikipedia.org/wiki/Logical_equivalence Logical equivalence] is denoted as:
  a &#8596; b
+
  <span style="font-size: 140%;">a &#8596; b</span>
  
 
[https://en.wikipedia.org/wiki/Logical_equality Logical equality], [https://en.wikipedia.org/wiki/Logical_equivalence logical equivalence] or [https://en.wikipedia.org/wiki/Logical_biconditional biconditional] ([https://en.wikipedia.org/wiki/If_and_only_if if and only if], [https://en.wikipedia.org/wiki/XNOR_gate XNOR] ) is the complement of xor.
 
[https://en.wikipedia.org/wiki/Logical_equality Logical equality], [https://en.wikipedia.org/wiki/Logical_equivalence logical equivalence] or [https://en.wikipedia.org/wiki/Logical_biconditional biconditional] ([https://en.wikipedia.org/wiki/If_and_only_if if and only if], [https://en.wikipedia.org/wiki/XNOR_gate XNOR] ) is the complement of xor.
Line 590: Line 589:
 
  a_equal_b ==  (a & b) | (~a & ~b)
 
  a_equal_b ==  (a & b) | (~a & ~b)
 
  a_equal_b ==  (a & b) | ~(a | b)
 
  a_equal_b ==  (a & b) | ~(a | b)
'''Truth Table'''  
+
'''Truth Table''' <br/>
 
Truth table of equivalence or for one bit:
 
Truth table of equivalence or for one bit:
 
{| class="wikitable"
 
{| class="wikitable"
Line 616: Line 615:
 
Equivalence implements a bitwise (a == b) relation.
 
Equivalence implements a bitwise (a == b) relation.
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]
+
[[AVX-512]] has [[AVX-512#VPTERNLOG|VPTERNLOG]]<br/>
 
<span id="Majority"></span>
 
<span id="Majority"></span>
 
==Majority==  
 
==Majority==  
 
The [https://en.wikipedia.org/wiki/Majority_function majority function] or '''median operator''' is a function from n inputs to one output. The value of the operation is false when n/2 or fewer arguments are false, and true otherwise. For two inputs it is the [[General Setwise Operations#Intersection|intersection]]. Three inputs require some more computation:
 
The [https://en.wikipedia.org/wiki/Majority_function majority function] or '''median operator''' is a function from n inputs to one output. The value of the operation is false when n/2 or fewer arguments are false, and true otherwise. For two inputs it is the [[General Setwise Operations#Intersection|intersection]]. Three inputs require some more computation:
  
'''Truth Table'''  
+
'''Truth Table'''<br/> 
 
Truth table of majority for three inputs:
 
Truth table of majority for three inputs:
 
{| class="wikitable"
 
{| class="wikitable"
Line 674: Line 673:
 
See the application of [[Population Count#CardinalityofMultipleSets|cardinality of multiple sets]] for more than three inputs.
 
See the application of [[Population Count#CardinalityofMultipleSets|cardinality of multiple sets]] for more than three inputs.
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
[[AVX-512]] [[AVX-512#VPTERNLOG|VPTERNLOG]] imm8 = 0xe8 implements the majority function.
 
[[AVX-512]] [[AVX-512#VPTERNLOG|VPTERNLOG]] imm8 = 0xe8 implements the majority function.
 
<span id="GreaterOne"></span>
 
<span id="GreaterOne"></span>
 
==Greater One Sets==  
 
==Greater One Sets==  
 
'''Greater One''' is a function from n inputs to one output. The value of the operation is true if more than one argument is true, false otherwise. Obviously, for two inputs it is the [[General Setwise Operations#Intersection|intersection]], for three inputs it is the [[General Setwise Operations#Majority|majority function]]. For more inputs it is the union of all distinct pairwise intersections, which can be expressed with setwise operators that way:
 
'''Greater One''' is a function from n inputs to one output. The value of the operation is true if more than one argument is true, false otherwise. Obviously, for two inputs it is the [[General Setwise Operations#Intersection|intersection]], for three inputs it is the [[General Setwise Operations#Majority|majority function]]. For more inputs it is the union of all distinct pairwise intersections, which can be expressed with setwise operators that way:
  <span style="font-size: 200%;">&#8746;</span><span style="vertical-align: sub;">i>j&#8712;I</span><span style="font-size: 120%;">(A<span style="vertical-align: sub;">i</span>&#8745;A<span style="vertical-align: sub;">j</span>)</span>
+
  <span style="font-size: 200%;">&#8746;</span><span style="vertical-align: sub;">i>j&#8712;I</span><span style="font-size: 120%;">(A<span style="vertical-align: sub;">i</span> &#8745; A<span style="vertical-align: sub;">j</span>)</span>
 
With four bitboards this is equivalent to:
 
With four bitboards this is equivalent to:
 
<pre>
 
<pre>
Line 696: Line 695:
 
operations - that is 11 for n == 4.
 
operations - that is 11 for n == 4.
  
'''O(n^2) to O(n)'''  
+
'''O(n^2) to O(n)''' <br/>
 
Due to [[General Setwise Operations#DistributiveAndOr|distibutive law]] one can factor out common sets ...
 
Due to [[General Setwise Operations#DistributiveAndOr|distibutive law]] one can factor out common sets ...
 
<pre>
 
<pre>
Line 705: Line 704:
 
... with further reductions of the number of operations, also due to aggregation of the inner or-terms. Three additional operations for an increment of n, thus the former quadratic increase becomes linear.
 
... with further reductions of the number of operations, also due to aggregation of the inner or-terms. Three additional operations for an increment of n, thus the former quadratic increase becomes linear.
  
In general, as mentioned,
+
In general, as mentioned,<br/>
  <span style="font-size: 200%;">&#8746;</span><span style="vertical-align: sub;">i>j&#8712;I</span><span style="font-size: 120%;">(A<span style="vertical-align: sub;">i</span>&#8745;A<span style="vertical-align: sub;">j</span>)</span>
+
  <span style="font-size: 200%;">&#8746;</span><span style="vertical-align: sub;">i>j&#8712;I</span><span style="font-size: 120%;">(A<span style="vertical-align: sub;">i</span> &#8745; A<span style="vertical-align: sub;">j</span>)</span>
 
requires
 
requires
 
  n * (n - 1) - 1
 
  n * (n - 1) - 1
Line 750: Line 749:
 
The reason the bitboard type-definintion is unsigned in [[C]], [[Cpp|C++]] is to avoid so called [https://en.wikipedia.org/wiki/Arithmetic_shift arithmetical shift right] in opposition to [https://en.wikipedia.org/wiki/Logical_shift logical shift right] . Arithmetical shift right implies filling one-bits in from MSB-direction if the operand is negative and has MSB bit 63 set. Logical shift right always shifts in zeros - that is what we need. [[Java]] has no unsigned types, but a special unsigned shift right operator >>>.
 
The reason the bitboard type-definintion is unsigned in [[C]], [[Cpp|C++]] is to avoid so called [https://en.wikipedia.org/wiki/Arithmetic_shift arithmetical shift right] in opposition to [https://en.wikipedia.org/wiki/Logical_shift logical shift right] . Arithmetical shift right implies filling one-bits in from MSB-direction if the operand is negative and has MSB bit 63 set. Logical shift right always shifts in zeros - that is what we need. [[Java]] has no unsigned types, but a special unsigned shift right operator >>>.
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
[[x86]] has general purpose instruction as well as [[SIMD and SWAR Techniques|SIMD-instructions]] for various shifts:
+
[[x86]] has general purpose instructions, [[BMI2]] general purpose instructions not affecting processor flags, as well as [[SIMD and SWAR Techniques|SIMD-instructions]] for various shifts:
 
<pre>
 
<pre>
 
shr      rax,  cl        ;      rax >>= cl
 
shr      rax,  cl        ;      rax >>= cl
 
shl      rax,  cl        ;      rax <<= cl
 
shl      rax,  cl        ;      rax <<= cl
 +
shrx    r64a, r/m64, r64b; BMI2  r64a = r/m64 >> r64b
 +
shlx    r64a, r/m64, r64b; BMI2  r64a = r/m64 << r64b
 
psrlq    mm0,  mm1        ; MMX  mm0 >>= mm1
 
psrlq    mm0,  mm1        ; MMX  mm0 >>= mm1
 
psllq    mm0,  mm1        ; MMX  mm0 <<= mm1
 
psllq    mm0,  mm1        ; MMX  mm0 <<= mm1
 
psrlq    xmm0, xmm1      ; SSE2 xmm0 >>= xmm1
 
psrlq    xmm0, xmm1      ; SSE2 xmm0 >>= xmm1
 
psllq    xmm0, xmm1      ; SSE2 xmm0 <<= xmm1
 
psllq    xmm0, xmm1      ; SSE2 xmm0 <<= xmm1
vpshlq  xmm0, xmm1, xmm2 ; XOP  xmm0  = xmm1 >>/<< xmm2 ; Individual, generalized shifts
 
vpshlb  xmm0, xmm1, xmm2 ; XOP  xmm0  = xmm1 >>/<< xmm2 ; Individual, generalized shifts of 16 bytes
 
 
vpsrlvq  ymm0, ymm1, ymm2 ; AVX2 ymm0  = ymm1 >> ymm2 ; Individual shifts
 
vpsrlvq  ymm0, ymm1, ymm2 ; AVX2 ymm0  = ymm1 >> ymm2 ; Individual shifts
 
vpsllvq  ymm0, ymm1, ymm2 ; AVX2 ymm0  = ymm1 << ymm2 ; Individual shifts
 
vpsllvq  ymm0, ymm1, ymm2 ; AVX2 ymm0  = ymm1 << ymm2 ; Individual shifts
 
</pre>
 
</pre>
 
[[SSE2]]-intrinsics with variable register or constant immediate shift amounts, working on vectors of two bitboards:
 
[[SSE2]]-intrinsics with variable register or constant immediate shift amounts, working on vectors of two bitboards:
* [http://msdn2.microsoft.com/en-us/library/yf6cf9k8%28VS.80%29.aspx _mm_srl_epi64]
+
* [[SSE2#_mm_srl_epi64|_mm_srl_epi64]]
* [http://msdn2.microsoft.com/en-us/library/btdyeyt1%28VS.80%29.aspx _mm_srli_epi64]
+
* [[SSE2#_mm_srli_epi64|_mm_srli_epi64]]
* [http://msdn2.microsoft.com/en-us/library/6ta9dffd%28VS.80%29.aspx _mm_sll_epi64]
+
* [[SSE2#_mm_sll_epi64|_mm_sll_epi64]]
* [http://msdn2.microsoft.com/en-us/library/da6131h7%28VS.80%29.aspx _mm_slli_epi64]
+
* [[SSE2#_mm_slli_epi64|_mm_slli_epi64]]
 
 
[[XOP]] has [[XOP#Shifts|individual, generalized shifts]] for each of two bitboards and also has byte-wise shifts
 
* [http://msdn.microsoft.com/en-us/library/gg466456 _mm_shl_epi64]
 
* [http://msdn.microsoft.com/en-us/library/gg466458 _mm_shl_epi8]
 
  
 
[[AVX2]] has [[AVX2#IndividualShifts|individual shifts]] for each of four bitboards:
 
[[AVX2]] has [[AVX2#IndividualShifts|individual shifts]] for each of four bitboards:
* [https://software.intel.com/en-us/node/695097 _mm256_sllv_epi64]
+
* [[AVX2#_mm256_sllv_epi64|_mm256_sllv_epi64]]
* [https://software.intel.com/en-us/node/695103 _mm256_srlv_epi64]
+
* [[AVX2#_mm256_srlv_epi64|_mm256_srlv_epi64]]
 
<span id="OneStepOnly"></span>
 
<span id="OneStepOnly"></span>
 
==One Step Only==  
 
==One Step Only==  
Line 831: Line 826:
 
</pre>
 
</pre>
  
Most processors have rotate instructions, but are not supported by standard programming languages like [[C]] or [[Java]]. Some compilers provide [http://msdn2.microsoft.com/en-us/library/5cc576c4.aspx intrinsic], processor specific functions.
+
Most processors have rotate instructions, but are not supported by standard programming languages like [[C]] or [[Java]]. Some compilers provide intrinsic, processor specific functions.
 
<pre>
 
<pre>
 
U64 rotateLeft (U64 x, int s) {return _rotl64(x, s);}
 
U64 rotateLeft (U64 x, int s) {return _rotl64(x, s);}
Line 841: Line 836:
 
ror  rax, cl
 
ror  rax, cl
 
</pre>
 
</pre>
'''Rotate by Shift'''  
+
'''Rotate by Shift'''<br/>
 
Otherwise rotate has to be emulated by shifts, with some chance optimizing compiler will emit exactly one rotate instruction.
 
Otherwise rotate has to be emulated by shifts, with some chance optimizing compiler will emit exactly one rotate instruction.
 
<pre>
 
<pre>
Line 889: Line 884:
 
</pre>
 
</pre>
 
<span id="GenOneStep"></span>
 
<span id="GenOneStep"></span>
'''One Step'''  
+
'''One Step'''<br/>
 
[[x86-64]] rot64 works like a generalized shift with positive or negative shift amount - since it internally applies an unsigned modulo 64 ( & 63) and makes -i = 64-i. We need to clear either the lower or upper bits by intersection with a mask, which might be combined with the wrap-ands for [[General Setwise Operations#OneStepOnly|one step]]. It might be applied to get attacks for both sides with a [[Direction|direction]] parameter and small lookups for shift amount and wrap-ands - instead of multiple code for eight directions. Of course generalized shift will be a bit slower due to lookups and using cl as the shift amount register.
 
[[x86-64]] rot64 works like a generalized shift with positive or negative shift amount - since it internally applies an unsigned modulo 64 ( & 63) and makes -i = 64-i. We need to clear either the lower or upper bits by intersection with a mask, which might be combined with the wrap-ands for [[General Setwise Operations#OneStepOnly|one step]]. It might be applied to get attacks for both sides with a [[Direction|direction]] parameter and small lookups for shift amount and wrap-ands - instead of multiple code for eight directions. Of course generalized shift will be a bit slower due to lookups and using cl as the shift amount register.
 
<pre>
 
<pre>
Line 957: Line 952:
 
''The inverse function square = log2(x), is topic of [[BitScan|bitscan]] and [[Bitboard Serialization|bitboard serialization]].''
 
''The inverse function square = log2(x), is topic of [[BitScan|bitscan]] and [[Bitboard Serialization|bitboard serialization]].''
  
'''Shift versus Lookup'''  
+
'''Shift versus Lookup''' <br/>
 
While 1 << square sounds cheap, it is rather expensive in 32-bit mode - and therefor often precalculated in a small lookup-table of 64-single bit bitboards. Also, on [[x86-64]]-processors a variable shift is restricted to the byte-register cl. Thus, two or more variable shifts are constrained by sequential execution <ref>[http://www.open-chess.org/viewtopic.php?f=5&t=2878 To shift or not to shift] by thevinenator, [[Computer Chess Forums|OpenChess Forum]], September 09, 2015</ref>.
 
While 1 << square sounds cheap, it is rather expensive in 32-bit mode - and therefor often precalculated in a small lookup-table of 64-single bit bitboards. Also, on [[x86-64]]-processors a variable shift is restricted to the byte-register cl. Thus, two or more variable shifts are constrained by sequential execution <ref>[http://www.open-chess.org/viewtopic.php?f=5&t=2878 To shift or not to shift] by thevinenator, [[Computer Chess Forums|OpenChess Forum]], September 09, 2015</ref>.
  
'''Test'''  
+
'''Test''' <br/>
 
Test a bit of a square-index by [[General Setwise Operations#Intersection|intersection]]-operator 'and'.
 
Test a bit of a square-index by [[General Setwise Operations#Intersection|intersection]]-operator 'and'.
 
<pre>
 
<pre>
 
if (x & singleBitset) -> bit is set;
 
if (x & singleBitset) -> bit is set;
 
</pre>
 
</pre>
'''Set'''  
+
'''Set''' <br/>
 
Set a bit of a square-index by [[General Setwise Operations#Union|union]]-operator 'or'.
 
Set a bit of a square-index by [[General Setwise Operations#Union|union]]-operator 'or'.
 
<pre>
 
<pre>
 
x |=  singleBitset; // set bit
 
x |=  singleBitset; // set bit
 
</pre>
 
</pre>
'''Toggle'''  
+
'''Toggle''' <br/>
 
Toggle a bit of square-index by [[General Setwise Operations#ExclusiveOr|xor]].
 
Toggle a bit of square-index by [[General Setwise Operations#ExclusiveOr|xor]].
 
<pre>
 
<pre>
 
x ^=  singleBitset; // toggle bit
 
x ^=  singleBitset; // toggle bit
 
</pre>
 
</pre>
'''Reset'''  
+
'''Reset''' <br/>
Reset a bit of square-index by [[General Setwise Operations#RelativeComplement|relative complement]] of the single bit.
+
<span id="ResetBit"></span>Reset a bit of square-index by [[General Setwise Operations#RelativeComplement|relative complement]] of the single bit,
 
<pre>
 
<pre>
 
x &= ~singleBitset; // reset bit
 
x &= ~singleBitset; // reset bit
 +
</pre>
 +
or conditional toggle by single bit intersection
 +
<pre>
 +
x ^=  singleBitset & x; // reset bit
 
</pre>
 
</pre>
 
Set and toggle (or, xor) might the faster way to reset a bit inside a register (not, and).
 
Set and toggle (or, xor) might the faster way to reset a bit inside a register (not, and).
Line 987: Line 986:
 
If singleBitset needs to preserved, an extra register is needed for the complement.
 
If singleBitset needs to preserved, an extra register is needed for the complement.
  
'''x86-Instructions'''  
+
'''x86-Instructions''' <br/>
[[x86]] processor provides a bit-test instruction family (bt, bts, btr, btc) with 32- and 64-bit operands. They may be used implicitly by compiler optimization or explicitly by inline assembler or compiler intrinsics. Take care that they are applied on local variables likely registers rather than memory references:
+
[[x86]] processor provides a bit-test instruction family (bt, bts, btr, btc) with 32- and 64-bit operands. They may be used implicitly by compiler optimization or explicitly by inline assembler or compiler intrinsics. Take care that they are applied on local variables likely registers rather than memory references <ref>[https://groups.google.com/forum/#!topic/fishcooking/TplkvO62I9U On the speed of SquareBB array] by protonspring, [[Computer Chess Forums|FishCooking]], March 22, 2019</ref>:
* [http://msdn2.microsoft.com/en-us/library/h65k4tze%28VS.80%29.aspx _bittest64]
+
* [[x86-64#gpinstructions|_bittest64]]
* [http://msdn2.microsoft.com/en-us/library/z56sc6y4%28VS.80%29.aspx _bittestandset64]
+
* [[x86-64#gpinstructions|_bittestandset64]]
* [http://msdn2.microsoft.com/en-us/library/zbdxdb11%28VS.80%29.aspx _bittestandcomplement64]
+
* [[x86-64#gpinstructions|_bittestandcomplement64]]
* [http://msdn2.microsoft.com/en-us/library/hd0hzyf8%28VS.80%29.aspx _bittestandreset64]
+
* [[x86-64#gpinstructions|_bittestandreset64]]
 
<span id="UpdateByMove"></span>
 
<span id="UpdateByMove"></span>
 
==Update by Move==  
 
==Update by Move==  
Line 1,023: Line 1,022:
 
Similar for special moves like [[Castling|castling]], [[Promotions|promotions]] and [[En passant|en passant captures]].
 
Similar for special moves like [[Castling|castling]], [[Promotions|promotions]] and [[En passant|en passant captures]].
 
<span id="UpperAndLowerBits"></span>
 
<span id="UpperAndLowerBits"></span>
'''Upper Squares'''  
+
'''Upper Squares''' <br/>
 
To get a set of all upper squares or bits, either shift ~1 or -2 left by square:
 
To get a set of all upper squares or bits, either shift ~1 or -2 left by square:
 
<pre>
 
<pre>
Line 1,041: Line 1,040:
 
</pre>
 
</pre>
  
'''Lower Squares'''  
+
'''Lower Squares''' <br/>
 
Lower squares are simply Bit by Square minus one.
 
Lower squares are simply Bit by Square minus one.
 
<pre>
 
<pre>
Line 1,062: Line 1,061:
 
[http://graphics.stanford.edu/%7Eseander/bithacks.html#SwappingBitsXOR Swapping] none overlapping bit-sequences in a bitboard is the base of a lot of [https://en.wikipedia.org/wiki/Permutation permutation] tricks.
 
[http://graphics.stanford.edu/%7Eseander/bithacks.html#SwappingBitsXOR Swapping] none overlapping bit-sequences in a bitboard is the base of a lot of [https://en.wikipedia.org/wiki/Permutation permutation] tricks.
  
'''by Position'''  
+
'''by Position'''<br/>
 
Suppose we like to swap n [[Bit|bits]] from two none overlapping bit locations of a bitboard. The trick is to set all n least significant bits by subtracting one from n power of 2. Both substrings are shifted to bit zero, exclusive ored and masked by the n ones. This sequence is then twice shifted back to their original places, while the [[General Setwise Operations#Union|union]] (xor-union due to [https://en.wikipedia.org/wiki/Disjoint disjoint] bits) is finally exclusive ored with the original bitboard to swap both sequences.
 
Suppose we like to swap n [[Bit|bits]] from two none overlapping bit locations of a bitboard. The trick is to set all n least significant bits by subtracting one from n power of 2. Both substrings are shifted to bit zero, exclusive ored and masked by the n ones. This sequence is then twice shifted back to their original places, while the [[General Setwise Operations#Union|union]] (xor-union due to [https://en.wikipedia.org/wiki/Disjoint disjoint] bits) is finally exclusive ored with the original bitboard to swap both sequences.
 
<pre>
 
<pre>
Line 1,111: Line 1,110:
 
</pre>
 
</pre>
 
<span id="DeltaSwap"></span>
 
<span id="DeltaSwap"></span>
'''Delta Swap'''  
+
'''Delta Swap''' <br/>
 
To swap any none overlapping pairs we can shift by the difference (j-i, with j>i) and supply an explicit mask with a '1' on the least significant position for each pair supposed to be swapped.
 
To swap any none overlapping pairs we can shift by the difference (j-i, with j>i) and supply an explicit mask with a '1' on the least significant position for each pair supposed to be swapped.
 
<pre>
 
<pre>
Line 1,143: Line 1,142:
 
<span id="ArithmeticalOperations"></span>
 
<span id="ArithmeticalOperations"></span>
 
=Arithmetic Operations=  
 
=Arithmetic Operations=  
At the first glance, [https://en.wikipedia.org/wiki/Arithmetic#Arithmetic_operations arithmetic operations], that is [https://en.wikipedia.org/wiki/Addition addition], [https://en.wikipedia.org/wiki/Subtraction subtraction], [https://en.wikipedia.org/wiki/Multiplication multiplication] and [https://en.wikipedia.org/wiki/Division_%28mathematics%29 division], doesn't make much sense with bitboards. Still, there are some [[Bit-Twiddling|bit-twiddling]] applications related to least significant one bit (LS1B), to [[Traversing Subsets of a Set|enumerate all subsets of a set]] or [[Subtracting a rook from a blocking piece|sliding attack generation]]. Multiplication of certain pattern has some applications as well, most likely to calculate hash-indicies of [[Occupancy of any Line#Multiplication|masked occupancies]].
+
At the first glance, [https://en.wikipedia.org/wiki/Arithmetic#Arithmetic_operations arithmetic operations], that is [https://en.wikipedia.org/wiki/Addition addition], [https://en.wikipedia.org/wiki/Subtraction subtraction], [https://en.wikipedia.org/wiki/Multiplication multiplication] and [https://en.wikipedia.org/wiki/Division_%28mathematics%29 division], doesn't make much sense with bitboards. Still, there are some [[Bit-Twiddling|bit-twiddling]] applications related to least significant one bit (LS1B), to [[Traversing Subsets of a Set|enumerate all subsets of a set]] or [[Subtracting a Rook from a Blocking Piece|sliding attack generation]]. Multiplication of certain pattern has some applications as well, most likely to calculate hash-indicies of [[Occupancy of any Line#Multiplication|masked occupancies]].
  
 
==Derived from Bitwise==
 
==Derived from Bitwise==
 +
[[FILE:Half Adder.svg|border|right|thumb|link=https://en.wikipedia.org/wiki/Half_adder#Half_adder|Half Adder]]
 
Unlike bitwise boolean operations on 64-bit words, which are in fact 64 parallel operations on each bit without any interaction between them, arithmetic operations like addition need to propagate possible [https://en.wikipedia.org/wiki/Carry_%28arithmetic%29 carries] from lower to higher bits. Despite, Add and Sub are usually as fast their bitwise boolean counterparts, because they are implemented in Hardware within the [[Combinatorial Logic#ALU|ALU]] of the CPU. A so called [[Combinatorial Logic#Adder|half-adder]] to add two bits (A, B), requires an [[Combinatorial Logic#AND|And-Gate]] for the carry (C) and a [[Combinatorial Logic#XOR|Xor-Gate]] for the sum (S):
 
Unlike bitwise boolean operations on 64-bit words, which are in fact 64 parallel operations on each bit without any interaction between them, arithmetic operations like addition need to propagate possible [https://en.wikipedia.org/wiki/Carry_%28arithmetic%29 carries] from lower to higher bits. Despite, Add and Sub are usually as fast their bitwise boolean counterparts, because they are implemented in Hardware within the [[Combinatorial Logic#ALU|ALU]] of the CPU. A so called [[Combinatorial Logic#Adder|half-adder]] to add two bits (A, B), requires an [[Combinatorial Logic#AND|And-Gate]] for the carry (C) and a [[Combinatorial Logic#XOR|Xor-Gate]] for the sum (S):
 
[[FILE:Half Adder.svg|none|border|text-bottom|link=https://en.wikipedia.org/wiki/Half_adder#Half_adder]]
 
 
<pre>
 
<pre>
 
two_bitsum = (bitA ^ bitB) | ((bitA & bitB) << 1);
 
two_bitsum = (bitA ^ bitB) | ((bitA & bitB) << 1);
Line 1,192: Line 1,190:
 
<span id="Subtraction"></span>
 
<span id="Subtraction"></span>
 
==Subtraction==  
 
==Subtraction==  
[https://en.wikipedia.org/wiki/Subtraction Subtraction] (like xor) might be used to implement the [[General Setwise Operations#RelativeComplement|relative complement]], of a [https://en.wikipedia.org/wiki/Subset subset] inside it's superset. As mentioned, subtraction may be useful in calculating [[Subtracting a rook from a blocking piece|sliding attacks]].
+
[https://en.wikipedia.org/wiki/Subtraction Subtraction] (like xor) might be used to implement the [[General Setwise Operations#RelativeComplement|relative complement]], of a [https://en.wikipedia.org/wiki/Subset subset] inside it's superset. As mentioned, subtraction may be useful in calculating [[Subtracting a Rook from a Blocking Piece|sliding attacks]].
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
<pre>
 
<pre>
 
sub  rax, rbx ; rax -= rbx
 
sub  rax, rbx ; rax -= rbx
Line 1,202: Line 1,200:
 
A lot of [[Bit-Twiddling|bit-twiddling]] tricks on bitboards to traverse or isolate subsets, rely on [https://en.wikipedia.org/wiki/Two%27s_complement two's complement] arithmetic. Most recent processors (and compiler or interpreter for these processors) use the two's complement to implement the unary minus operator for signed as well for unsigned integer types. In [[C]] it is guaranteed for unsigned integer types. [[Java]] guarantees two's complement for all implicit signed integral types char, short, int, long.
 
A lot of [[Bit-Twiddling|bit-twiddling]] tricks on bitboards to traverse or isolate subsets, rely on [https://en.wikipedia.org/wiki/Two%27s_complement two's complement] arithmetic. Most recent processors (and compiler or interpreter for these processors) use the two's complement to implement the unary minus operator for signed as well for unsigned integer types. In [[C]] it is guaranteed for unsigned integer types. [[Java]] guarantees two's complement for all implicit signed integral types char, short, int, long.
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
<pre>
 
<pre>
 
neg  rax;  rax = -rax; rax *= -1
 
neg  rax;  rax = -rax; rax *= -1
 
</pre>
 
</pre>
  
''2^N is used as power operator in this paragraph not xor !''
+
''2^N is used as power operator in this paragraph not xor'' !
  
'''Increment of Complement'''  
+
'''Increment of Complement'''<br/>
The two's complement is defined as a value, we need to add to the original value to get 2^64 which is an "overflowed" zero - since all 64-bit values are implicitly modulo 2^64. Thus, the two's complement is defined as '''ones' complement plus one''':
+
The two's complement is defined as a value, we need to add to the original value to get 2<span style="vertical-align: super;">64</span> which is an "overflowed" zero - since all 64-bit values are implicitly modulo 2<span style="vertical-align: super;">64</span>. Thus, the two's complement is defined as ''ones' complement plus one'':
<pre>
+
-x == ~x + 1
-x == ~x + 1
+
That fulfills the condition that x + (-x) == 2<span style="vertical-align: super;">bitsize</span> (2<span style="vertical-align: super;">64</span>) which overflows to zero:
</pre>
 
That fulfills the condition that x + (-x) == 2 ^ bitsize (2 ^ 64) which overflows to zero:
 
 
<pre>
 
<pre>
 
x + (-x)    == 0
 
x + (-x)    == 0
Line 1,220: Line 1,216:
 
==>  x + ~x == -1 the universal set
 
==>  x + ~x == -1 the universal set
 
</pre>
 
</pre>
'''Complement of Decrement'''  
+
'''Complement of Decrement'''<br/>
 
Replacing x by x - 1 in the increment of complement formula, leaves another definition - two's complement or Negation is also the ones' complement of the ones' decrement:
 
Replacing x by x - 1 in the increment of complement formula, leaves another definition - two's complement or Negation is also the ones' complement of the ones' decrement:
<pre>
+
-x == ~(x - 1)
-x == ~(x - 1)
 
</pre>
 
 
Thus, we can reduce subtraction by addition and ones' complement:
 
Thus, we can reduce subtraction by addition and ones' complement:
 
<pre>
 
<pre>
Line 1,230: Line 1,224:
 
   x - y  == ~(~x + y)
 
   x - y  == ~(~x + y)
 
</pre>
 
</pre>
 
+
'''Bitwise Copy/Invert'''<br/>
'''Bitwise Copy/Invert'''  
 
 
The two's complement may also defined by a bitwise copy-loop from right (LSB) to left (MSB):
 
The two's complement may also defined by a bitwise copy-loop from right (LSB) to left (MSB):
 
<pre>
 
<pre>
Line 1,239: Line 1,232:
 
</pre>
 
</pre>
  
'''Signed-Unsigned'''  
+
'''Signed-Unsigned''' <br/>
 
This works independently whether we interpret 'x' as signed or unsigned. While 0 is is the synonym for all bits clear, -1 is the synonym for all bits set in a computer word of any arbitrary bit-size, also for 64-bit words such as bitboards.
 
This works independently whether we interpret 'x' as signed or unsigned. While 0 is is the synonym for all bits clear, -1 is the synonym for all bits set in a computer word of any arbitrary bit-size, also for 64-bit words such as bitboards.
  
Line 1,302: Line 1,295:
 
Some C++ compiler warn -x still unsigned - (0-x) may used to avoid that with no overhead.
 
Some C++ compiler warn -x still unsigned - (0-x) may used to avoid that with no overhead.
  
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
[[x86-64]] expansion [[BMI1]] has LS1B bit isolation:
 
[[x86-64]] expansion [[BMI1]] has LS1B bit isolation:
 
<pre>
 
<pre>
 
blsi  rax, rbx ; BMI1  rax = rbx & -rbx  
 
blsi  rax, rbx ; BMI1  rax = rbx & -rbx  
 
</pre>
 
</pre>
[[BMI1]]-intrinsic [http://www.felixcloutier.com/x86/BLSI.html _blsi_u32/64].
+
[[BMI1]]-intrinsic [[BMI1#_blsi_u3264|_blsi_u32/64]]
  
 
[[AMD|AMD's]] [[x86-64]] expansion [[TBM]] further has a [[TBM#BLSIC|Isolate Lowest Set Bit and Complement]] instruction, which applies [[General Setwise Operations#DeMorganslaws|De Morgan's law]] to get the complement of the LS1B:
 
[[AMD|AMD's]] [[x86-64]] expansion [[TBM]] further has a [[TBM#BLSIC|Isolate Lowest Set Bit and Complement]] instruction, which applies [[General Setwise Operations#DeMorganslaws|De Morgan's law]] to get the complement of the LS1B:
Line 1,333: Line 1,326:
 
... since we already know two's complement (-x) and ones' decrement (x-1) are complement sets.
 
... since we already know two's complement (-x) and ones' decrement (x-1) are complement sets.
  
'''x86-mnemonics'''  
+
'''x86-mnemonics'''<br/>
 
[[x86-64]] expansion [[BMI1]] has LS1B bit reset:
 
[[x86-64]] expansion [[BMI1]] has LS1B bit reset:
 
<pre>
 
<pre>
 
blsr  rax, rbx ; BMI1  rax = rbx & (rbx - 1)
 
blsr  rax, rbx ; BMI1  rax = rbx & (rbx - 1)
 
</pre>
 
</pre>
[[BMI1]]-intrinsic [http://www.felixcloutier.com/x86/BLSR.html _blsr_u32/64].
+
[[BMI1]]-intrinsic [[BMI1#_blsr_u3264|_blsr_u32/64]]
 
<span id="LS1BSeparation"></span>
 
<span id="LS1BSeparation"></span>
 
===Separation===  
 
===Separation===  
Line 1,379: Line 1,372:
 
1 1 1 1 1 1 1 1    1 1 1 1 1 1 1 1    1 1 1 1 1 1 1 1
 
1 1 1 1 1 1 1 1    1 1 1 1 1 1 1 1    1 1 1 1 1 1 1 1
 
</pre>
 
</pre>
'''x86-mnemonics'''  
+
'''x86-mnemonics'''<br/>
 
[[x86-64]] expansion [[BMI1]] has [[BMI1#BLSMSK|BLSMSK]] (Mask Up to Lowest Set Bit = below_LSB1_mask_including), [[AMD|AMD's]] [[x86-64]] expansion [[TBM]] has [[TBM#TZMSK|TZMSK]] (Mask From Trailing Zeros = below_LSB1_mask):
 
[[x86-64]] expansion [[BMI1]] has [[BMI1#BLSMSK|BLSMSK]] (Mask Up to Lowest Set Bit = below_LSB1_mask_including), [[AMD|AMD's]] [[x86-64]] expansion [[TBM]] has [[TBM#TZMSK|TZMSK]] (Mask From Trailing Zeros = below_LSB1_mask):
 
<pre>
 
<pre>
Line 1,385: Line 1,378:
 
tzmsk  rax, rbx ; TBM:  rax = ~rbx & (rbx - 1)
 
tzmsk  rax, rbx ; TBM:  rax = ~rbx & (rbx - 1)
 
</pre>
 
</pre>
[[BMI1]]-intrinsic [https://software.intel.com/en-us/node/514041 _blsmsk_u32/64].
+
[[BMI1]]-intrinsic [[BMI1#_blsmsk_u3264|_blsmsk_u32/64]]
  
 
===Smearing===  
 
===Smearing===  
Line 1,415: Line 1,408:
 
. . . . . . . .    1 1 1 1 1 1 1 1    1 1 1 1 1 1 1 1
 
. . . . . . . .    1 1 1 1 1 1 1 1    1 1 1 1 1 1 1 1
 
</pre>
 
</pre>
'''x86-mnemonics'''  
+
'''x86-mnemonics''' <br/>
 
[[AMD|AMD's]] [[x86-64]] expansion [[TBM]] has a [[TBM#BLSFILL|Fill From Lowest Set Bit]] instruction:
 
[[AMD|AMD's]] [[x86-64]] expansion [[TBM]] has a [[TBM#BLSFILL|Fill From Lowest Set Bit]] instruction:
 
<pre>
 
<pre>
Line 1,445: Line 1,438:
 
Still quite expensive - better to traverse sets the other way around or rely on intrinsic functions to use special processor instructions like [[BitScan#Bitscanreverse|BitScanReverse]] or LeadingZeroCount, which implicitly performs not only the isolation but also the [https://en.wikipedia.org/wiki/Binary_logarithm log2].
 
Still quite expensive - better to traverse sets the other way around or rely on intrinsic functions to use special processor instructions like [[BitScan#Bitscanreverse|BitScanReverse]] or LeadingZeroCount, which implicitly performs not only the isolation but also the [https://en.wikipedia.org/wiki/Binary_logarithm log2].
  
'''Common MS1B'''  
+
'''Common MS1B''' <br/>
 
Two sets have a common MS1B, if the [[General Setwise Operations#Intersection|intersection]] is greater than the xor sum:
 
Two sets have a common MS1B, if the [[General Setwise Operations#Intersection|intersection]] is greater than the xor sum:
 
<pre>
 
<pre>
Line 1,461: Line 1,454:
 
y  = x * 0x00010100;
 
y  = x * 0x00010100;
 
</pre>
 
</pre>
'''Fill-Multiplication'''  
+
'''Fill-Multiplication'''<br/>
 
In fact, we can replace [[Parallel Prefix Algorithms|parallel prefix]] left shifts like,
 
In fact, we can replace [[Parallel Prefix Algorithms|parallel prefix]] left shifts like,
 
<pre>
 
<pre>
Line 1,487: Line 1,480:
 
See [[Kindergarten Bitboards|Kindergarten-Bitboards]]- or [[Magic Bitboards|Magic-Bitboards]] as applications of fill-multiplication.
 
See [[Kindergarten Bitboards|Kindergarten-Bitboards]]- or [[Magic Bitboards|Magic-Bitboards]] as applications of fill-multiplication.
  
'''De Bruijn Multiplication'''  
+
'''De Bruijn Multiplication''' <br/>
Another bitboard related application of multiplication is to determine the bit-index of the least significant one bit. A isolated, single bit is multiplied with a [[De Bruijn sequence]] to implement a [[Bitscan#DeBruijnMultiplation|bitscan]].
+
Another bitboard related application of multiplication is to determine the bit-index of the least significant one bit. A isolated, single bit is multiplied with a [[De Bruijn Sequence]] to implement a [[BitScan#DeBruijnMultiplation|bitscan]].
 
<span id="Division"></span>
 
<span id="Division"></span>
 
==Division==  
 
==Division==  
Line 1,536: Line 1,529:
 
===Power of Two===  
 
===Power of Two===  
 
As a remainder, and to close the cycle to [[General Setwise Operations#Bitwisebooleanoperations|bitwise boolean operations]], the well known trick is mentioned, to replace modulo by power of two by [[General Setwise Operations#Intersection|intersection]] with power of two minus one:
 
As a remainder, and to close the cycle to [[General Setwise Operations#Bitwisebooleanoperations|bitwise boolean operations]], the well known trick is mentioned, to replace modulo by power of two by [[General Setwise Operations#Intersection|intersection]] with power of two minus one:
<pre>
+
  a % 2<span style="vertical-align: super;">n</span> == a & (2<span style="vertical-align: super;">n</span> - 1)
  a % 2^n == a & (2^n - 1)
 
</pre>
 
  
 
=Selected Publications=
 
=Selected Publications=
Line 1,547: Line 1,538:
 
* [[Mathematician#CSPeirce|Charles S. Peirce]] ('''1867'''). ''On an Improvement in Boole's Calculus of Logic''. Proceedings of the American Academy of Arts and Sciences, Series Vol. 7
 
* [[Mathematician#CSPeirce|Charles S. Peirce]] ('''1867'''). ''On an Improvement in Boole's Calculus of Logic''. Proceedings of the American Academy of Arts and Sciences, Series Vol. 7
 
* [[Mathematician#Cantor|Georg Cantor]] ('''1874'''). ''[https://glossar.hs-augsburg.de/Cantor,_G._(1874):_Ueber_eine_Eigenschaft_des_Inbegriffs_aller_reellen_algebraischen_Zahlen Ueber eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen]''. [https://en.wikipedia.org/wiki/Crelle%27s_Journal Journal für die reine und angewandte Mathematik], No. 77
 
* [[Mathematician#Cantor|Georg Cantor]] ('''1874'''). ''[https://glossar.hs-augsburg.de/Cantor,_G._(1874):_Ueber_eine_Eigenschaft_des_Inbegriffs_aller_reellen_algebraischen_Zahlen Ueber eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen]''. [https://en.wikipedia.org/wiki/Crelle%27s_Journal Journal für die reine und angewandte Mathematik], No. 77
* [[Mathematician#CSPeirce|Charles S. Peirce]] ('''1880'''). ''[https://archive.org/details/jstor-2369442 On the Algebra of Logic]''. [https://en.wikipedia.org/wiki/American_Journal_of_Mathematics American Journal of Mathematics], Vol. 3
 
 
* [[Mathematician#CSPeirce|Charles S. Peirce]] ('''1880'''). ''[https://archive.org/details/jstor-2369442 On the Algebra of Logic]''. [https://en.wikipedia.org/wiki/American_Journal_of_Mathematics American Journal of Mathematics], Vol. 3
 
* [[Mathematician#CSPeirce|Charles S. Peirce]] ('''1880'''). ''[https://archive.org/details/jstor-2369442 On the Algebra of Logic]''. [https://en.wikipedia.org/wiki/American_Journal_of_Mathematics American Journal of Mathematics], Vol. 3
 
* [[Mathematician#Venn|John Venn]] ('''1880'''). ''[http://www.tandfonline.com/doi/abs/10.1080/14786448008626877#.U3kRnHYfwgI On the Diagrammatic and Mechanical Representation of Propositions and Reasonings]''. [https://en.wikipedia.org/wiki/Philosophical_Magazine Philosophical Magazine], Vol. 9, No. 5
 
* [[Mathematician#Venn|John Venn]] ('''1880'''). ''[http://www.tandfonline.com/doi/abs/10.1080/14786448008626877#.U3kRnHYfwgI On the Diagrammatic and Mechanical Representation of Propositions and Reasonings]''. [https://en.wikipedia.org/wiki/Philosophical_Magazine Philosophical Magazine], Vol. 9, No. 5
 
* [[Mathematician#Venn|John Venn]] ('''1881'''). ''[https://archive.org/details/symboliclogic00venniala Symbolic Logic]''. MacMillan & Co.
 
* [[Mathematician#Venn|John Venn]] ('''1881'''). ''[https://archive.org/details/symboliclogic00venniala Symbolic Logic]''. MacMillan & Co.
 +
==1900 ...==
 +
* [[Claude Shannon]] ('''1938'''). ''[https://en.wikipedia.org/wiki/A_Symbolic_Analysis_of_Relay_and_Switching_Circuits A Symbolic Analysis of Relay and Switching Circuits]''. [https://en.wikipedia.org/wiki/American_Institute_of_Electrical_Engineers Transactions of the AIEE], Vol. 57, No 12, Master's thesis 1940, [[Massachusetts Institute of Technology]]
 +
* [[Mathematician#VIShestakov|Victor I. Shestakov]] ('''1941'''). ''Algebra of Two Poles Schemata''. [https://en.wikipedia.org/wiki/Automation_and_Remote_Control Automatics and Telemechanics], Vol. 5, No 2
 
==1950 ...==
 
==1950 ...==
* [https://en.wikipedia.org/wiki/Lazar_Lyusternik Lazar A. Lyusternik], [http://www.mathnet.ru/php/person.phtml?personid=30351&option_lang=eng Aleksandr A. Abramov], [https://en.wikipedia.org/wiki/Victor_Shestakov Victor I. Shestakov], [[Mikhail R. Shura-Bura]] ('''1952'''). ''Programming for High-Speed Electronic Computers''. (Программирование для электронных счетных машин)
+
* [[Mathematician#LLyusternik|Lazar A. Lyusternik]], [[Mathematician#AAbramov|Aleksandr Abramov]], [[Mathematician#VIShestakov|Victor I. Shestakov]], [[Mikhail R. Shura-Bura]] ('''1952'''). ''Programming for High-Speed Electronic Computers''. (Программирование для электронных счетных машин)
 
* [[Christopher Strachey]] ('''1961'''). ''Bitwise operations''. [[ACM#Communications|Communications of the ACM]], Vol. 4, No. 3
 
* [[Christopher Strachey]] ('''1961'''). ''Bitwise operations''. [[ACM#Communications|Communications of the ACM]], Vol. 4, No. 3
 
==2000 ...==
 
==2000 ...==
 
* [[Henry S. Warren, Jr.]] ('''2002, 2012'''). ''[[Henry S. Warren, Jr.#HackersDelight|Hacker's Delight]]''. [https://en.wikipedia.org/wiki/Addison%E2%80%93Wesley Addison-Wesley]
 
* [[Henry S. Warren, Jr.]] ('''2002, 2012'''). ''[[Henry S. Warren, Jr.#HackersDelight|Hacker's Delight]]''. [https://en.wikipedia.org/wiki/Addison%E2%80%93Wesley Addison-Wesley]
 
* [[Donald Knuth]] ('''2009'''). ''[http://www-cs-faculty.stanford.edu/%7Eknuth/taocp.html The Art of Computer Programming], Volume 4, Fascicle 1: Bitwise tricks & techniques'', as [http://www-cs-faculty.stanford.edu/%7Eknuth/fasc1a.ps.gz Pre-Fascicle 1a postscript]
 
* [[Donald Knuth]] ('''2009'''). ''[http://www-cs-faculty.stanford.edu/%7Eknuth/taocp.html The Art of Computer Programming], Volume 4, Fascicle 1: Bitwise tricks & techniques'', as [http://www-cs-faculty.stanford.edu/%7Eknuth/fasc1a.ps.gz Pre-Fascicle 1a postscript]
 +
* [[Ronald L. Rivest]] ('''2011'''). ''The invertibility of the XOR of rotations of a binary word''. [https://dblp1.uni-trier.de/db/journals/ijcm/ijcm88.html International Journal of Computer Mathematics, Vol. 88], [https://pdfs.semanticscholar.org/8177/7b0bb30aa7ca12f9a1aea57691a6fb1e1249.pdf 2009 pdf preprint]
  
 
=Forum Posts=
 
=Forum Posts=
 +
==2000 ...==
 
* [https://www.stmintz.com/ccc/index.php?id=306882 curiosity killed the cat... hi/lo bit C verses Assembly] by [[Dann Corbit]], [[CCC]], July 17, 2003
 
* [https://www.stmintz.com/ccc/index.php?id=306882 curiosity killed the cat... hi/lo bit C verses Assembly] by [[Dann Corbit]], [[CCC]], July 17, 2003
 
* [https://www.stmintz.com/ccc/index.php?id=450730 mask of highest bit] by [[Andrew Shapira]], [[CCC]], September 21, 2005
 
* [https://www.stmintz.com/ccc/index.php?id=450730 mask of highest bit] by [[Andrew Shapira]], [[CCC]], September 21, 2005
 +
==2010 ...==
 
* [http://www.talkchess.com/forum/viewtopic.php?t=47710 How to Shift Left (by) a Negative Number??] by [[Steve Maughan]], [[CCC]], April 05, 2013
 
* [http://www.talkchess.com/forum/viewtopic.php?t=47710 How to Shift Left (by) a Negative Number??] by [[Steve Maughan]], [[CCC]], April 05, 2013
* [http://www.open-chess.org/viewtopic.php?f=5&t=2878 To shift or not to shift] by thevinenator, [[Computer Chess Forums|OpenChess Forum]], September 09, 2015
+
* [http://www.open-chess.org/viewtopic.php?f=5&t=2878 To shift or not to shift] by thevinenator, [[Computer Chess Forums|OpenChess Forum]], September 09, 2015 » [[Space-Time Tradeoff]]
 +
* [https://groups.google.com/d/msg/fishcooking/r_7TSUtNXls/PUM5JgcSFQAJ Question about resetting a bit in a bitboard corresponding to a given square] by guenther, [[Computer Chess Forums|FishCooking]], September 09, 2016 » [[#ResetBit|Reset Bit]]
 +
* [https://groups.google.com/forum/#!topic/fishcooking/TplkvO62I9U On the speed of SquareBB array] by protonspring, [[Computer Chess Forums|FishCooking]], March 22, 2019
 +
==2020 ...==
 +
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75818 C++20 standard bit operations] by [[Jon Dart]], [[CCC]], November 15, 2020 » [[Population Count]], [[BitScan]], [[Cpp|C++]]
  
 
=External Links=  
 
=External Links=  
Line 1,627: Line 1,627:
 
* [https://en.wikipedia.org/wiki/Linear_congruence_theorem Linear congruence theorem from Wikipedia]
 
* [https://en.wikipedia.org/wiki/Linear_congruence_theorem Linear congruence theorem from Wikipedia]
 
==Misc==
 
==Misc==
* [[Videos#Casiopea|Casiopea]] - Conjunction, [https://en.wikipedia.org/wiki/Casiopea_Perfect_Live_II Perfect Live] (1986), [https://en.wikipedia.org/wiki/YouTube YouTube] Video
+
* [[:Category:Casiopea|Casiopea]] - Conjunction, [https://en.wikipedia.org/wiki/Casiopea_Perfect_Live_II Perfect Live] (1986), [https://en.wikipedia.org/wiki/YouTube YouTube] Video
 
: {{#evu:https://www.youtube.com/watch?v=zdU2KCDYROU|alignment=left|valignment=top}}
 
: {{#evu:https://www.youtube.com/watch?v=zdU2KCDYROU|alignment=left|valignment=top}}
* [[Videos#HuxFlux|Hux Flux]] - Bitshifter, [https://en.wikipedia.org/wiki/Division_by_Zero_%28album%29 Division by Zero], [https://en.wikipedia.org/wiki/YouTube YouTube] Video
+
* [[:Category:Hux Flux|Hux Flux]] - Bitshifter, [https://en.wikipedia.org/wiki/Division_by_Zero_%28album%29 Division by Zero], [https://en.wikipedia.org/wiki/YouTube YouTube] Video
 
: {{#evu:https://www.youtube.com/watch?v=ONhwqC6_NY0|alignment=left|valignment=top}}
 
: {{#evu:https://www.youtube.com/watch?v=ONhwqC6_NY0|alignment=left|valignment=top}}
  
Line 1,636: Line 1,636:
  
 
'''[[Bitboards|Up one Level]]'''
 
'''[[Bitboards|Up one Level]]'''
 +
[[Category:Wassily Kandinsky]]
 +
[[Category:Casiopea]]
 +
[[Category:Tetsuo Sakurai]]
 +
[[Category:Hux Flux]]

Latest revision as of 00:52, 18 January 2022

Home * Board Representation * Bitboards * General Setwise Operations

Wassily Kandinsky - Upward, 1929 [1]

General Setwise Operations,
binary and unary operations, essential in testing and manipulating bitboards within a chess program. Relational operators on bitboards test for equality, bitwise boolean operators perform the intrinsic setwise operations [2] [3], such as intersection, union and complement. Shifting bitboards simulates piece movement, while finally arithmetical operations are used in bit-twiddling applications and to calculate various hash-indicies.

Operators are denoted with focus on the C, C++, Java and Pascal programming languages, as well as the mnemonics of x86 or x86-64 Assembly language instructions including bit-manipulation (BMI1, BMI2, TBM) and SIMD expansions (MMX, SSE2, AVX, AVX2, AVX-512, XOP), Mathematical symbols, some Venn diagrams [4], Truth tables, and bitboard diagrams where appropriate.

Relational

Relational operators on bitboards are the test for equality whether they are the same or not. Greater or less in the arithmetical sense is usually not relevant with bitboards [5] - instead we often compare bit for bit of two bitboards by certain bitwise boolean operations to retrieve bitwise greater, less or equal results.

Equality

In C, C++ or Java "==" is used, to test for equality, "!=" for not equal. Pascal uses "=", "<>" and has ":=" to distinguish relational equal operators from assignment.

if (a == b) -> both sets are equal
if (a != b) -> both sets are not equal

x86-mnemonics
x86 has a cmp-instruction, which internally performs a subtraction to set its internal processor flags (carry, zero, overflow) accordantly, for instance the zero-flag if both sets are equal. Those flags are then used by conditional jump or move instructions.

cmp  rax, rbx ; rax == rbx
je   equal    ; (jz) conditional jump if equal (jne, jnz for not equal)

Empty and Universe

Two important sets are:

  • The empty set is represented by all bits zero.
  • The universal set contains all elements by setting all bits to binary one.

The numerical values and setwise representations of those sets:

empty set E       = 0
 set-wise         = {}

universal set U   = 2^64 - 1
 signed decimal   = -1
 hexadecimal      = 0xffffffffffffffff
 unsigned decimal = 18,446,744,073,709,551,615
 set-wise         = {a1, b1, c1, d1, ....., e8, f8, g8, h8}

as bitboard diagrams and Venn diagrams

Empty
                
      Empty                 Universe
 . . . . . . . .        1 1 1 1 1 1 1 1
 . . . . . . . .        1 1 1 1 1 1 1 1
 . . . . . . . .        1 1 1 1 1 1 1 1
 . . . . . . . .        1 1 1 1 1 1 1 1
 . . . . . . . .        1 1 1 1 1 1 1 1
 . . . . . . . .        1 1 1 1 1 1 1 1
 . . . . . . . .        1 1 1 1 1 1 1 1
 . . . . . . . .        1 1 1 1 1 1 1 1
Universe

Programmers often wonder to use -1 in C, C++ as unsigned constant. See The Two's Complement - alternately one may use ~0 to define the universal set. Since in C or C++, decimal numbers without ULL suffix are treated as 32-bit integers, constants outside the integer range need some care concerning sign or zero extension. Const declarations or using the C64 Macro is recommended:

const U64 universe = 0xffffffffffffffffULL;

To test whether a set is empty or not, one may compare with zero or use the logical not operator '!' in C, C++ or Java:

if (a == 0) -> empty set
if (!a)     -> empty set
if (a != 0) -> set is not empty
if (a)      -> set is not empty

To test for the universal set is less likely:

if (a == universe) -> universal set
if (a + 1 == 0)    -> universal set

Bitwise Boolean

Boolean algebra is an algebraic structure [6] [7] that captures essential properties of both set operations and logical operations. The properties of associativity, commutativity, and absorption, which define an ordered lattice, in conjunction with distributive and complement laws define the Algebra of sets is in fact a Boolean algebra.

Specifically, Boolean algebra deals with the set operations of intersection, union and complement, their equivalents of conjunction, disjunction and negation and their bitwise boolean operations of AND, OR and NOT to implement combinatorial logic in software. Bitwise boolean operations on 64-bit words are in fact 64 parallel operations on each bit performing one setwise operation without any "side-effects". Square mapping don't cares as long all sets use the same.

Intersection

Intersection

In set theory intersection is denoted as:

A ∩ B

In boolean algebra conjunction is denoted as:

a ∧ b

Bitboard intersection or conjunction is performed by bitwise and (binary operator & in C, C++ or Java, and the keyword "AND" in Pascal).

intersection = a & b

Truth Table
Truth table of and for one bit, for a '1' result both inputs need to be '1':

a b a and b
0 0 0
0 1 0
1 0 0
1 1 1

Conjunction acts like a bitwise minimum, min(a, b) or as bitwise multiplication (a * b).

x86-mnemonics
x86 has general purpose instruction as well as SIMD-instructions for bitwise and:

and   rax,  rbx        ; rax &= rbx
test  rax,  rbx        ; to determine whether the intersection is empty
pand  mm0,  mm1        ; MMX   mm0 &= mm1
pand  xmm0, xmm1       ; SSE2 xmm0 &= xmm1
vpand xmm0, xmm1, xmm2 ; AVX  xmm0 = xmm1 & xmm2
vpand ymm0, ymm1, ymm2 ; AVX2 ymm0 = ymm1 & ymm2

SSE2-intrinsic _mm_and_si128
AVX2-intrinsic _mm256_and_si256
AVX-512 has VPTERNLOG

Idempotent
Conjunction is idempotent.

a & a == a

Commutative
Conjunction is commutative

a & b == b & a

Associative
Conjunction is associative.

(a & b) & c == a & (b & c)

Subset
The intersection of two sets is subset of both.

Assume we have a attack set of a queen, and like to know whether the queen attacks opponent pieces it may capture, we need to 'and' the queen-attacks with the set of opponent pieces.

queen attacks    &  opponent pieces  =  attacked pieces
. . . . . . . .     1 . . 1 1 . . 1     . . . . . . . .
. . . 1 . . 1 .     1 . 1 1 1 1 1 .     . . . 1 . . 1 .
. 1 . 1 . 1 . .     . 1 . . . . . 1     . 1 . . . . . .
. . 1 1 1 . . .     . . . . . . . .     . . . . . . . .
1 1 1 * 1 1 1 .  &  . . . * . . 1 .  =  . . . * . . 1 .
. . 1 1 1 . . .     . . . . . . . .     . . . . . . . .
. . . 1 . 1 . .     . . . . . . . .     . . . . . . . .
. . . 1 . . . .     . . . . . . . .     . . . . . . . .

To prove whether set 'a' is subset of another set 'b', we compare whether the intersection equals the subset:

bool isASubsetOfB(U64 a, U64 b) {return (a & b) == a;}

Disjoint Sets
To test whether two sets are disjoint - that is their intersection is empty - compiler emit the x86 test-instruction instead of and. That saves the content of a register, if the intersection is not otherwise needed:

if ( (a & b) == 0 ) -> a and b are disjoint sets

In chess the bitboards of white and black pieces are obviously always disjoint, same for sets of different piece-types, such as knights or pawns. Of course this is because one square is occupied by one piece only.

Union

Union

In set theory union is denoted as:

A ∪ B

In boolean algebra disjunction is denoted as:

a ∨ b

The union or disjunction of two bitboards is applied by bitwise or (binary operator | in C, C++ or Java, or the keyword "OR" in Pascal). The union is superset of the intersection, while the intersection is subset of the union.

union = a | b

Truth Table
Truth table of or for one bit, one set input bits is sufficient to set the output:

a b a or b
0 0 0
0 1 1
1 0 1
1 1 1

Disjunction acts like bitwise maximum, max(a, b) or as addition with saturation, min(a + b, 1). It can also be interpreted as sum minus product, a + b - a*b, with possible temporary overflow of one binary digit to two - or with modulo 2 arithmetic.

x86-mnemonics
x86 has general purpose instruction as well as SIMD-instructions for bitwise or:

or   rax,  rbx        ;       rax |= rbx
por  mm0,  mm1        ; MMX   mm0 |= mm1
por  xmm0, xmm1       ; SSE2 xmm0 |= xmm1
vpor xmm0, xmm1, xmm2 ; AVX  xmm0  = xmm1 | xmm2
vpor ymm0, ymm1, ymm2 ; AVX2 ymm0  = ymm1 | ymm2

SSE2-intrinsic _mm_or_si128
AVX2-intrinsic _mm256_or_si256
AVX-512 has VPTERNLOG

Idempotent
Disjunction is idempotent.

a | a == a

Commutative
Disjunction is commutative

a | b == b | a

Associative
Disjunction is associative.

(a | b) | c == a | (b | c)

Distributive
Disjunction is distributive over conjunction and vice versa:

x | (y & z) == (x | y) & (x | z)
x & (y | z) == (x & y) | (x & z)

Superset
The union of two sets is superset of both. For instance the union of all white and black pieces are the set of all occupied squares:

white pieces     |  black pieces     =  occupied squares
. . . . . . . .     1 . 1 1 1 1 1 1     1 . 1 1 1 1 1 1
. . . . . . . .     1 1 1 1 . 1 1 1     1 1 1 1 . 1 1 1
. . . . . . . .     . . 1 . . . . .     . . 1 . . . . .
. . . . . . . .     . . . . 1 . . .     . . . . 1 . . .
. . . . 1 . . .  |  . . . . . . . .  =  . . . . 1 . . .
. . . . . 1 . .     . . . . . . . .     . . . . . 1 . .
1 1 1 1 . 1 1 1     . . . . . . . .     1 1 1 1 . 1 1 1
1 1 1 1 1 1 . 1     . . . . . . . .     1 1 1 1 1 1 . 1

Since white and black pieces are always disjoint, one may use addition here as well. That fails for union of attack sets, since squares may be attacked or defended by multiple pieces of course.

Complement Set

Complement

In set theory complement set is denoted as:

A

In boolean algebra negation is denoted as:

¬a

The complement set (absolute complement set), negation or ones' complement has it's equivalent in bitwise not (unary operator '~' in C, C++ or Java, or the keyword "NOT" in Pascal).

Truth Table
Truth table of not for one bit:

a not a
0 1
1 0

The complement can be interpreted as bitwise subtraction (1 - a).

x86-mnemonics
Available as general purpose instruction.

not  rax ; rax = ~rax

AVX-512 has VPTERNLOG

Empty Squares
The set of empty squares for instance is the complement-set of all occupied squares and vice versa:

~occupied squares  =   empty squares
  1 . 1 1 1 1 1 1      . 1 . . . . . .
  1 1 1 1 . 1 1 1      . . . . 1 . . .
  . . 1 . . . . .      1 1 . 1 1 1 1 1
  . . . . 1 . . .      1 1 1 1 . 1 1 1
~ . . . . 1 . . .  =   1 1 1 1 . 1 1 1
  . . . . . 1 . .      1 1 1 1 1 . 1 1
  1 1 1 1 . 1 1 1      . . . . 1 . . .
  1 1 1 1 1 1 . 1      . . . . . . 1 .

Don't confuse bitwise not with logical not-operator '!' in C:

!0 == 1
!(anything != 0) == 0
!1  == 0
!-1 == 0

Complement laws

  • The union of a set with it's complement is the universal set -1.
  • The intersection of a set with it's complement is the empty set 0 - both are disjoint.
  • Empty set and universal set are complement sets.
a  | ~a == -1
a  & ~a ==  0
~0      == -1
~(-1)   ==  0

De Morgan's laws

~(a | b) == ~a & ~b
~(a & b) == ~a | ~b

For instance to get the set of empty squares, we can complement the union of white and black pieces. Or we can intersect the complements of white and black pieces.

Relative Complement

Relative Complement

In set theory relative complement is denoted as:

A ∩ B = B \ A

The relative complement is the absolute complement restricted to some other set. The relative complement of 'a' inside 'b' is also known as the set theoretic difference of 'b' minus 'a'. It is the set of all elements that belong to 'b' but not to 'a'. Also called 'b' without 'a'. It is the intersection of 'b' with the absolute complement of 'a'.

not_a_in_b  = ~a &  b
b_without_a =  b & ~a

Truth Table
Truth table of relative complement for one bit:

a b b andnot a
0 0 0
0 1 1
1 0 0
1 1 0

The relative complement of 'a' in 'b' may be interpreted as a bitwise (a < b) relation.

x86-mnemonics
x86 don't has an own general purpose instruction for relative complement, but x86-64 expansion BMI1, and SIMD-instructions:

andn   rax,  rbx,  rcx  ; BMI1  rax = ~rbx & rcx
pandn  mm0,  mm1        ; MMX   mm0 = ~mm0 & mm1
pandn  xmm0, xmm1       ; SSE2 xmm0 = ~xmm0 & xmm1
vpandn xmm0, xmm1, xmm2 ; AVX  xmm0 = ~xmm1 & xmm2
vpandn ymm0, ymm1, ymm2 ; AVX  xmm0 = ~xmm1 & xmm2

SSE2-intrinsic _mm_andnot_si128
AVX2-intrinsic _mm256_andnot_si256
AVX-512 has VPTERNLOG

Super minus Sub
In presumption of subtraction or exclusive or there are alternatives to calculate the relative complement - superset minus subset. We can take either the union without the complementing set - or the other set without the intersection

~a & b == ( a | b ) - a
~a & b == b - ( a & b )

Implication

Implication

Logical Implication or Entailment is denoted as:

A ⇒ B

The boolean Material conditional is denoted as:

a → b

Logical Implication or the boolean Material conditional 'a' implies 'b' (if 'a' then 'b') is an derived boolean operation, implemented as union of the absolute complement of 'a' with 'b':

a_implies_b ==  ~a | b

Truth Table Truth table of logical implication for one bit:

a b a implies b
0 0 1
0 1 1
1 0 0
1 1 1

Implication may be interpreted as a bitwise (a <= b) relation.

x86-mnemonics
AVX-512 has VPTERNLOG

Exclusive Or

Exclusive Or

In set theory symmetric difference is denoted as:

A ∆ B

In boolean algebra Exclusive or is denoted as:

a ⊕ b

Exclusive or, also exclusive disjunction (xor, binary operator '^' in C, C++ or Java, or the keyword "XOR" in Pascal),

xor = a ^ b

also called symmetric difference, leaves all elements which are exclusively set in one of the two sets. Xor is really a multi purpose operation with a lot of applications not only bitboards of course.

1 . . . . . . 1     . . . . . . . .     1 . . . . . . 1
. 1 . . . . 1 .     . . . . . . . .     . 1 . . . . 1 .
. . 1 . . 1 . .     . . 1 1 1 1 . .     . . . 1 1 . . .
. . . 1 1 . . .     . . 1 1 1 1 . .     . . 1 . . 1 . .
. . . 1 1 . . .  ^  . . 1 1 1 1 . .  =  . . 1 . . 1 . .
. . 1 . . 1 . .     . . 1 1 1 1 . .     . . . 1 1 . . .
. 1 . . . . 1 .     . . . . . . . .     . 1 . . . . 1 .
1 . . . . . . 1     . . . . . . . .     1 . . . . . . 1

Truth Table
Truth table of exclusive or for one bit:

a b a xor b
0 0 0
0 1 1
1 0 1
1 1 0

Xor implements a bitwise (a != b) relation. It acts like a bitwise addition (modulo 2), since (1 + 1) mod 2 = 0. It also acts like a bitwise subtraction (modulo 2).

x86-mnemonics
x86 has general purpose instruction as well as SIMD-instructions for bitwise exclusive or:

xor   rax,  rbx        ;       rax ^= rbx
pxor  mm0,  mm1        ; MMX   mm0 ^= mm1
pxor  xmm0, xmm1       ; SSE2 xmm0 ^= xmm1
vpxor xmm0, xmm1, xmm2 ; AVX  xmm0  = xmm1 ^ xmm2
vpxor ymm0, ymm1, ymm2 ; AVX2 ymm0  = ymm1 ^ ymm2

SSE2-intrinsic _mm_xor_si128
AVX2-intrinsic _mm256_xor_si256
AVX-512 has VPTERNLOG

Commutative
Exclusive disjunction is commutative

a ^ b == b ^ a

Associative
Xor is associative as well.

(a ^ b) ^ c == a ^ (b ^ c)

Distributive
Conjunction is distributive over exclusive disjunction - but not vice versa, since conjunction acts like multiplication, while xor acts as addition in the Galois field GF(2) :

x & (y ^ z) == (x & y) ^ (x & z)

Own Inverse
If applied two (even) times with the same operand, xor restores the original result. It is own inverse or an involution .

Subset
If one operand is subset of the other, xor (or subtraction) implements the relative complement.

super               sub                 super &~ sub
. . . . . . . .     . . . . . . . .     . . . . . . . .
. 1 1 1 1 1 1 .     . . . . . . . .     . 1 1 1 1 1 1 .
. 1 1 1 1 1 1 .     . . 1 1 1 1 . .     . 1 . . . . 1 .
. 1 1 1 1 1 1 .  ^  . . 1 1 1 1 . .     . 1 . . . . 1 .
. 1 1 1 1 1 1 .     . . 1 1 1 1 . .  =  . 1 . . . . 1 .
. 1 1 1 1 1 1 .  -  . . 1 1 1 1 . .     . 1 . . . . 1 .
. 1 1 1 1 1 1 .     . . . . . . . .     . 1 1 1 1 1 1 .
. . . . . . . .     . . . . . . . .     . . . . . . . .

Subtraction
While commutative, xor is a better replacement for subtracting from power of two minus one values, such as 63.

(2n - 1) - a == a ^ (2n - 1) with a subset of 2n - 1

This is because it usually safes one x86 load instruction and an additional register, but uses opcodes with immediate operands - for instance:

 1 - a == a ^  1
 3 - a == a ^  3
 7 - a == a ^  7
15 - a == a ^ 15
31 - a == a ^ 31
63 - a == a ^ 63
...
-1 - a == a ^ -1

Or without And
Xor is the same as a union without the intersection - all the bits different, 0,1 or 1,0. Since the intersection is subset of the union, xor or subtraction can replace the "without" operation & ~:

a ^ b == (a | b) &~(a & b)
a ^ b == (a | b) ^ (a & b)
a ^ b == (a | b) - (a & b)

Disjoint Sets
The symmetric difference of disjoint sets is equal to the union or arithmetical addition. Since intersection and symmetric difference are disjoint, the union might defined that way:

a | b = ( a & b ) ^ ( a ^ b )
a | b = ( a & b ) ^   a ^ b
a | b = ( a & b ) | ( a ^ b )
a | b = ( a & b ) + ( a ^ b )

Assume we have distinct attack sets of pawns in left or right direction. The set of all squares attacked by two pawns is the intersection, the set exclusively attacked by one pawn (either right or left) is the xor-sum, while all squares attacked by any pawn is the union, see pawn attacks.

Union of Complements
The symmetric difference is equivalent to the union of both relative complements. Since both relative complements are disjoint, bitwise or or add can replaced by xor itself:

a ^ b == (a & ~b) | (b & ~a)
a ^ b == (a & ~b) ^ (b & ~a)
a ^ b == (a & ~b) + (b & ~a)

Toggle
Xor can be used to toggle or flip bits by a mask.

x ^= mask;

Complement
xor with the universal set -1 flips each bit and results in the ones' complement.

a ^ -1 == ~a

Without
Due to distributive law and since symmetric difference of set and subset is the relative complement of subset in set, there are some equivalent ways to calculate the relative complement by xor. Based on surrounding expressions or whether subexpressions such as union, intersection or symmetric difference may be reused one may prefer the one or other alternative.

a & ~b == a & (-1 ^ b )
a & ~b == a & ( a ^ b )
a & ~b == a ^ ( a & b ) == a - ( a & b )
a & ~b == b ^ ( a | b ) == ( a | b ) - b

Also note that

a & a == a & -1

Clear
Since 'a' xor 'a' is zero, it is the shorter opcode to clear a register, since it takes no immediate operand. Applied by optimizing compilers. Same is true for subtraction by the way.

xor  rax, rax   ; same as mov rax, 0
pxor mm0, mm0   ; MMX 64-bit register
pxor xmm0, xmm0 ; SSE2 - 128-bit xmm-register

Xor Swap
Three xors on the same registers swap their content: (Note: this only works when a and b are stored on distinct memory adresses!)

a ^= b
b ^= a
a ^= b

If we provide an intersection by a mask, ...

a = (a ^ b) & mask
b ^= a
a ^= b

... 'a' becomes 'b', but only a part of 'b', where mask is one, becomes 'a'. Bits from two Sources
Getting arbitrary, disjoint bits from two sources by a mask:

// if mask-bit is zero, bit from a, otherwise from b - since a^(a^b) == b
U64 mask = C64(0xFFFF0000FFFF0000);
U64 result = a ^ ((a ^ b) & mask);

This takes one instruction less, than the union of relative complement of the mask in 'a' with intersection of mask with 'b'.

    a ^    ((a ^ b) & mask)
== (a & ~mask) | (b & mask)
== (a & ~mask) ^ (b & mask) because both sets of the union are disjoint
== (a & ~mask) + (b & mask) because both sets of the union are disjoint

XOR-applications and affairs

Equivalence

Equivalence

If and only if (Iff) is denoted as:

A ⇔ B

Logical equivalence is denoted as:

a ↔ b

Logical equality, logical equivalence or biconditional (if and only if, XNOR ) is the complement of xor.

a_equal_b == ~(a ^ b)
a_equal_b ==  (a & b) | (~a & ~b)
a_equal_b ==  (a & b) | ~(a | b)

Truth Table
Truth table of equivalence or for one bit:

a b a ↔ b
0 0 1
0 1 0
1 0 0
1 1 1

Equivalence implements a bitwise (a == b) relation.

x86-mnemonics
AVX-512 has VPTERNLOG

Majority

The majority function or median operator is a function from n inputs to one output. The value of the operation is false when n/2 or fewer arguments are false, and true otherwise. For two inputs it is the intersection. Three inputs require some more computation:

Truth Table
Truth table of majority for three inputs:

a b c maj(a,b,c) 0 0 0 0
0 0 1 0
0 1 0 0
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 1
major(a,b,c) = (a & b) |  (a & c) | (b & c);
major(a,b,c) = (a & b) | ((a ^ b ) & c);

See the application of cardinality of multiple sets for more than three inputs.

x86-mnemonics
AVX-512 VPTERNLOG imm8 = 0xe8 implements the majority function.

Greater One Sets

Greater One is a function from n inputs to one output. The value of the operation is true if more than one argument is true, false otherwise. Obviously, for two inputs it is the intersection, for three inputs it is the majority function. For more inputs it is the union of all distinct pairwise intersections, which can be expressed with setwise operators that way:

i>j∈I(Ai ∩ Aj)

With four bitboards this is equivalent to:

  (a1 & a0)

| (a2 & a1)
| (a2 & a0)

| (a3 & a2)
| (a3 & a1)
| (a3 & a0)

with

n * (n - 1) - 1

operations - that is 11 for n == 4.

O(n^2) to O(n)
Due to distibutive law one can factor out common sets ...

  (a1 & (      a0))
| (a2 & (   a1|a0))
| (a3 & (a2|a1|a0))

... with further reductions of the number of operations, also due to aggregation of the inner or-terms. Three additional operations for an increment of n, thus the former quadratic increase becomes linear.

In general, as mentioned,

i>j∈I(Ai ∩ Aj)

requires

n * (n - 1) - 1

operations, which can be reduced to

3 * (n - 1) - 2

operations.

This O(n^2) to O(n) simplification is helpful to determine for instance knight fork target squares from eight distinct knight-wise direction attack sets of potential targets, like king, queen, rooks and hanging bishops or even pawns - or any other form of at least double attacks from n attack bitboards:

U64 attack[n]; // 0..n-1
U64 atLeastDouble = 0;
U64 atLeastSingle = a[0];
for (i=1; i < n; i++) {
  atLeastDouble |= attack[i] & atLeastSingle;
  atLeastSingle |= attack[i];
}

Well, if you need additionally at least triple attacks, you'll get the idea how this would work as well, see also Odd and Major Digit Counts from the Population Count page.

Shifting Bitboards

In the 8*8 board centric world with one scalar square-coordinate 0..63, each of the max eight neighboring squares can be determined by adding an offset for each direction. For border squares one has to care about overflows and wraps from a-file to h-file or vice versa. Some conditional code is needed to avoid that. Such code is usually part of move generation for particular pieces.

  northwest    north   northeast
  noWe         nort         noEa
          +7    +8    +9
              \  |  /
  west    -1 <-  0 -> +1    east
              /  |  \
          -9    -8    -7
  soWe         sout         soEa
  southwest    south   southeast
Cpwmappinghint.JPG
Code samples and bitboard diagrams rely on Little endian file and rank mapping.

In the setwise world of bitboards, where a square as member of a set is determined by an appropriate one-bit 2^square, the operation to apply such movements is shifting . Unfortunately most architectures don't support a "generalized" shift by signed values but only shift left or shift right. That makes bitboard code less general as one has usually separate code for each direction or at least for the positive and negative directions.

  • Shift left (<<) is arithmetically a multiplication by power of two.
  • Shift right (>> or >>> in Java [10]) is arithmetically a division by power of two.

Since the square-index is encoded as power of two exponent inside a bitboard, the power of two multiplication or division is adding or subtracting the square-index.

The reason the bitboard type-definintion is unsigned in C, C++ is to avoid so called arithmetical shift right in opposition to logical shift right . Arithmetical shift right implies filling one-bits in from MSB-direction if the operand is negative and has MSB bit 63 set. Logical shift right always shifts in zeros - that is what we need. Java has no unsigned types, but a special unsigned shift right operator >>>.

x86-mnemonics
x86 has general purpose instructions, BMI2 general purpose instructions not affecting processor flags, as well as SIMD-instructions for various shifts:

shr      rax,  cl         ;       rax >>= cl
shl      rax,  cl         ;       rax <<= cl
shrx     r64a, r/m64, r64b; BMI2  r64a = r/m64 >> r64b
shlx     r64a, r/m64, r64b; BMI2  r64a = r/m64 << r64b 
psrlq    mm0,  mm1        ; MMX   mm0 >>= mm1
psllq    mm0,  mm1        ; MMX   mm0 <<= mm1
psrlq    xmm0, xmm1       ; SSE2 xmm0 >>= xmm1
psllq    xmm0, xmm1       ; SSE2 xmm0 <<= xmm1
vpsrlvq  ymm0, ymm1, ymm2 ; AVX2 ymm0   = ymm1 >> ymm2 ; Individual shifts
vpsllvq  ymm0, ymm1, ymm2 ; AVX2 ymm0   = ymm1 << ymm2 ; Individual shifts

SSE2-intrinsics with variable register or constant immediate shift amounts, working on vectors of two bitboards:

AVX2 has individual shifts for each of four bitboards:

One Step Only

The advantage with bitboards is, that the shift applies to all set bits in parallel, e.g. with all pawns. Vertical shifts by +-8 don't need any under- or overflow conditions since bits simply fall out and disappear.

U64 soutOne (U64 b) {return  b >> 8;}
U64 nortOne (U64 b) {return  b << 8;}

Wraps from a-file to h-file or vice versa may be considered by only shifting subsets which may not wrap. Thus we can mask off the a- or h-file before or after a +-1,7,9 shift:

const U64 notAFile = 0xfefefefefefefefe; // ~0x0101010101010101
const U64 notHFile = 0x7f7f7f7f7f7f7f7f; // ~0x8080808080808080

Post-shift masks, ...

U64 eastOne (U64 b) {return (b << 1) & notAFile;}
U64 noEaOne (U64 b) {return (b << 9) & notAFile;}
U64 soEaOne (U64 b) {return (b >> 7) & notAFile;}
U64 westOne (U64 b) {return (b >> 1) & notHFile;}
U64 soWeOne (U64 b) {return (b >> 9) & notHFile;}
U64 noWeOne (U64 b) {return (b << 7) & notHFile;}

... and pre-shift, with the mirrored file masks.

U64 eastOne (U64 b) {return (b & notHFile) << 1;}
U64 noEaOne (U64 b) {return (b & notHFile) << 9;}
U64 soEaOne (U64 b) {return (b & notHFile) >> 7;}
U64 westOne (U64 b) {return (b & notAFile) >> 1;}
U64 soWeOne (U64 b) {return (b & notAFile) >> 9;}
U64 noWeOne (U64 b) {return (b & notAFile) << 7;}

SSE2 one step only provides some optimizations according to the wraps on vectors of two bitboards.

Main application of shifts is to get attack sets or move-target sets of appropriate pieces, eg. one step for pawns and king. Applying one step multiple times may used to generate attack sets and moves of pieces like knights and sliding pieces.

For instance all push-targets of white pawns can be determined with one shift left plus intersection with empty squares.

whiteSinglePawnPushTargets = nortOne(whitePawns) & emptySquares;

Square-Mapping is crucial while shifting bitboards. Shifting left inside a computer word may mean shifting right on the board with little-endian file-mapping as used in most sample code here.

Rotate

For the sake of completeness - Rotate is similar to shift but wraps bits around. Rotate does not alter the number of set bits. With x86-64 like shift operand s modulo 64, each bit index i, in the 0 to 63 range, is transposed by

rotateLeft ::=  i := (i + s) mod 64
rotateRight::=  i := (i - s) mod 64

Additionally, following relations hold:

rotateLeft (s) == rotateRight(64-s)
rotateRight(s) == rotateLeft (64-s)

Most processors have rotate instructions, but are not supported by standard programming languages like C or Java. Some compilers provide intrinsic, processor specific functions.

U64 rotateLeft (U64 x, int s) {return _rotl64(x, s);}
U64 rotateRight(U64 x, int s) {return _rotr64(x, s);}

x86-mnemonics

rol  rax, cl
ror  rax, cl

Rotate by Shift
Otherwise rotate has to be emulated by shifts, with some chance optimizing compiler will emit exactly one rotate instruction.

U64 rotateLeft (U64 x, int s) {return (x << s) | (x >> (64-s));}
U64 rotateRight(U64 x, int s) {return (x >> s) | (x << (64-s));}

Since x86-64 64-bit shifts are implicitly modulo 64 (and 63), one may replace (64-s) by -s.

Generalized Shift

shifts left for positive amounts, but right for negative amounts.

U64 genShift(U64 x, int s) {
   return (s > 0) ? (x << s) : (x >> -s);
}

If compiler are not able to produce speculative execution of both shifts with a conditional move instruction, one may try an explicit branch-less solution:

/**
 * generalized shift
 * @author Gerd Isenberg
 * @param x any bitboard
 * @param s shift amount -64 < s < +64
 *          left if positive
 *          right if negative
 * @return shifted bitboard
 */
U64 genShift(U64 x, int s) {
   char left  =   (char) s;
   char right = -((char)(s >> 8) & left);
   return (x >> right) << (right + left);
}

Due to the value range of the shift, one may save the arithmetical shift right in assembly:

 ; input
 ;     ecx - shift amount,
 ;           left if positive
 ;           right if negative
 ;     rax - bitboard to shift
 mov   dl,  cl
 and   cl,  ch
 neg   cl
 shr   rax, cl
 add   cl,  dl
 shl   rax, cl

One Step
x86-64 rot64 works like a generalized shift with positive or negative shift amount - since it internally applies an unsigned modulo 64 ( & 63) and makes -i = 64-i. We need to clear either the lower or upper bits by intersection with a mask, which might be combined with the wrap-ands for one step. It might be applied to get attacks for both sides with a direction parameter and small lookups for shift amount and wrap-ands - instead of multiple code for eight directions. Of course generalized shift will be a bit slower due to lookups and using cl as the shift amount register.

// positve left, negative right shifts
int shift[8] = {9, 1,-7,-8,-9,-1, 7, 8};

U64 avoidWrap[8] =
{
   0xfefefefefefefe00,
   0xfefefefefefefefe,
   0x00fefefefefefefe,
   0x00ffffffffffffff,
   0x007f7f7f7f7f7f7f,
   0x7f7f7f7f7f7f7f7f,
   0x7f7f7f7f7f7f7f00,
   0xffffffffffffff00,
};

U64 shiftOne (U64 b, int dir8) {
   return _rotl64(b, shift[dir8]) & avoidWrap[dir8];
}

The avoidWrap masks by some arbitrary dir8 enumeration and shift amount:

6 == noWe -> +7     7 == nort -> +8     0 == noEa -> +9
0x7F7F7F7F7F7F7F00  0xFFFFFFFFFFFFFF00  0xFEFEFEFEFEFEFE00
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
. . . . . . . .     . . . . . . . .     . . . . . . . .

5 == west -> -1                         1 == east -> +1
0x7F7F7F7F7F7F7F7F                      0xFEFEFEFEFEFEFEFE
1 1 1 1 1 1 1 .                         . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .                         . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .                         . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .                         . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .                         . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .                         . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .                         . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .                         . 1 1 1 1 1 1 1

4 == soWe -> -9     3 == sout -> -8     2 == soEa -> -7
0x007F7F7F7F7F7F7F  0x00FFFFFFFFFFFFFF  0x00FEFEFEFEFEFEFE
. . . . . . . .     . . . . . . . .     . . . . . . . .
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1
1 1 1 1 1 1 1 .     1 1 1 1 1 1 1 1     . 1 1 1 1 1 1 1

See also

Bit by Square

Since single populated bitboards are always power of two values, shifting 2^0 left implements pow2(square) to convert square-indices to a member of a bitboard.

U64 singleBitset = C64(1) << square; // or lookup[square]

The inverse function square = log2(x), is topic of bitscan and bitboard serialization.

Shift versus Lookup
While 1 << square sounds cheap, it is rather expensive in 32-bit mode - and therefor often precalculated in a small lookup-table of 64-single bit bitboards. Also, on x86-64-processors a variable shift is restricted to the byte-register cl. Thus, two or more variable shifts are constrained by sequential execution [11].

Test
Test a bit of a square-index by intersection-operator 'and'.

if (x & singleBitset) -> bit is set;

Set
Set a bit of a square-index by union-operator 'or'.

x |=  singleBitset; // set bit

Toggle
Toggle a bit of square-index by xor.

x ^=  singleBitset; // toggle bit

Reset
Reset a bit of square-index by relative complement of the single bit,

x &= ~singleBitset; // reset bit

or conditional toggle by single bit intersection

x ^=  singleBitset & x; // reset bit

Set and toggle (or, xor) might the faster way to reset a bit inside a register (not, and).

x |=  singleBitset; // set bit
x ^=  singleBitset; // resets set bit

If singleBitset needs to preserved, an extra register is needed for the complement.

x86-Instructions
x86 processor provides a bit-test instruction family (bt, bts, btr, btc) with 32- and 64-bit operands. They may be used implicitly by compiler optimization or explicitly by inline assembler or compiler intrinsics. Take care that they are applied on local variables likely registers rather than memory references [12]:

Update by Move

This technique to toggle bits by square is likely used to initialize or update the bitboard board-definition. While making or unmaking moves, the single bit either correspondents with the from- or to-square of the move. Which particular bitboard has to be updated depends on the moving piece or captured piece.

For simplicity we assume piece plus color and captured piece are member or method of a move-structure/class.

Quiet moves toggle both from- and to-squares of the piece-bitboard, as well for the redundant union-sets:

U64 fromBB   = C64(1) << move->from;
U64 toBB     = C64(1) << move->to;
U64 fromToBB = fromBB ^ toBB; // |+
pieceBB[move->piece]  ^=  fromToBB;   // update piece bitboard
pieceBB[move->color]  ^=  fromToBB;   // update white or black color bitboard
occupiedBB            ^=  fromToBB;   // update occupied ...
emptyBB               ^=  fromToBB;   // ... and empty bitboard

Captures need to consider the captured piece of course:

U64 fromBB   = C64(1) << move->from;
U64 toBB     = C64(1) << move->to;
U64 fromToBB = fromBB ^ toBB; // |+
pieceBB[move->piece]  ^=  fromToBB;   // update piece bitboard
pieceBB[move->color]  ^=  fromToBB;   // update white or black color bitboard
pieceBB[move->cPiece] ^=  toBB;       // reset the captured piece
pieceBB[move->cColor] ^=  toBB;       // update color bitboard by captured piece
occupiedBB            ^=  fromBB;     // update occupied, only from becomes empty
emptyBB               ^=  fromBB;     // update empty bitboard

Similar for special moves like castling, promotions and en passant captures. Upper Squares
To get a set of all upper squares or bits, either shift ~1 or -2 left by square:

U64 upperBits =  C64(~1) << sq;

for instance d4 (27)

high = ~1 << d4
 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1
 . . . . 1 1 1 1
 . . . . . . . .
 . . . . . . . .
 . . . . . . . .

Lower Squares
Lower squares are simply Bit by Square minus one.

U64 lowerBits = (C64(1 ) << sq) - 1);

for instance d4 (27)

low = (1<<d4)-1
 . . . . . . . .
 . . . . . . . .
 . . . . . . . .
 . . . . . . . .
 1 1 1 . . . . .
 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1

Swapping Bits

Swapping none overlapping bit-sequences in a bitboard is the base of a lot of permutation tricks.

by Position
Suppose we like to swap n bits from two none overlapping bit locations of a bitboard. The trick is to set all n least significant bits by subtracting one from n power of 2. Both substrings are shifted to bit zero, exclusive ored and masked by the n ones. This sequence is then twice shifted back to their original places, while the union (xor-union due to disjoint bits) is finally exclusive ored with the original bitboard to swap both sequences.

/**
 * swap n none overlapping bits of bit-index i with j
 * @param b any bitboard
 * @param i,j positions of bit sequences to swap
 * @param n number of consecutive bits to swap
 * @return bitboard b with swapped bit-sequences
 */
U64 swapNBits(U64 b, int i, int j, int n) {
   U64     m = ( 1 << n) - 1;
   U64     x = ((b >> i) ^ (b >> j)) & m;
   return  b ^  (x << i) ^ (x << j);
}

For instance swap 6 bits each, from bit-index 9 (bits named ABCDEF, either 0,1) with bit-index 41 (abcdef):

b                                       m = (1<<6) - 1
. . . . . . . .                         . . . . . . . .
* . . . . . . .                         . . . . . . . .
*|a b c d e f|*                         . . . . . . . .
. . . . . . . .                         . . . . . . . .
. . . . . . . .                         . . . . . . . .
. . . . . . . .                         . . . . . . . .
*|A B C D E F|*                         . . . . . . . .
. . . . . . . .                         1 1 1 1 1 1 . .

b >> j           ^  b >> i          =>  x = .xor & m       with
. . . . . . . .     . . . . . . . .     . . . . . . . .
. . . . . . . .     . . . . . . . .     . . . . . . . .
. . . . . . . .     . . . . . . . .     . . . . . . . .    r = a ^ A
. . . . . . . .     a b c d e f * *     . . . . . . . .    s = b ^ B
. . . . . . . .  ^  . . . . . . . * =>  . . . . . . . .    t = c ^ C
. . . . . . . .     . . . . . . . .     . . . . . . . .    u = d ^ D
. . . . . . . .     . . . . . . . .     . . . . . . . .    v = e ^ E
a b c d e f * *     A B C D E F * .     r s t u v w . .    w = f ^ F

b               ^  x << i | x << j  => swapNBits(9,41,6)
. . . . . . . .    . . . . . . . .     . . . . . . . .
* . . . . . . .    . . . . . . . .     * . . . . . . .
*|a b c d e f|*    . r s t u v w .     *|A B C D E F|*
. . . . . . . .    . . . . . . . .     . . . . . . . .
. . . . . . . . ^  . . . . . . . .  => . . . . . . . .
. . . . . . . .    . . . . . . . .     . . . . . . . .
*|A B C D E F|*    . r s t u v w .     *|a b c d e f|*
. . . . . . . .    . . . . . . . .     . . . . . . . .

Delta Swap
To swap any none overlapping pairs we can shift by the difference (j-i, with j>i) and supply an explicit mask with a '1' on the least significant position for each pair supposed to be swapped.

/**
 * swap any none overlapping pairs of bits
 *   that are delta places apart
 * @param b any bitboard
 * @param mask has a 1 on the least significant position
 *             for each pair supposed to be swapped
 * @param delta of pairwise swapped bits
 * @return bitboard b with bits swapped
 */
U64 deltaSwap(U64 b, U64 mask, int delta) {
   U64 x = (b ^ (b >> delta)) & mask;
   return   x ^ (x << delta)  ^ b;
}

To apply the swapping of the swapNBits sample above, we call deltaSwap with delta of 32 and 0x7E00 as mask. But we may apply any arbitrary and often periodic mask pattern, as long as no overlapping occurs. The intersection of mask with (mask << delta) must therefor be empty. But we can also swap odd or even files of a bitboard by calling deltaSwap with delta of one, and mask of 0x5555555555555555:

1 . 1 . 1 . 1 .
1 . 1 . 1 . 1 .
1 . 1 . 1 . 1 .
1 . 1 . 1 . 1 .
1 . 1 . 1 . 1 .
1 . 1 . 1 . 1 .
1 . 1 . 1 . 1 .
1 . 1 . 1 . 1 .

Applications of delta swaps are flipping, mirroring and rotating. In Knuth's The Art of Computer Programming, Vol 4, page 13, bit permutation in general [13], he mentions 2^k delta swaps with k = {0,1,2,3,4,5,4,3,2,1,0} to obtain any arbitrary permutation. Special cases might be cheaper.

Arithmetic Operations

At the first glance, arithmetic operations, that is addition, subtraction, multiplication and division, doesn't make much sense with bitboards. Still, there are some bit-twiddling applications related to least significant one bit (LS1B), to enumerate all subsets of a set or sliding attack generation. Multiplication of certain pattern has some applications as well, most likely to calculate hash-indicies of masked occupancies.

Derived from Bitwise

Half Adder

Unlike bitwise boolean operations on 64-bit words, which are in fact 64 parallel operations on each bit without any interaction between them, arithmetic operations like addition need to propagate possible carries from lower to higher bits. Despite, Add and Sub are usually as fast their bitwise boolean counterparts, because they are implemented in Hardware within the ALU of the CPU. A so called half-adder to add two bits (A, B), requires an And-Gate for the carry (C) and a Xor-Gate for the sum (S):

two_bitsum = (bitA ^ bitB) | ((bitA & bitB) << 1);

To get an idea of the "complexity" of a simple addition, and how to implement an carry-lookahead adder in software with bitwise boolean and shift instructions only, and presumption on parallel prefix algorithms, this is how a 64-bit Kogge-Stone adder would look like in C:

U64 koggeStoneAdd(U64 a, U64 b) {
   U64 gen = a&b;  // carries
   U64 pro = a^b;  // sum
   gen |= pro & (gen << 1);
   pro  = pro & (pro << 1);
   gen |= pro & (gen << 2);
   pro  = pro & (pro << 2);
   gen |= pro & (gen << 4);
   pro  = pro & (pro << 4);
   gen |= pro & (gen << 8);
   pro  = pro & (pro << 8);
   gen |= pro & (gen <<16);
   pro  = pro & (pro <<16);
   gen |= pro & (gen <<32);
   return a^b ^ (gen << 1);
}

Addition

Addition might be used instead of bitwise 'xor' or 'or' for a union of disjoint (intersection zero) sets, which may yield to simplification of the surrounding expression or may take advantage of certain address calculation instruction such as x86 load effective address (lea).

The enriched algebra with arithmetical and bitwise-boolean operations becomes aware with following relation - the bitwise overflows are the intersection, otherwise the sum modulo two is the symmetric difference - thus the arithmetical sum is the xor-sum plus the carries shifted left one:

x + y = (x ^ y) + 2*(x & y)
x ^ y =  x + y  - 2*(x & y)

This is particular interesting in SWAR-arithmetic, or if we like to compute the average without possible temporary overflows:

(x + y) / 2 = ((x ^ y)>>1) + (x & y)

x86-mnemonics

add  rax, rbx ; rax += rbx
lea  rax, [rcx + rdx + const ] ; rax = rcx + rdx + const

Subtraction

Subtraction (like xor) might be used to implement the relative complement, of a subset inside it's superset. As mentioned, subtraction may be useful in calculating sliding attacks.

x86-mnemonics

sub  rax, rbx ; rax -= rbx

The Two's Complement

A lot of bit-twiddling tricks on bitboards to traverse or isolate subsets, rely on two's complement arithmetic. Most recent processors (and compiler or interpreter for these processors) use the two's complement to implement the unary minus operator for signed as well for unsigned integer types. In C it is guaranteed for unsigned integer types. Java guarantees two's complement for all implicit signed integral types char, short, int, long.

x86-mnemonics

neg  rax;  rax = -rax; rax *= -1

2^N is used as power operator in this paragraph not xor !

Increment of Complement
The two's complement is defined as a value, we need to add to the original value to get 264 which is an "overflowed" zero - since all 64-bit values are implicitly modulo 264. Thus, the two's complement is defined as ones' complement plus one:

-x == ~x + 1

That fulfills the condition that x + (-x) == 2bitsize (264) which overflows to zero:

x + (-x)     == 0
x +  ~x + 1  == 0
==>   x + ~x == -1 the universal set

Complement of Decrement
Replacing x by x - 1 in the increment of complement formula, leaves another definition - two's complement or Negation is also the ones' complement of the ones' decrement:

-x == ~(x - 1)

Thus, we can reduce subtraction by addition and ones' complement:

~(x - y) ==   ~x + y
  x - y  == ~(~x + y)

Bitwise Copy/Invert
The two's complement may also defined by a bitwise copy-loop from right (LSB) to left (MSB):

Copy bits from source to destination from right to left
- until the first binary "one" is copied.
Then invert each of the remaining higher bits.

Signed-Unsigned
This works independently whether we interpret 'x' as signed or unsigned. While 0 is is the synonym for all bits clear, -1 is the synonym for all bits set in a computer word of any arbitrary bit-size, also for 64-bit words such as bitboards.

The signed-unsigned "independence" of the two's complement is the reason that processors don't need different add or sub instructions for signed or unsigned integers. The binary pattern of the result is the same, only the interpretation differs and processors flag different overflow- or underflow conditions simultaneously.

Unsigned 64-bit values as used for bitboards have this value range:

       hexadecimal                      decimal    pow2
0x0000000000000000                            0           0
0x0000000000000001                            1           1
..
0x7fffffffffffffff    9,223,372,036,854,775,807    2^63 - 1
0x8000000000000000    9,223,372,036,854,775,808    2^63
..
0xffffffffffffffff   18,446,744,073,709,551,615    2^64 - 1

With signed interpretation, the positive numbers are subset of the unsigned with MSB clear:

       hexadecimal                      decimal    pow2
0x0000000000000000                            0           0
0x0000000000000001                            1           1
..
0x7fffffffffffffff    9,223,372,036,854,775,807    2^63 - 1

Negative numbers have MSB set to one, thus the sign bit interpretation

       hexadecimal                      decimal    pow2
0x8000000000000000   -9,223,372,036,854,775,808  -(2^63)
0x8000000000000001   -9,223,372,036,854,775,807  -(2^63) +1
..
0xfffffffffffffffe                           -2          -2
0xffffffffffffffff                           -1          -1

There is no "negative" zero. What makes the value range of negative values one greater than the positive numbers - and implies that

 -0x8000000000000000 == 0x8000000000000000

Least Significant One

At some point bitboards require serialization, thus isolation of single populated sub-sets which are power of two values if interpreted as number. Dependent on the bitboard-api those values need a further log2(powOfTwo) to convert them into the square index range from 0 to 63. Bitwise boolean operations (and, xor, or) with two's complement or ones' decrement can compute relatives of a set x in several useful ways.

Isolation

The intersection of a none empty bitboard with it's two's complement isolates the LS1B:

LS1B_of_x = x & -x;

With some arbitrary sample set:

      x          &        -x         =     LS1B_of_x
. . . . . . . .     1 1 1 1 1 1 1 1     . . . . . . . .
. . 1 . 1 . . .     1 1 . 1 . 1 1 1     . . . . . . . .
. 1 . . . 1 . .     1 . 1 1 1 . 1 1     . . . . . . . .
. . . . . . . .     1 1 1 1 1 1 1 1     . . . . . . . .
. 1 . . . 1 . .  &  1 . 1 1 1 . 1 1  =  . . . . . . . .
. . 1 . 1 . . .     . . 1 1 . 1 1 1     . . 1 . . . . .
. . . . . . . .     . . . . . . . .     . . . . . . . .
. . . . . . . .     . . . . . . . .     . . . . . . . .

Some C++ compiler warn -x still unsigned - (0-x) may used to avoid that with no overhead.

x86-mnemonics
x86-64 expansion BMI1 has LS1B bit isolation:

blsi  rax, rbx ; BMI1  rax = rbx & -rbx 

BMI1-intrinsic _blsi_u32/64

AMD's x86-64 expansion TBM further has a Isolate Lowest Set Bit and Complement instruction, which applies De Morgan's law to get the complement of the LS1B:

blsic rax, rbx ; TBM:  rax = ~rbx | (rbx - 1);

Reset

The intersection of a none empty bitboard with it's ones' decrement resets the LS1B [14]:

x_with_reset_LS1B = x & (x-1);

With some arbitrary sample set:

      x          &      (x-1)        =  x_with_reset_LS1B
. . . . . . . .     . . . . . . . .     . . . . . . . .
. . 1 . 1 . . .     . . 1 . 1 . . .     . . 1 . 1 . . .
. 1 . . . 1 . .     . 1 . . . 1 . .     . 1 . . . 1 . .
. . . . . . . .     . . . . . . . .     . . . . . . . .
. 1 . . . 1 . .  &  . 1 . . . 1 . .  =  . 1 . . . 1 . .
. . 1 . 1 . . .     1 1 . . 1 . . .     . . . . 1 . . .
. . . . . . . .     1 1 1 1 1 1 1 1     . . . . . . . .
. . . . . . . .     1 1 1 1 1 1 1 1     . . . . . . . .

... since we already know two's complement (-x) and ones' decrement (x-1) are complement sets.

x86-mnemonics
x86-64 expansion BMI1 has LS1B bit reset:

blsr  rax, rbx ; BMI1  rax = rbx & (rbx - 1)

BMI1-intrinsic _blsr_u32/64

Separation

Masks separated by LS1B by xor with two's complement or ones' decrement. Intersection of one's complement with decrement leaves the below mask excluding LS1B:

above_LS1B_mask           =  x ^  -x;
below_LSB1_mask_including =  x ^ (x-1);
below_LSB1_mask           = ~x & (x-1);

With some arbitrary sample set:

      x          ^        -x         =   above_LS1B_mask
. . . . . . . .     1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1
. . 1 . 1 . . .     1 1 . 1 . 1 1 1     1 1 1 1 1 1 1 1
. 1 . . . 1 . .     1 . 1 1 1 . 1 1     1 1 1 1 1 1 1 1
. . . . . . . .     1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1
. 1 . . . 1 . .  ^  1 . 1 1 1 . 1 1  =  1 1 1 1 1 1 1 1
. . 1 . 1 . . .     . . 1 1 . 1 1 1     . . . 1 1 1 1 1
. . . . . . . .     . . . . . . . .     . . . . . . . .
. . . . . . . .     . . . . . . . .     . . . . . . . .

      x          ^      (x-1)        =  below_LSB1_mask_including
. . . . . . . .     . . . . . . . .     . . . . . . . .
. . 1 . 1 . . .     . . 1 . 1 . . .     . . . . . . . .
. 1 . . . 1 . .     . 1 . . . 1 . .     . . . . . . . .
. . . . . . . .     . . . . . . . .     . . . . . . . .
. 1 . . . 1 . .  ^  . 1 . . . 1 . .  =  . . . . . . . .
. . 1 . 1 . . .     1 1 . . 1 . . .     1 1 1 . . . . .
. . . . . . . .     1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1
. . . . . . . .     1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1

     ~x          &      (x-1)        =  below_LSB1_mask
1 1 1 1 1 1 1 1     . . . . . . . .     . . . . . . . .
1 1 . 1 . 1 1 1     . . 1 . 1 . . .     . . . . . . . .
1 . 1 1 1 . 1 1     . 1 . . . 1 . .     . . . . . . . .
1 1 1 1 1 1 1 1     . . . . . . . .     . . . . . . . .
1 . 1 1 1 . 1 1  &  . 1 . . . 1 . .  =  . . . . . . . .
1 1 . 1 . 1 1 1     1 1 . . 1 . . .     1 1 . . . . . .
1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1

x86-mnemonics
x86-64 expansion BMI1 has BLSMSK (Mask Up to Lowest Set Bit = below_LSB1_mask_including), AMD's x86-64 expansion TBM has TZMSK (Mask From Trailing Zeros = below_LSB1_mask):

blsmsk rax, rbx ; BMI1:  rax =  rbx ^ (rbx - 1)
tzmsk  rax, rbx ; TBM:   rax = ~rbx & (rbx - 1)

BMI1-intrinsic _blsmsk_u32/64

Smearing

To smear the LS1B up and down, we use the union with two's complement or ones' decrement:

smearsLS1BUp   = x |  -x;
smearsLS1BDown = x | (x-1);

With some arbitrary sample set:

      x          |        -x         =  smearsLS1BUp
. . . . . . . .     1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1
. . 1 . 1 . . .     1 1 . 1 . 1 1 1     1 1 1 1 1 1 1 1
. 1 . . . 1 . .     1 . 1 1 1 . 1 1     1 1 1 1 1 1 1 1
. . . . . . . .     1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1
. 1 . . . 1 . .  |  1 . 1 1 1 . 1 1  =  1 1 1 1 1 1 1 1
. . 1 . 1 . . .     . . 1 1 . 1 1 1     . . 1 1 1 1 1 1
. . . . . . . .     . . . . . . . .     . . . . . . . .
. . . . . . . .     . . . . . . . .     . . . . . . . .

      x          |      (x-1)        =  smearsLS1BDown
. . . . . . . .     . . . . . . . .     . . . . . . . .
. . 1 . 1 . . .     . . 1 . 1 . . .     . . 1 . 1 . . .
. 1 . . . 1 . .     . 1 . . . 1 . .     . 1 . . . 1 . .
. . . . . . . .     . . . . . . . .     . . . . . . . .
. 1 . . . 1 . .  |  . 1 . . . 1 . .  =  . 1 . . . 1 . .
. . 1 . 1 . . .     1 1 . . 1 . . .     1 1 1 . 1 . . .
. . . . . . . .     1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1
. . . . . . . .     1 1 1 1 1 1 1 1     1 1 1 1 1 1 1 1

x86-mnemonics
AMD's x86-64 expansion TBM has a Fill From Lowest Set Bit instruction:

blsfill  rax, rbx ; TBM:  rax = rbx | (rbx - 1)

Least Significant Zero

Dealing with the least significant zero bit (LS0B) or clear bit can be derived from the complement of the LS1B. AMD's x86-64 expansion TBM has six instructions based on boolean operations with the one's increment:

Most Significant One

The MS1B is not that simple to isolate as long we have no reverse arithmetic with carries propagating from left to right. To isolate MS1B, one needs to set all lower bits below MS1B, shift the resulting mask right by one and finally add one. Setting all lower bits in the general case requires 63 times x |= x >> 1 which might be done in parallel prefix manner in log2(64) = 6 steps:

x |= x >> 32;
x |= x >> 16;
x |= x >>  8;
x |= x >>  4;
x |= x >>  2;
x |= x >>  1;
MS1B = (x >> 1) + 1;

Still quite expensive - better to traverse sets the other way around or rely on intrinsic functions to use special processor instructions like BitScanReverse or LeadingZeroCount, which implicitly performs not only the isolation but also the log2.

Common MS1B
Two sets have a common MS1B, if the intersection is greater than the xor sum:

if ((a & b) > (a ^ b)) -> a and b have common MS1B

This is because a common MS1B is set in the intersection but cleared in the xor sum. Otherwise, with no common MS1B, the xor-sum is greater except equal for two zero operands.

Multiplication

64-bit Multiplication has become awfully fast on recent processors. Shift left is of course still faster than multiplication by power of two, but if we have more than one bit set in a factor, it already makes sense to replace for instance

y  = (x << 8) + (x << 16);

by

y  = x * 0x00010100;

Fill-Multiplication
In fact, we can replace parallel prefix left shifts like,

x |= x << 32;
x |= x << 16;
x |= x <<  8;

where x has max one bit per file, and we can therefor safely replace 'or' by 'add'

x += x << 32;
x += x << 16;
x += x <<  8;

by multiplication with 0x0101010101010101 (which is the A-File in little endian mapping):

. . . . . . . .     1 . . . . . . .     . 1 1 . 1 1 . .
. . . . . . . .     1 . . . . . . .     . 1 1 . 1 1 . .
. . . . . . . .     1 . . . . . . .     . 1 1 . 1 1 . .
. . . . . . . .     1 . . . . . . .     . 1 1 . 1 1 . .
. 1 . . . 1 . .  *  1 . . . . . . .  =  . 1 1 . 1 1 . .
. . 1 . 1 . . .     1 . . . . . . .     . . 1 . 1 . . .
. . . . . . . .     1 . . . . . . .     . . . . . . . .
. . . . . . . .     1 . . . . . . .     . . . . . . . .

See Kindergarten-Bitboards- or Magic-Bitboards as applications of fill-multiplication.

De Bruijn Multiplication
Another bitboard related application of multiplication is to determine the bit-index of the least significant one bit. A isolated, single bit is multiplied with a De Bruijn Sequence to implement a bitscan.

Division

64-bit Division is still a slow instruction which takes a lot of cycles - it should be avoided at runtime. Division by a power of two is done by right shift.

An interesting application to calculate various masks for delta swaps, e.g. swapping bits, bit-duos, nibbles, bytes, words and double words, is the 2-adic division of the universal set (-1) by 2^(2^i) plus one, which may be done at compile time:

-1 / ( 2^(2^0) + 1) == -1 / (         2 + 1) == 0x5555555555555555
-1 / ( 2^(2^1) + 1) == -1 / (         4 + 1) == 0x3333333333333333
-1 / ( 2^(2^2) + 1) == -1 / (        16 + 1) == 0x0f0f0f0f0f0f0f0f
-1 / ( 2^(2^3) + 1) == -1 / (       256 + 1) == 0x00ff00ff00ff00ff
-1 / ( 2^(2^4) + 1) == -1 / (     65536 + 1) == 0x0000ffff0000ffff
-1 / ( 2^(2^5) + 1) == -1 / (4294967296 + 1) == 0x00000000ffffffff

See generalized flipping, mirroring and reversion. Often used masks and factors are the 2-adic division of the universal set (-1) by 2^(2^i) minus one, which results in the lowest bit of SWAR-wise bits set, bit-duos, nibbles, bytes, words and double words:

-1 / ( 2^(2^0) - 1) == -1 / (         2 - 1) == 0xffffffffffffffff
-1 / ( 2^(2^1) - 1) == -1 / (         4 - 1) == 0x5555555555555555
-1 / ( 2^(2^2) - 1) == -1 / (        16 - 1) == 0x1111111111111111
-1 / ( 2^(2^3) - 1) == -1 / (       256 - 1) == 0x0101010101010101
-1 / ( 2^(2^4) - 1) == -1 / (     65536 - 1) == 0x0001000100010001
-1 / ( 2^(2^5) - 1) == -1 / (4294967296 - 1) == 0x0000000100000001

Modulo

Modular arithmetic with 64-bit modulo by a constant, has applications in Cryptography [15], Hashing, and with Bitboards in Bit Scanning, Population Count and Congruent Modulo Bitboards for Sliding Piece Attacks.

Casting out 255

Similar to Casting out nines with decimals and due to the congruence relation

Basen ≡ 1 (mod Base-1)

casting out 255 can be used to add all the eight bytes within a SWAR-wise 64-bit quad word if the sum is less than 255, as mentioned, applicable in Population Count and Congruent Modulo Bitboards - Casting out 255.

Reciprocal Multiplication

Likely 64-bit compiler will optimize modulo (and division) by reciprocal, 2^64 div constant, to perform a 64*64 = 128bit fixed point multiplication to get the quotient in the upper 64-bit, and a second multiplication and subtraction to finally get the remainder. Here some sample x86-64 assembly:

r11d := r10 % 257
 mov    r11d, r10 ; masked diagonal
 mov    rax, ff00ff00ff00ff01H ; 2^(64+8) / 257
 mul    r10
 shr    rdx, 8
 imul   edx, 257 ; 00000101H
 sub    r11d, edx

Power of Two

As a remainder, and to close the cycle to bitwise boolean operations, the well known trick is mentioned, to replace modulo by power of two by intersection with power of two minus one:

a % 2n == a & (2n - 1)

Selected Publications

1847 ...

1900 ...

1950 ...

2000 ...

Forum Posts

2000 ...

2010 ...

2020 ...

External Links

Sets

Naive set theory from Wikipedia
Zermelo–Fraenkel set theory from Wikipedia » Ernst Zermelo, Abraham Fraenkel

Algebra

Logic

Operations

Setwise

Intersection (set theory) from Wikipedia
Union (set theory) from Wikipedia
Complement (set theory) from Wikipedia

Bitwise

Logical conjunction from Wikipedia
Logical disjunction from Wikipedia
Exclusive or from Wikipedia
Negation from Wikipedia
Bit Shifts from Wikipedia
Circular shift from Wikipedia

Arithmetic

Addition from Wikipedia
Subtraction from Wikipedia
Two's complement from Wikipedia
Multiplication from Wikipedia
Division from Wikipedia
Modulo operation from Wikipedia

Modular arithmetic

Misc

References

  1. Wassily Kandinsky - Upward, 1929, Peggy Guggenheim Collection, Wikimedia COmmons
  2. Andrey Ershov, Mikhail R. Shura-Bura (1980). The Early Development of Programming in the USSR. in Nicholas C. Metropolis (ed.) A History of Computing in the Twentieth Century. Academic Press, preprint pp. 43
  3. Lazar A. Lyusternik, Aleksandr A. Abramov, Victor I. Shestakov, Mikhail R. Shura-Bura (1952). Programming for High-Speed Electronic Computers. (Программирование для электронных счетных машин)
  4. John Venn (1880). On the Diagrammatic and Mechanical Representation of Propositions and Reasonings. Philosophical Magazine, Vol. 9, No. 59
  5. Greater or less in the arithmetical sense is usually not relevant with bitboards, but see greater condition in Thor's Hammer's move generation
  6. George Boole (1847). The Mathematical Analysis of Logic, Being an Essay towards a Calculus of Deductive Reasoning. Macmillan, Barclay & Macmillan
  7. Charles S. Peirce (1880). On the Algebra of Logic. American Journal of Mathematics, Vol. 3
  8. Augustus De Morgan (1860). Syllabus of a Proposed System of Logic. Walton & Malbery
  9. Marvin Minsky, Seymour Papert (1969, 1972). Perceptrons: An Introduction to Computational Geometry. The MIT Press, ISBN 0-262-63022-2
  10. Re: Java chess program? by Moritz Berger, rgcc, May 29, 1997 » Shifting Bitboards, Java
  11. To shift or not to shift by thevinenator, OpenChess Forum, September 09, 2015
  12. On the speed of SquareBB array by protonspring, FishCooking, March 22, 2019
  13. Donald Knuth (2009). The Art of Computer Programming, Volume 4, Fascicle 1: Bitwise tricks & techniques, as Pre-Fascicle 1a postscript
  14. Peter Wegner (1960). A technique for counting ones in a binary computer. Communications of the ACM, Volume 3, 1960
  15. Modular exponentiation from Wikipedia

Up one Level