Changes

Jump to: navigation, search

BitScan

No change in size, 10:05, 23 May 2018
no edit summary
<span id="DeBruijnMultiplation"></span>
==De Bruijn Multiplication==
The '''De Bruijn''' bitscan was devised in 1997, according to [[Donald Knuth]] <ref>[[Donald Knuth]] ('''2009'''). ''[http://www-cs-faculty.stanford.edu/%7Eknuth/taocp.html The Art of Computer Programming], Volume 4, Fascicle 1: Bitwise tricks & techniques'', as [http://www-cs-faculty.stanford.edu/%7Eknuth/fasc1a.ps.gz Pre-Fascicle 1a postscript], p 10</ref> by [[Mathematician#MartinLaeuter|Martin Läuter]], and independently by [[Charles Leiserson]], [[Harald Prokop]] and [[Keith H. Randall]] a few month later <ref>[[Charles Leiserson|Charles E. Leiserson]], [[Harald Prokop]] and [[Keith H. Randall]] ('''1998'''). ''Using de Bruijn Sequences to Index a 1 in a Computer Word'', [http://supertech.csail.mit.edu/papers/debruijn.pdf pdf]</ref> <ref>[https://www.stmintz.com/ccc/index.php?id=212586 "Using de Bruijn Sequences to Index a 1 in a Computer Word"] discussion in [[CCC]], February 08, 2002</ref> , to determine the [[General Setwise Operations#TheLeastSignificantOneBitLS1B|LS1B]] index by [[Hash Table#MinimalPerfectHashing|minimal perfect hashing]]. [[De Bruijn sequenceSequence|De Bruijn sequences]] were named after the Dutch mathematician [[Nicolaas de Bruijn]]. Interestingly sequences with the binary alphabet were already investigated by the French mathematician '''Camille Flye Sainte-Marie''' in 1894, but later "forgotten" and re-investigated and generalized by De Bruijn and [[Mathematician#Ehrenfest|Tanja van Ardenne-Ehrenfest]] half a century later <ref>[[Nicolaas de Bruijn|N. G. de Bruijn]] ('''1975'''). ''Acknowledgement of priority to C. Flye Sainte-Marie on the counting of circular arrangements of 2n zeros and ones that show each n-letter word exactly once''. Technical Report, Technische Hogeschool Eindhoven, available as [http://alexandria.tue.nl/repository/books/252901.pdf pdf reprint]</ref> .
A 64-bit De Bruijn sequence Sequence contains 64-overlapped unique 6-bit sequences, thus a circle of 64 bits, where five leading zeros overlap five hidden "trailing" zeros. There are 2<span style="vertical-align: super;
font-size: 80%;">26</span> = 67108864 odd sequences with 6 leading binary zeros and 2<span style="vertical-align: super;
font-size: 80%;">26</span> even sequences with 5 leading binary zeros, which may be calculated from the odd ones by shifting left one.
===With isolated LS1B===
A multiplication with a power of two value (the [[General Setwise Operations#LS1BIsolation|isolated LS1B]]) acts like a left shift by it's exponent. Thus, if we multiply a 64-bit De Bruijn sequence Sequence with the isolated LS1B, we get a unique six bit subsequence inside the most significant bits. To obtain the bit-index we need to extract these upper six bits by shifting right the product, to lookup an [[Array|array]].
<pre>
===With separated LS1B===
Instead of the classical [[General Setwise Operations#LS1BIsolation|LS1B isolation]], [[Kim Walisch]] proposed the faster [[General Setwise Operations#ExclusiveOr|xor]] with the ones' decrement. The separation [[General Setwise Operations#LS1BSeparation|bb ^ (bb-1)]] contains all bits set including and below the [[General Setwise Operations#TheLeastSignificantOneBitLS1B|LS1B]]. The 2<span style="vertical-align: super;
font-size: 80%;">22</span> (4,194,304) upper De Bruijn sequences Sequences of the 2<span style="vertical-align: super;
font-size: 80%;">26</span> available leave unique 6-bit indices. Using LS1B separation takes advantage of the x86 lea instruction, which saves the move instruction and unlike negate, has no data dependency on the flag register. Kim reported a 10 to 15 percent faster execution (compilers: g++-4.7 -O2, clang++-3.1 -O2, x86_64) than the traditional 64-bit De Bruijn bitscan on [[Intel]] [https://en.wikipedia.org/wiki/Nehalem_%28microarchitecture%29 Nehalem] and [https://en.wikipedia.org/wiki/Sandy_Bridge_%28microarchitecture%29 Sandy Bridge] CPUs.
|}
Even if this folded "LS1B" contains multiple consecutive one-bits, the multiplication is De Bruijn like. There are only two magic 32-bit constants with the combined property of 32- and 64-bit De Bruijn sequences Sequences to apply this [[Hash Table#MinimalPerfectHashing|minimal perfect hashing]]:
<pre>
<span id="FillDeBruijn"></span>
==De Bruijn Multiplication==
While the [[BitScan#FrankZappa|tribute]] to [[Videos#FrankZappa|Frank Zappa]] is quite 32-bit friendly <ref>[https://www.stmintz.com/ccc/index.php?id=472762 final version - homage to FZ] by [[Gerd Isenberg]], [[CCC]], December 23, 2005</ref>, [[Kim Walisch]] suggested to use the [[Parallel Prefix Algorithms|parallel prefix fill]] for a [[General Setwise Operations#TheMostSignificantOneBitMS1B|MS1B]] separation with the same [[De Bruijn sequenceSequence|De Bruijn]] multiplication and lookup as in his [[BitScan#KimWalisch|bitScanForward]] routine with [[General Setwise Operations#LS1BSeparation|separated LS1B]], with less instructions in 64-bit mode. A log base 2 method was already devised by Eric Cole on January 8, 2006, and shaved off rounded up to one less than the next power of 2 by Mark Dickinson <ref>[https://ep2012.europython.eu/conference/p/mark-dickinson EuroPython 2012: Florence, July 2–8 | Mark Dickinson]</ref> on December 10, 2009, as published in Sean Eron Anderson's ''Bit Twiddling Hacks'' for 32-bit integers <ref>[http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogDeBruijn Find the log base 2 of an N-bit integer in O(lg(N)) operations with multiply and lookup] from [http://graphics.stanford.edu/%7Eseander/bithacks.html Bit Twiddling Hacks] by [http://graphics.stanford.edu/%7Eseander/ Sean Eron Anderson]</ref>.
<pre>
const int index64[64] = {

Navigation menu