Difference between revisions of "Matt Taylor"
GerdIsenberg (talk | contribs) |
GerdIsenberg (talk | contribs) |
||
Line 85: | Line 85: | ||
<references /> | <references /> | ||
'''[[People|Up one level]]''' | '''[[People|Up one level]]''' | ||
+ | [[Category:Programmer|Taylor]] |
Latest revision as of 16:36, 27 February 2019
Matt Taylor was involved in forum discussions on low level optimization and x86 assembly language issues. Beside other things Matt was busy to optimize a minimal perfect hashing scheme for bitscan purposes.
BitScan
Based on ideas of Walter Faxon and De Bruijn Multiplication, Matt came up with a genius folding trick as a quintessence [1] [2]:
For 64-bits, I do things a little differently simply because a 64-bit multiply is slow. I start out with x ^= (x - 1) just like normal which generates a key equal to 2^n - 1 (where n is the index of the LSB, 1 is index 0). Now fold the 64-bit key into 32-bits by xoring the high 32-bits with the low 32-bits.
ls1b | bb ^ (bb-1) | folded |
---|---|---|
63 | 0xffffffffffffffff
|
0x00000000
|
62 | 0x7fffffffffffffff
|
0x80000000
|
59 | 0x0fffffffffffffff
|
0xf0000000
|
32 | 0x00000001ffffffff
|
0xfffffffe
|
31 | 0x00000000ffffffff
|
0xffffffff
|
30 | 0x000000007fffffff
|
0x7fffffff
|
0 | 0x0000000000000001
|
0x00000001
|
Even if this folded "LS1B" contains multiple consecutive one-bits, the multiplication is De Bruijn like. There are only two magic 32-bit constants with the combined property of 32- and 64-bit De Bruijn Sequences to apply this minimal perfect hashing:
const int lsb_64_table[64] = { 63, 30, 3, 32, 59, 14, 11, 33, 60, 24, 50, 9, 55, 19, 21, 34, 61, 29, 2, 53, 51, 23, 41, 18, 56, 28, 1, 43, 46, 27, 0, 35, 62, 31, 58, 4, 5, 49, 54, 6, 15, 52, 12, 40, 7, 42, 45, 16, 25, 57, 48, 13, 10, 39, 8, 44, 20, 47, 38, 22, 17, 37, 36, 26 }; /** * bitScanForward * @author Matt Taylor (2003) * @param bb bitboard to scan * @precondition bb != 0 * @return index (0..63) of least significant one bit */ int bitScanForward(U64 bb) { unsigned int folded; assert (bb != 0); bb ^= bb - 1; folded = (int) bb ^ (bb >> 32); return lsb_64_table[folded * 0x78291ACF >> 26]; }
Forum Posts
- Bitscan Conclusions by Matt Taylor, CCC, January 05, 2003
- Cleverness, Please! by Matt Taylor, CCC, January 22, 2003
- Bitscan by Matt Taylor, CCC, February 11, 2003
- Bit magic by Matt Taylor, comp.lang.asm.x86, June 26, 2003
- Re: Bit magic by Matt Taylor, comp.lang.asm.x86, June 29, 2003
- Re: Static memory allocation by Matt Taylor, comp.lang.asm.x86, July 03, 2004
References
- ↑ Bit magic by Matt Taylor, comp.lang.asm.x86, June 26, 2003
- ↑ Re: Bit magic by Matt Taylor, comp.lang.asm.x86, June 29, 2003