Difference between revisions of "Matt Taylor"
GerdIsenberg (talk | contribs) (Created page with "'''Home * People * Matt Taylor''' '''Matt Taylor''' was involved in forum discussions on low level optimization and x86 assembly language i...") |
GerdIsenberg (talk | contribs) |
||
(One intermediate revision by the same user not shown) | |||
Line 5: | Line 5: | ||
=BitScan= | =BitScan= | ||
− | Based on ideas of [[Walter Faxon]] and [[BitScan#DeBruijnMultiplation|De Bruijn Multiplication]], Matt came up with a [[BitScan#MattTaylorsFoldingtrick|genius folding trick]] as a quintessence <ref>[ | + | Based on ideas of [[Walter Faxon]] and [[BitScan#DeBruijnMultiplation|De Bruijn Multiplication]], Matt came up with a [[BitScan#MattTaylorsFoldingtrick|genius folding trick]] as a quintessence <ref>[https://groups.google.com/d/msg/comp.lang.asm.x86/3pVGzQGb1ys/fPpKBKNi848J Bit magic] by [[Matt Taylor]], [https://groups.google.com/forum/#!forum/comp.lang.asm.x86 comp.lang.asm.x86], June 26, 2003</ref> <ref>[https://groups.google.com/d/msg/comp.lang.asm.x86/3pVGzQGb1ys/230qffQJYvQJ Re: Bit magic] by [[Matt Taylor]], [https://groups.google.com/forum/#!forum/comp.lang.asm.x86 comp.lang.asm.x86], June 29, 2003</ref>: |
+ | For 64-bits, I do things a little differently simply because a 64-bit multiply is slow. I start out with x ^= (x - 1) just like normal which generates a key equal to 2^n - 1 (where n is the index of the LSB, 1 is index 0). Now fold the 64-bit key into 32-bits by xoring the high 32-bits with the low 32-bits. | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! ls1b | ||
+ | ! bb ^ (bb-1) | ||
+ | ! folded | ||
+ | |- | ||
+ | | style="text-align:center;" | 63 | ||
+ | | <code>0xffffffffffffffff</code> | ||
+ | | <code>0x00000000</code> | ||
+ | |- | ||
+ | | style="text-align:center;" | 62 | ||
+ | | <code>0x7fffffffffffffff</code> | ||
+ | | <code>0x80000000</code> | ||
+ | |- | ||
+ | | style="text-align:center;" | 59 | ||
+ | | <code>0x0fffffffffffffff</code> | ||
+ | | <code>0xf0000000</code> | ||
+ | |- | ||
+ | | style="text-align:center;" | 32 | ||
+ | | <code>0x00000001ffffffff</code> | ||
+ | | <code>0xfffffffe</code> | ||
+ | |- | ||
+ | | style="text-align:center;" | 31 | ||
+ | | <code>0x00000000ffffffff</code> | ||
+ | | <code>0xffffffff</code> | ||
+ | |- | ||
+ | | style="text-align:center;" | 30 | ||
+ | | <code>0x000000007fffffff</code> | ||
+ | | <code>0x7fffffff</code> | ||
+ | |- | ||
+ | | style="text-align:center;" | 0 | ||
+ | | <code>0x0000000000000001</code> | ||
+ | | <code>0x00000001</code> | ||
+ | |} | ||
+ | |||
+ | Even if this folded "LS1B" contains multiple consecutive one-bits, the multiplication is De Bruijn like. There are only two magic 32-bit constants with the combined property of 32- and 64-bit De Bruijn Sequences to apply this [[Hash Table#MinimalPerfectHashing|minimal perfect hashing]]: | ||
+ | |||
<pre> | <pre> | ||
const int lsb_64_table[64] = | const int lsb_64_table[64] = | ||
Line 39: | Line 78: | ||
* [https://www.stmintz.com/ccc/index.php?id=278848 Cleverness, Please!] by [[Matt Taylor]], [[CCC]], January 22, 2003 | * [https://www.stmintz.com/ccc/index.php?id=278848 Cleverness, Please!] by [[Matt Taylor]], [[CCC]], January 22, 2003 | ||
* [https://www.stmintz.com/ccc/index.php?id=283655 Bitscan] by [[Matt Taylor]], [[CCC]], February 11, 2003 | * [https://www.stmintz.com/ccc/index.php?id=283655 Bitscan] by [[Matt Taylor]], [[CCC]], February 11, 2003 | ||
− | * [ | + | * [https://groups.google.com/d/msg/comp.lang.asm.x86/3pVGzQGb1ys/fPpKBKNi848J Bit magic] by [[Matt Taylor]], [https://groups.google.com/forum/#!forum/comp.lang.asm.x86 comp.lang.asm.x86], June 26, 2003 |
− | * [ | + | : [https://groups.google.com/d/msg/comp.lang.asm.x86/3pVGzQGb1ys/230qffQJYvQJ Re: Bit magic] by [[Matt Taylor]], [https://groups.google.com/forum/#!forum/comp.lang.asm.x86 comp.lang.asm.x86], June 29, 2003 |
+ | * [https://groups.google.com/d/msg/comp.lang.asm.x86/Ge1Uzd0WckM/zR6WLJRixiYJ Re: Static memory allocation] by [[Matt Taylor]], [https://groups.google.com/forum/#!forum/comp.lang.asm.x86 comp.lang.asm.x86], July 03, 2004 | ||
=References= | =References= | ||
<references /> | <references /> | ||
'''[[People|Up one level]]''' | '''[[People|Up one level]]''' | ||
+ | [[Category:Programmer|Taylor]] |
Latest revision as of 16:36, 27 February 2019
Matt Taylor was involved in forum discussions on low level optimization and x86 assembly language issues. Beside other things Matt was busy to optimize a minimal perfect hashing scheme for bitscan purposes.
BitScan
Based on ideas of Walter Faxon and De Bruijn Multiplication, Matt came up with a genius folding trick as a quintessence [1] [2]:
For 64-bits, I do things a little differently simply because a 64-bit multiply is slow. I start out with x ^= (x - 1) just like normal which generates a key equal to 2^n - 1 (where n is the index of the LSB, 1 is index 0). Now fold the 64-bit key into 32-bits by xoring the high 32-bits with the low 32-bits.
ls1b | bb ^ (bb-1) | folded |
---|---|---|
63 | 0xffffffffffffffff
|
0x00000000
|
62 | 0x7fffffffffffffff
|
0x80000000
|
59 | 0x0fffffffffffffff
|
0xf0000000
|
32 | 0x00000001ffffffff
|
0xfffffffe
|
31 | 0x00000000ffffffff
|
0xffffffff
|
30 | 0x000000007fffffff
|
0x7fffffff
|
0 | 0x0000000000000001
|
0x00000001
|
Even if this folded "LS1B" contains multiple consecutive one-bits, the multiplication is De Bruijn like. There are only two magic 32-bit constants with the combined property of 32- and 64-bit De Bruijn Sequences to apply this minimal perfect hashing:
const int lsb_64_table[64] = { 63, 30, 3, 32, 59, 14, 11, 33, 60, 24, 50, 9, 55, 19, 21, 34, 61, 29, 2, 53, 51, 23, 41, 18, 56, 28, 1, 43, 46, 27, 0, 35, 62, 31, 58, 4, 5, 49, 54, 6, 15, 52, 12, 40, 7, 42, 45, 16, 25, 57, 48, 13, 10, 39, 8, 44, 20, 47, 38, 22, 17, 37, 36, 26 }; /** * bitScanForward * @author Matt Taylor (2003) * @param bb bitboard to scan * @precondition bb != 0 * @return index (0..63) of least significant one bit */ int bitScanForward(U64 bb) { unsigned int folded; assert (bb != 0); bb ^= bb - 1; folded = (int) bb ^ (bb >> 32); return lsb_64_table[folded * 0x78291ACF >> 26]; }
Forum Posts
- Bitscan Conclusions by Matt Taylor, CCC, January 05, 2003
- Cleverness, Please! by Matt Taylor, CCC, January 22, 2003
- Bitscan by Matt Taylor, CCC, February 11, 2003
- Bit magic by Matt Taylor, comp.lang.asm.x86, June 26, 2003
- Re: Bit magic by Matt Taylor, comp.lang.asm.x86, June 29, 2003
- Re: Static memory allocation by Matt Taylor, comp.lang.asm.x86, July 03, 2004
References
- ↑ Bit magic by Matt Taylor, comp.lang.asm.x86, June 26, 2003
- ↑ Re: Bit magic by Matt Taylor, comp.lang.asm.x86, June 29, 2003