
BMI1

5,259 bytes added, 12:03, 24 October 2018
'''[[Main Page|Home]] * [[Hardware]] * [[x86-64]] * BMI1'''

'''BMI1''' (BMI),<br/>
an x86-64 expansion of [[Bit-Twiddling#BitManipulation|bit-manipulation]] instructions by [[Intel]], introduced in conjunction with the [[AVX|Advanced Vector Extensions]] [[SIMD and SWAR Techniques|SIMD]] instruction set. With the [https://en.wikipedia.org/wiki/Bulldozer_%28microarchitecture%29 Bulldozer microarchitecture], BMI1 and [[AVX]] are also available on [[AMD]] processors, where BMI1 was initially named BMI, along with AMD's [[TBM|Trailing Bit Manipulation Instructions]] (TBM) <ref>[https://www.amd.com/system/files/TechDocs/24594.pdf AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions] (pdf)</ref>. Most BMI1 instructions (all except LZCNT and TZCNT) use the [https://en.wikipedia.org/wiki/VEX_prefix VEX prefix] encoding to support up to three-operand syntax with non-destructive source operands on 32- or 64-bit general-purpose registers. BMI1 (ANDN, BEXTR, BLSI, BLSMSK, BLSR, TZCNT) requires bit 3 set in EBX of [https://en.wikipedia.org/wiki/CPUID CPUID] with EAX=07H, ECX=0H. LZCNT, not strictly a member of BMI1, requires bit 5 set in ECX of CPUID with EAX=80000001H. With the advent of [[AVX2]], further [[Bit-Twiddling|bit-twiddling]] on general-purpose registers is provided by [[BMI2]].
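
The two feature bits named above can be tested as in the following minimal sketch. It assumes the caller has already executed CPUID and passes in the raw register values; how those are obtained is compiler-specific and not shown. The function names <code>has_bmi1</code> and <code>has_lzcnt</code> are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* ebx_leaf7: EBX as returned by CPUID with EAX=07H, ECX=0H (assumed input). */
static inline bool has_bmi1(uint32_t ebx_leaf7)
{
    return ((ebx_leaf7 >> 3) & 1u) != 0;   /* bit 3 signals BMI1 */
}

/* ecx_ext: ECX as returned by CPUID with EAX=80000001H (assumed input). */
static inline bool has_lzcnt(uint32_t ecx_ext)
{
    return ((ecx_ext >> 5) & 1u) != 0;     /* bit 5 signals LZCNT */
}
```

Keeping the bit tests separate from the CPUID call itself makes them usable with whatever CPUID wrapper the compiler at hand provides.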

=Instructions=
BMI1 instructions may speed up various [[Bitboards|bitboard]] [[General Setwise Operations|operations]], such as [[General Setwise Operations#RelativeComplement|relative complement]], and [[General Setwise Operations#LS1BIsolation|isolation]], [[General Setwise Operations#LS1BReset|reset]] and [[General Setwise Operations#LS1BSeparation|separation]] of the [[General Setwise Operations#TheLeastSignificantOneBitLS1B|least significant one bit]]. Each of them combines two instructions into one and reduces register pressure. [[BitScan#LeadingZeroCount|Leading]] and [[BitScan#TrailingZeroCount|trailing zero count]] are useful for [[BitScan|scanning bits]] of possibly empty sets.
<span id="ANDN"></span>
==ANDN==
Logical And Not, the [[General Setwise Operations#RelativeComplement|relative complement]]. No intrinsic is listed here, since compilers are expected to emit the instruction for the plain expression.
<pre>
dest ::= ~src1 & src2;
</pre>
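A portable sketch of the expression above, written so that a compiler targeting BMI1 may map it to a single ANDN. The name <code>andn64</code> is hypothetical, and whether ANDN is actually emitted depends on the compiler and its target flags.

```c
#include <stdint.h>

/* Relative complement: bits set in src2 but not in src1. */
static inline uint64_t andn64(uint64_t src1, uint64_t src2)
{
    return ~src1 & src2;
}
```

In a bitboard program this would, for example, remove own pieces from an attack set with <code>andn64(own, attacks)</code>, where both variable names are illustrative.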
<span id="BEXTR"></span>
==BEXTR==
Bit Field Extract. Handy for extracting some consecutive bits from a ([[Rotated Bitboards|rotated]]) [[Occupancy|occupancy]] bitboard, or, as the name suggests, from [https://en.wikipedia.org/wiki/Bit_field bit-field] structures.
<pre>
dest ::= (src >> start) & ((1 << len)-1);

unsigned __int32 _bextr_u32(unsigned __int32 src, unsigned __int32 start, unsigned __int32 len);
unsigned __int64 _bextr_u64(unsigned __int64 src, unsigned __int32 start, unsigned __int32 len);
</pre>
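The pseudocode above cannot be copied into C verbatim: <code>1 << len</code> is evaluated in <code>int</code> width, and shifting a 64-bit value by 64 or more is undefined. A minimal portable sketch that guards those cases is given below; the name <code>bextr64</code> is hypothetical, and the treatment of out-of-range <code>start</code> and <code>len</code> is an assumption chosen to give well-defined results, not a statement about the intrinsic.

```c
#include <stdint.h>

/* Extract len bits of src, starting at bit position start. */
static inline uint64_t bextr64(uint64_t src, unsigned start, unsigned len)
{
    if (start >= 64 || len == 0)
        return 0;                       /* nothing to extract */
    uint64_t shifted = src >> start;
    if (len >= 64)
        return shifted;                 /* mask would cover all bits */
    return shifted & ((UINT64_C(1) << len) - 1);
}
```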
A shiftless [[Score#SignExtension|sign extension]] can then be applied to the extracted field <ref>[http://aggregate.org/MAGIC/#Sign%20Extension Sign Extension] from [http://aggregate.org/MAGIC The Aggregate Magic Algorithms] by [[Hank Dietz]]</ref>, where signbit is the mask of the field's most significant bit:
<pre>
dest_signextended ::= (dest ^ signbit) - signbit
</pre>
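A small sketch of this xor-and-subtract step in C. It assumes a field width between 1 and 63 bits and a field value with no bits set above that width, as a bit-field extract would deliver. The name <code>sign_extend</code> is hypothetical. The sign-bit mask is computed with a shift here only for generality; with a fixed field width it is a compile-time constant, which is what makes the extension itself shiftless.

```c
#include <stdint.h>

/* Sign-extend a len-bit field (1 <= len <= 63, upper bits of field zero). */
static inline int64_t sign_extend(uint64_t field, unsigned len)
{
    uint64_t signbit = UINT64_C(1) << (len - 1);
    /* Unsigned wrap-around is well defined; the final conversion
       assumes the usual two's complement representation. */
    return (int64_t)((field ^ signbit) - signbit);
}
```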
<span id="BLSI"></span>
==BLSI==
Extract Lowest Set Isolated Bit, [[General Setwise Operations#LS1BIsolation|isolates]] the [[General Setwise Operations#TheLeastSignificantOneBitLS1B|least significant one bit]].
<pre>
dest ::= src & -src;

unsigned __int64 _blsi_u64(unsigned __int64 src);
</pre>
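For builds without BMI1, the same result follows from the expression above in plain C. A sketch with the hypothetical name <code>blsi64</code>; the negation is written as <code>0 - src</code> because some compilers warn about unary minus on unsigned operands.

```c
#include <stdint.h>

/* Isolate the least significant one bit; an empty set stays empty. */
static inline uint64_t blsi64(uint64_t src)
{
    return src & (0 - src);
}
```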
<span id="BLSMSK"></span>
==BLSMSK==
Get Mask Up to Lowest Set Bit, [[General Setwise Operations#LS1BSeparation|sets all bits below]] the [[General Setwise Operations#TheLeastSignificantOneBitLS1B|least significant one bit]], keeps that bit itself set, and clears all upper bits.
<pre>
dest ::= (src-1) ^ src;

unsigned __int64 _blsmsk_u64(unsigned __int64 src);
</pre>
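A plain C sketch of the same expression, named <code>blsmsk64</code> here for illustration only. Note that for an empty set the subtraction wraps around, so the result has all 64 bits set.

```c
#include <stdint.h>

/* Mask from bit 0 up to and including the least significant one bit. */
static inline uint64_t blsmsk64(uint64_t src)
{
    return (src - 1) ^ src;
}
```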
<span id="BLSR"></span>
==BLSR==
Reset Lowest Set Bit, [[General Setwise Operations#LS1BReset|resets]] the [[General Setwise Operations#TheLeastSignificantOneBitLS1B|least significant one bit]].
<pre>
dest ::= (src-1) & src;

unsigned __int64 _blsr_u64(unsigned __int64 src);
</pre>
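Resetting the lowest set bit is the usual loop step when traversing a bitboard one bit at a time. The sketch below uses it to count the members of a set; <code>blsr64</code> and <code>count_bits</code> are hypothetical names, and a real move generator would process each bit instead of merely counting.

```c
#include <stdint.h>

/* Clear the least significant one bit; an empty set stays empty. */
static inline uint64_t blsr64(uint64_t src)
{
    return (src - 1) & src;
}

/* Count set bits by clearing one bit per iteration. */
static inline int count_bits(uint64_t set)
{
    int n = 0;
    while (set != 0) {
        set = blsr64(set);
        ++n;
    }
    return n;
}
```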
<span id="LZCNT"></span>
==LZCNT==
Count the Number of Leading Zero Bits, initially from [[AMD|AMD's]] [[SSE4#SSE4a|SSE4a]] aka [[SSE4#ABM|Advanced Bit Manipulations]] (ABM).
<pre>
unsigned __int64 _lzcnt_u64(unsigned __int64 src);
</pre>
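For reference, a deliberately simple loop that yields the same values as LZCNT on 64-bit operands, including 64 for an empty set. It is a sketch of the semantics rather than a fast fallback; the name <code>lzcnt64</code> is hypothetical.

```c
#include <stdint.h>

/* Number of zero bits above the most significant one bit; 64 if src is 0. */
static inline unsigned lzcnt64(uint64_t src)
{
    unsigned n = 0;
    if (src == 0)
        return 64;
    while ((src & (UINT64_C(1) << 63)) == 0) {
        src <<= 1;
        ++n;
    }
    return n;
}
```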
<span id="TZCNT"></span>
==TZCNT==
Count the Number of Trailing Zero Bits.
<pre>
unsigned __int64 _tzcnt_u64(unsigned __int64 src);
</pre>
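Likewise, a simple loop that mirrors TZCNT on 64-bit operands. The well-defined result of 64 for an empty set is what makes the instruction convenient for scanning possibly empty bitboards. The name <code>tzcnt64</code> is hypothetical, and the loop illustrates the semantics only.

```c
#include <stdint.h>

/* Number of zero bits below the least significant one bit; 64 if src is 0. */
static inline unsigned tzcnt64(uint64_t src)
{
    unsigned n = 0;
    if (src == 0)
        return 64;
    while ((src & 1u) == 0) {
        src >>= 1;
        ++n;
    }
    return n;
}
```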

=See also=
* [[SSE4#ABM|ABM]]
* [[AVX]]
* [[AVX2]]
* [[BitScan]]
* [[Bit-Twiddling]]
* [[BMI2]]
* [[General Setwise Operations]]
* [[TBM]]

=Manuals=
* [http://software.intel.com/file/36945 Intel AVX and AVX2 Programming Reference] (pdf)
* [https://www.amd.com/system/files/TechDocs/24594.pdf AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions] (pdf) <ref>Moved BMI and TBM instructions from Volume 4 to Volume 3 in September 2011</ref>
* [https://www.amd.com/system/files/TechDocs/47414_15h_sw_opt_guide.pdf Software Optimization Guide for AMD Family 15h Processors] (pdf) 9.8 Optimizing with BMI and TBM Instructions, p. 163

=External Links=
* [https://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets Bit Manipulation Instruction Sets from Wikipedia]
* [https://software.intel.com/sites/landingpage/IntrinsicsGuide/# Intel Intrinsics Guide]

=References=
<references />
'''[[x86-64|Up one Level]]'''
