Changes


SIMD and SWAR Techniques

265 bytes added, 12:41, 18 November 2022
m
SIMD Instruction Sets: added POWER and SVE/SVE2
* [[SSE5]] by [[AMD]] (proposed but not implemented, replaced by [[XOP]] <ref>[https://en.wikipedia.org/wiki/SSE5 SSE5 from Wikipedia]</ref>)
* [[AltiVec]] on [[PowerPC#G4|PowerPC G4]], [[PowerPC#G5|PowerPC G5]]
* [[VMX]] since [[POWER | POWER6]]
* [[ARM Helium]]
* [[ARM NEON]]
* [[ARM SVE]]<ref>[https://en.wikipedia.org/wiki/AArch64#Scalable_Vector_Extension_(SVE) SVE from Wikipedia]</ref>, [[ARM SVE2]] <ref>[https://en.wikipedia.org/wiki/AArch64#ARMv8.5-A_and_ARMv9.0-A SVE2 from Wikipedia]</ref>
* [[AVX]] by [[Intel]]
* [[AVX2]] by [[Intel]]
* [[XOP]] by [[AMD]]
<span id="SWAR"></span>
 
=SWAR Arithmetic=
To apply addition and subtraction on vectors of bit-aggregates or [https://en.wikipedia.org/wiki/Bit_field bit-field structures] within a general purpose register, one has to take care that carries and borrows don't cross element boundaries. Thus the need to mask off all most significant bits (H) and add in two steps: one 'add' with the MSBs clear, and one add modulo 2, aka '[[General Setwise Operations#ExclusiveOr|xor]]', for the MSBs themselves. For bytewise (rankwise) math inside a 64-bit register, H is <span style="background-color: #e3e3e3;">0x8080808080808080</span> and L is <span style="background-color: #e3e3e3;">0x0101010101010101</span>.
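The two-step scheme described above can be sketched in C as follows. This is a minimal illustration for eight byte lanes in a 64-bit word, not a definitive implementation. The names <code>SWAR_H</code>, <code>swar_add8</code>, <code>swar_sub8</code> and <code>swar_demo</code> are hypothetical and chosen only for this example. The subtraction variant, which sets the MSBs of the minuend so that no borrow can leave a lane, is assumed here as the natural counterpart of the addition.

```c
#include <assert.h>
#include <stdint.h>

/* H: the most significant bit of every byte lane (value from the text). */
#define SWAR_H UINT64_C(0x8080808080808080)

/* Bytewise x + y (mod 256 per lane); name is hypothetical. */
static inline uint64_t swar_add8(uint64_t x, uint64_t y)
{
    /* Step 1: add the low 7 bits of each lane. The largest lane sum is
       0x7F + 0x7F = 0xFE, so no carry can reach the neighbouring lane. */
    uint64_t low = (x & ~SWAR_H) + (y & ~SWAR_H);
    /* Step 2: add the MSBs modulo 2 (xor) on top of the carry that
       step 1 left in bit 7 of each lane. */
    return low ^ ((x ^ y) & SWAR_H);
}

/* Bytewise x - y (mod 256 per lane); assumed counterpart of the add. */
static inline uint64_t swar_sub8(uint64_t x, uint64_t y)
{
    /* Step 1: set the MSB of each minuend lane and clear it in the
       subtrahend, so every lane difference is at least 0x80 - 0x7F = 1
       and no borrow can leave the lane. */
    uint64_t low = (x | SWAR_H) - (y & ~SWAR_H);
    /* Step 2: bit 7 now holds "no borrow"; xor with x7 ^ ~y7 turns it
       into the true MSB x7 ^ y7 ^ borrow. */
    return low ^ ((x ^ ~y) & SWAR_H);
}

/* Small self-check showing that lanes wrap independently. */
static void swar_demo(void)
{
    /* Lane 0: 0xFF + 0x01 wraps to 0x00 without touching lane 1. */
    assert(swar_add8(UINT64_C(0x00FF), UINT64_C(0x0001)) == UINT64_C(0x0000));
    /* Lane 0: 0x00 - 0x01 wraps to 0xFF; lane 1: 0x01 - 0x00 stays 0x01. */
    assert(swar_sub8(UINT64_C(0x0100), UINT64_C(0x0001)) == UINT64_C(0x01FF));
}
```

Both routines use only a handful of scalar operations and no lane-by-lane loop, which is the point of keeping the carry or borrow confined to the low seven bits before the MSBs are fixed up with xor.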