SIMD and SWAR Techniques

596 bytes added, 22:44, 17 May 2023

→‎SIMD Instruction Sets: fix in ref

* [[SSE2]], [[SSE3]], [[SSSE3]] and [[SSE4]] on [[x86]] and [[x86-64]]

* [[SSE5]] by [[AMD]] (proposed but not implemented, replaced by [[XOP]] <ref>[https://en.wikipedia.org/wiki/SSE5 SSE5 from Wikipedia]</ref>)

* [[AltiVec]] on [[PowerPC#G4|PowerPC G4]], [[PowerPC#G5|PowerPC G5]], or VMX since [[POWER|POWER6]]

* [https://en.wikipedia.org/wiki/AltiVec#VSX_(Vector_Scalar_Extension) VSX] since [[POWER|POWER7]]

* [[Helium]] by [[ARM]]

* [[NEON]] by [[ARM]]

* [[SVE]] <ref>[https://en.wikipedia.org/wiki/AArch64#Scalable_Vector_Extension_(SVE) SVE from Wikipedia]</ref> and [[SVE2]] <ref>[https://en.wikipedia.org/wiki/SVE SVE2 from Wikipedia]</ref> by [[ARM]]

* [[AVX]] by [[Intel]]

* [[AVX2]] by [[Intel]]

* [[AVX-512]] by [[Intel]]

* [[XOP]] by [[AMD]]

* [[VIS]] <ref>[https://en.wikipedia.org/wiki/Visual_Instruction_Set VIS from Wikipedia]</ref> since [[SPARC]] v9

* [[RISC-V]] vector-set extension <ref>[https://en.wikipedia.org/wiki/RISC-V#Vector_set RISC-V vector-set from Wikipedia]</ref>

=SWAR Arithmetic=

To apply addition and subtraction to vectors of bit-aggregates or [https://en.wikipedia.org/wiki/Bit_field bit-field structures] within a general purpose register, one has to take care that carries and borrows do not cross field boundaries. Hence the need to mask off the most significant bit (H) of every field and to add in two steps: one 'add' with the MSBs cleared, and one add modulo 2, aka '[[General Setwise Operations#ExclusiveOr|xor]]', for the MSBs themselves. For bytewise (rankwise) math inside a 64-bit register, H is 0x8080808080808080 and L is 0x0101010101010101.
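The two-step scheme can be sketched in C for the bytewise case as follows. This is a minimal illustration, not a definitive implementation. The function names <code>swar_add_bytes</code>, <code>swar_sub_bytes</code> and <code>swar_inc_bytes</code> are made up for this example; only the constants H and L come from the text above.

```c
#include <stdint.h>

/* H marks the most significant bit of every byte, L the least significant. */
static const uint64_t H = 0x8080808080808080ULL;
static const uint64_t L = 0x0101010101010101ULL;

/* Bytewise x + y (hypothetical helper name).
 * Step 1: add with all MSBs cleared, so no carry can leave a byte.
 * Step 2: xor in the MSBs of both operands (add modulo 2). */
static uint64_t swar_add_bytes(uint64_t x, uint64_t y) {
    return ((x & ~H) + (y & ~H)) ^ ((x ^ y) & H);
}

/* Bytewise x - y (hypothetical helper name).
 * Setting the MSBs of x and clearing those of y guarantees that no
 * borrow crosses a byte boundary; the final xor restores the true MSBs. */
static uint64_t swar_sub_bytes(uint64_t x, uint64_t y) {
    return ((x | H) - (y & ~H)) ^ ((x ^ ~y) & H);
}

/* Increment every byte by one, wrapping each byte independently. */
static uint64_t swar_inc_bytes(uint64_t x) {
    return swar_add_bytes(x, L);
}
```

Each byte wraps modulo 256 on its own: adding 0x01 to a byte holding 0xFF yields 0x00 in that byte and leaves its neighbor untouched.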

Smatovic
