SSE4

SSE4 is an ambiguous name for a group of almost disjoint x86 instruction set extensions from Intel and AMD: SSE4.1 and SSE4.2, both by Intel, and SSE4a by AMD.
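
Because the extensions are almost disjoint, a program that wants to use them at run time has to test for each one separately. The following is a minimal sketch, assuming GCC or Clang on an x86 target, where the `__builtin_cpu_supports` built-in is available. The struct `sse4_support` and the function `detect_sse4` are hypothetical names chosen for illustration; on other compilers or architectures the sketch simply reports that nothing is supported.

```c
/* Hypothetical result type: one flag per extension. */
struct sse4_support {
    int sse41;   /* Intel SSE4.1 */
    int sse42;   /* Intel SSE4.2 */
    int sse4a;   /* AMD SSE4a */
    int popcnt;  /* POPCNT has its own feature flag */
};

/* Hypothetical helper: query the running CPU for each extension. */
static struct sse4_support detect_sse4(void)
{
    struct sse4_support s = {0, 0, 0, 0};
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    /* Assumption: the compiler accepts these feature names. */
    s.sse41  = __builtin_cpu_supports("sse4.1") != 0;
    s.sse42  = __builtin_cpu_supports("sse4.2") != 0;
    s.sse4a  = __builtin_cpu_supports("sse4a")  != 0;
    s.popcnt = __builtin_cpu_supports("popcnt") != 0;
#endif
    return s;
}
```

A caller would typically run `detect_sse4()` once at startup and select code paths from the individual flags rather than from a single "SSE4" switch.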

=Intel=

SSE4.1
Intel introduced SSE4.1, with 47 new instructions, in 2007 with the Penryn generation of Core 2 processors, based on the Core microarchitecture.

See Vulnerable on distant Checks with SSE4.

SSE4.2
SSE4.2 was introduced in 2008 with the Nehalem-based Core i7 and added 7 new instructions.

STTNI
SSE4.2 includes four String and Text New Instructions (STTNI): PCMPESTRI, PCMPESTRM, PCMPISTRI and PCMPISTRM. They work on 128-bit XMM SIMD registers as well as general purpose registers and flags, and perform character searches and comparisons on two operands of up to 16 bytes at a time. PCMPESTRI, for example, stands for Packed Compare Explicit Length Strings, Return Index.
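
As an illustration of PCMPESTRI, the sketch below looks for the first byte of a text block that matches any byte of a small character set, using the `_mm_cmpestri` intrinsic in "equal any" mode. It is a minimal sketch, not a tuned routine: the function name `first_of_any16` is hypothetical, both inputs are limited to 16 bytes, and a plain scalar loop is used when the compiler is not targeting SSE4.2 (for example when `-msse4.2` is not given).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#if defined(__SSE4_2__)
#include <nmmintrin.h>  /* SSE4.2 intrinsics */
#endif

/* Hypothetical helper: index of the first byte in text[0..text_len) that
   equals any byte in set[0..set_len), or 16 if there is none.
   Both lengths must be in the range 0..16. */
static int first_of_any16(const char *set, int set_len,
                          const char *text, int text_len)
{
    assert(set_len >= 0 && set_len <= 16);
    assert(text_len >= 0 && text_len <= 16);
#if defined(__SSE4_2__)
    /* Copy into zero-padded buffers so that 16-byte loads are safe. */
    char sbuf[16] = {0}, tbuf[16] = {0};
    memcpy(sbuf, set, (size_t)set_len);
    memcpy(tbuf, text, (size_t)text_len);
    __m128i s = _mm_loadu_si128((const __m128i *)sbuf);
    __m128i t = _mm_loadu_si128((const __m128i *)tbuf);
    /* Explicit lengths: bytes beyond set_len/text_len are ignored. */
    return _mm_cmpestri(s, set_len, t, text_len,
                        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY |
                        _SIDD_LEAST_SIGNIFICANT);
#else
    /* Scalar fallback with the same result convention. */
    for (int i = 0; i < text_len; i++)
        for (int j = 0; j < set_len; j++)
            if (text[i] == set[j])
                return i;
    return 16;
#endif
}
```

For example, `first_of_any16("aeiou", 5, "rhythm and", 10)` locates the first vowel of the text, and a result of 16 signals that no byte matched.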

ATAI
POPCNT and CRC32, which work on general purpose registers, were dubbed Application-Targeted Accelerator Instructions (ATAI) as a subset of SSE4.2, but should be considered a disjoint instruction set as far as SSE4 compiler optimizations are concerned.
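
The two instructions can be reached from C through the `_mm_popcnt_u64` and `_mm_crc32_u8` intrinsics. The following minimal sketch wraps them with portable fallbacks; the macro `HAVE_ATAI` and the function names `popcount64` and `crc32c` are hypothetical. The CRC32 instruction uses the Castagnoli polynomial (CRC-32C), and the initial and final inversion shown here follow the usual CRC-32C convention rather than anything the instruction does by itself.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE4_2__) && defined(__POPCNT__) && defined(__x86_64__)
#include <nmmintrin.h>
#define HAVE_ATAI 1   /* hypothetical macro: hardware path enabled */
#else
#define HAVE_ATAI 0
#endif

/* Number of set bits in a 64-bit word, e.g. a bitboard. */
static int popcount64(uint64_t x)
{
#if HAVE_ATAI
    return (int)_mm_popcnt_u64(x);
#else
    int n = 0;
    while (x) {        /* clear the lowest set bit each round */
        x &= x - 1;
        n++;
    }
    return n;
#endif
}

/* CRC-32C (Castagnoli) over a byte buffer. */
static uint32_t crc32c(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint32_t crc = 0xFFFFFFFFu;            /* conventional initial value */
    for (size_t i = 0; i < len; i++) {
#if HAVE_ATAI
        crc = _mm_crc32_u8(crc, p[i]);
#else
        crc ^= p[i];
        for (int k = 0; k < 8; k++)        /* bitwise, reflected polynomial */
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
#endif
    }
    return crc ^ 0xFFFFFFFFu;              /* conventional final inversion */
}
```

Guarding on `__POPCNT__` in addition to `__SSE4_2__` reflects the point above: a compiler may treat population count as a feature of its own rather than as part of SSE4.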

=AMD SSE4a=

SSE4a was introduced by AMD with the K10 (Barcelona) microarchitecture.

SIMD
SSE4a added two kinds of SIMD instructions working on XMM registers: combined mask-shift instructions (EXTRQ/INSERTQ) and scalar streaming store instructions (MOVNTSD/MOVNTSS). These instructions are not available in Intel's SSE4.

Advanced Bit Manipulation
The two Advanced Bit Manipulation (ABM) instructions, POPCNT (population count) and LZCNT (leading zero count), work on general purpose registers. Leading zero count was not part of Intel's Application-Targeted Accelerator Instructions of SSE4.2, but Intel later added it alongside BMI.
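
A minimal sketch of a leading zero count wrapper follows, assuming the `_lzcnt_u64` intrinsic from `<immintrin.h>` when the compiler targets LZCNT (for example with `-mlzcnt`), and a scalar loop otherwise. The function name `leading_zeros64` is hypothetical. Unlike the older BSR instruction, LZCNT is defined for a zero input and returns the operand width, which the fallback mirrors.

```c
#include <stdint.h>
#if defined(__LZCNT__) && defined(__x86_64__)
#include <immintrin.h>
#endif

/* Number of leading zero bits in a 64-bit word; 64 for x == 0. */
static int leading_zeros64(uint64_t x)
{
#if defined(__LZCNT__) && defined(__x86_64__)
    return (int)_lzcnt_u64(x);
#else
    if (x == 0)
        return 64;               /* same convention as LZCNT */
    int n = 0;
    while (!(x >> 63)) {         /* shift left until the top bit is set */
        x <<= 1;
        n++;
    }
    return n;
#endif
}
```

The index of the most significant set bit of a non-zero word is then `63 - leading_zeros64(x)`.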

=See also=
 * AltiVec
 * AVX
 * BMI
 * MMX
 * SIMD and SWAR Techniques
 * SSE
 * SSE2
 * SSE3
 * SSSE3
 * SSE5
 * TBM
 * Vulnerable on distant Checks with SSE4
 * XOP

=Manuals=
 * Intel® SSE4 Programming Reference (pdf)
 * Software Optimization Guide for AMD Family 10h and 12h Processors (pdf)

=Forum Posts=
 * using Popcount and Prefetch with SSE4 hardware support by Engin Üstün, CCC, May 19, 2012 » Population Count, Memory

=External Links=
 * SSE4 from Wikipedia
 * MSDN - Streaming SIMD Extensions 4 Instructions
 * MSDN - SSE4A and Advanced Bit Manipulation Intrinsics
 * SSEPlus Project Documentation
 * Agner's CPU blog by Agner Fog
 * Intel Intrinsics Guide
