X86-64




x86-64 or x64 is a 64-bit extension of x86, designed by AMD as the Hammer or K8 architecture and introduced with the Athlon 64 and Opteron CPUs. It has been cloned by Intel under the name EM64T and later Intel 64. Besides the 64-bit general purpose extensions, x86-64 supports the MMX and x87 instruction sets as well as the 128-bit SSE and SSE2 instruction sets. Depending on the processor, as reported by the CPUID instruction, further SIMD Streaming Extensions may be available, such as SSE3, SSSE3 (initially Intel only), SSE4 (Core2, K10), AVX, AVX2 and AVX-512, as well as AMD's [https://en.wikipedia.org/wiki/3DNow! 3DNow!], Enhanced 3DNow! and XOP.
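Which extensions a given processor reports can be checked at run time through CPUID. The following is a minimal sketch, assuming GCC or Clang on an x86-64 target, where `<cpuid.h>` provides `__get_cpuid`. The type `cpu_features` and the function `query_cpu_features` are illustrative names, not part of any library. The bit positions are those of CPUID leaf 1.

```c
#include <stdint.h>
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define HAVE_CPUID 1
#else
#define HAVE_CPUID 0
#endif

/* Hypothetical helper type: one flag per extension, 1 if reported by CPUID. */
typedef struct {
    int sse2, sse3, ssse3, sse41, sse42, popcnt, avx;
} cpu_features;

/* Returns 1 if CPUID leaf 1 could be read, 0 otherwise (all flags stay 0). */
static int query_cpu_features(cpu_features *f)
{
    cpu_features none = {0, 0, 0, 0, 0, 0, 0};
    *f = none;
#if HAVE_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    f->sse2   = (edx >> 26) & 1;
    f->sse3   = (ecx >> 0)  & 1;
    f->ssse3  = (ecx >> 9)  & 1;
    f->sse41  = (ecx >> 19) & 1;
    f->sse42  = (ecx >> 20) & 1;
    f->popcnt = (ecx >> 23) & 1;
    /* Processor support only: using AVX also needs operating system
       support (OSXSAVE and XGETBV), which this sketch does not check. */
    f->avx    = (ecx >> 28) & 1;
    return 1;
#else
    return 0;
#endif
}
```

Since SSE2 is part of the x86-64 baseline, `sse2` should always be set on a 64-bit x86 processor. The other flags vary by model.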

=Register File=
x86-64 doubles the number of x86 general purpose and XMM registers, from eight to sixteen each.

General Purpose
The 16 general purpose registers may be treated as 64-bit Quad Word (bitboard), 32-bit Double Word, 16-bit Word, and low Byte or, for some registers, high Byte.
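The different views of one register can be mimicked in C by truncating a 64-bit value. The sketch below uses illustrative function names, and the comments name the RAX family as the usual example. The high-byte form (AH, BH, CH, DH) exists only for the four legacy registers, which is why that view is only partly available.

```c
#include <stdint.h>

/* Views of one 64-bit register value, lowest bits first. */
static uint32_t double_word(uint64_t r) { return (uint32_t)r; }        /* EAX: bits 0..31 */
static uint16_t word(uint64_t r)        { return (uint16_t)r; }        /* AX:  bits 0..15 */
static uint8_t  low_byte(uint64_t r)    { return (uint8_t)r; }         /* AL:  bits 0..7  */
static uint8_t  high_byte(uint64_t r)   { return (uint8_t)(r >> 8); }  /* AH:  bits 8..15 */
```

For the value 0x1122334455667788 these return 0x55667788, 0x7788, 0x88 and 0x77 respectively.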

MMX
Eight 64-bit MMX-Registers: MM0 - MM7. Treated as Double or Quad Word, or as vector of two Floats or Double Words, four Words or eight Bytes.

SSE/SSE*
Sixteen 128-bit XMM-Registers: XMM0 - XMM15. Treated as vector of two Doubles or Quad Words, as vector of four Floats or Double Words, and as vector of eight Words or 16 Bytes.
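As an illustration of the "vector of two Quad Words" view, two bitboards can be packed into one XMM register and processed with a single instruction. This is a minimal sketch using the SSE2 intrinsics from `<emmintrin.h>`. The function name `shift_pair_left8` is hypothetical, and a scalar fallback is included for compilers that do not target SSE2.

```c
#include <stdint.h>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  /* SSE2 intrinsics; SSE2 is part of the x86-64 baseline */
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: shift two bitboards left by 8 bits at once. */
static void shift_pair_left8(uint64_t a, uint64_t b, uint64_t out[2])
{
#if HAVE_SSE2
    /* Low 64-bit lane holds a, high lane holds b. */
    __m128i v = _mm_set_epi64x((long long)b, (long long)a);
    /* Each 64-bit lane is shifted independently of the other. */
    v = _mm_slli_epi64(v, 8);
    /* Unaligned store: out[0] = a << 8, out[1] = b << 8. */
    _mm_storeu_si128((__m128i *)out, v);
#else
    out[0] = a << 8;
    out[1] = b << 8;
#endif
}
```

Under a board mapping where one rank spans eight bits, such a shift could for instance advance the pawns of two sets by one rank in a single operation.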

AVX, AVX2/XOP
Intel Sandy Bridge and AMD Bulldozer: sixteen 256-bit YMM-Registers: YMM0 - YMM15 (the XMM registers are shared as their lower halves). Treated as vector of four Doubles or Quad Words, as vector of eight Floats or Double Words, and as vector of 16 Words or 32 Bytes.

AVX-512
Intel Xeon Phi (2015): 32 512-bit ZMM-Registers: ZMM0 - ZMM31, and eight vector mask registers.

=Instructions=
Instructions useful for bitboard applications are by default not supported by high-level programming languages. They are available through (inline) Assembly or through the compiler intrinsics of various C-Compilers.
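As an example of the intrinsic route, population count and forward bit scan can be wrapped as follows. This is a sketch, not a definitive implementation: it assumes the GCC/Clang builtins `__builtin_popcountll` and `__builtin_ctzll`, or the MSVC intrinsics `__popcnt64` and `_BitScanForward64`, and falls back to plain C otherwise. The wrapper names `pop_count` and `bit_scan_forward` are illustrative.

```c
#include <assert.h>
#include <stdint.h>
#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
#endif

/* Number of set bits in a bitboard. */
static int pop_count(uint64_t bb)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcountll(bb);
#elif defined(_MSC_VER) && defined(_M_X64)
    return (int)__popcnt64(bb);
#else
    int n = 0;
    for (; bb; bb &= bb - 1)  /* clear the lowest set bit each round */
        n++;
    return n;
#endif
}

/* Index (0..63) of the least significant set bit; bb must be non-zero. */
static int bit_scan_forward(uint64_t bb)
{
    assert(bb != 0);
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_ctzll(bb);
#elif defined(_MSC_VER) && defined(_M_X64)
    unsigned long idx;
    _BitScanForward64(&idx, bb);
    return (int)idx;
#else
    int i = 0;
    while (!(bb & 1)) { bb >>= 1; i++; }
    return i;
#endif
}
```

Whether the builtin compiles to a single POPCNT instruction depends on the target options given to the compiler (for example `-mpopcnt` with GCC). Without them the compiler typically emits a software routine.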

Bit-Manipulation

 * ABM
 * BMI1
 * BMI2
 * TBM
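What some of these instructions compute can be written in portable C. The following sketch emulates three BMI1 operations (BLSI, BLSR, BLSMSK) and the BMI2 operation PEXT. The function names are illustrative. On hardware with BMI1/BMI2, the corresponding intrinsics `_blsi_u64`, `_blsr_u64`, `_blsmsk_u64` and `_pext_u64` from `<immintrin.h>` would be used instead, which requires enabling those instruction sets in the compiler.

```c
#include <stdint.h>

/* BLSI: isolate the lowest set bit. */
static uint64_t blsi(uint64_t x)   { return x & (0 - x); }

/* BLSR: reset the lowest set bit. */
static uint64_t blsr(uint64_t x)   { return x & (x - 1); }

/* BLSMSK: mask of all bits up to and including the lowest set bit. */
static uint64_t blsmsk(uint64_t x) { return x ^ (x - 1); }

/* PEXT: gather the bits of src selected by mask into the low bits of the result. */
static uint64_t pext(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        if (src & blsi(mask))  /* test the lowest remaining mask bit */
            result |= bit;
        mask = blsr(mask);     /* move on to the next mask bit */
    }
    return result;
}
```

The software loop runs once per set bit of the mask, whereas the hardware instruction does the same work in one operation. That difference is why PEXT is attractive for indexing sliding piece attack tables.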

SSE2
=Software=

Operating Systems

 * Linux 64
 * Tru64 UNIX
 * BSD
 * Mac OS X
 * Windows 64
 * Solaris

Assembly

 * MASM64
 * GNU Assembler

C-Compiler

 * Microsoft Visual C++
 * Intel-C
 * GCC

=See also=
 * asmFish
 * AMX
 * AVX
 * AVX2
 * AVX-512
 * Bitboards
 * General Setwise Operations
 * BitScan
 * BMI1
 * BMI2
 * Itanium
 * NUMA
 * SIMD and SWAR Techniques
 * SMP
 * SSE2
 * SSE3
 * SSSE3
 * SSE4
 * SSE5
 * x86
 * TBM
 * XOP

=Publications=
 * Georg Hager, Jan Treibig, Gerhard Wellein (2013). The Practitioner's Cookbook for Good Parallel Performance on Multi- and Many-Core Systems. RRZE, SC13, slides as pdf
 * S. Ali Mirsoleimani, Aske Plaat, Jaap van den Herik, Jos Vermaseren (2014). Performance analysis of a 240 thread tournament level MCTS Go program on the Intel Xeon Phi. CoRR abs/1409.4297 » Go
 * S. Ali Mirsoleimani, Aske Plaat, Jaap van den Herik, Jos Vermaseren (2015). Scaling Monte Carlo Tree Search on Intel Xeon Phi. CoRR abs/1507.04383 » Hex, MCTS, Parallel Search

=Manuals=

Agner Fog

 * Agner Fog's manuals
 * Agner's CPU blog by Agner Fog

AMD

 * AMD Tech Docs

Instructions

 * Volume 1: Application Programming (pdf)
 * Volume 2: System Programming (pdf)
 * Volume 3: General-Purpose and System Instructions (pdf)
 * Volume 4: 128-Bit and 256-Bit Media Instructions (pdf)
 * Volume 5: 64-Bit Media and x87 Floating-Point Instructions (pdf)

Optimization Guides

 * Software Optimization Guide for AMD64 Processors (pdf)
 * Software Optimization Guide for AMD Family 15h Processors (pdf)
 * Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ ccNUMA Multiprocessor Systems (pdf)

Intel

Instructions

 * Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2A: Instruction Set Reference, A-M (pdf)
 * Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2B: Instruction Set Reference, N-Z (pdf)
 * Intel-AVX-Programming-Reference (pdf)

Optimization Guides

 * Intel® 64 and IA-32 Architectures Optimization Reference Manual (pdf)

=Forum Posts=

2003 ...

 * IA-64 vs OOOE (attn Taylor, Hyatt) by Tom Kerrigan, CCC, February 11, 2003 » Itanium
 * Opteron NUMA/SMP question by Matthew Hull, CCC, February 09, 2005 » NUMA, SMP
 * core2 popcnt by Frank Phillips, CCC, February 13, 2009 » Population Count

2010 ...

 * Ivy Bridge vs Sandy Bridge for computer chess by Larry Kaufman, CCC, September 15, 2012
 * What is your take on AMD's new processor? by Tano-Urayoan Russi Roman, CCC, October 24, 2012
 * Intel i3 L2 cache by Harm Geert Muller, CCC, January 28, 2014 » Memory
 * Core Port Saturation by Natale Galioto, CCC, April 14, 2014

2015 ...

 * syzygy users (and Ronald) by Robert Hyatt, CCC, September 29, 2016 » BitScan, Population Count
 * New AMD processors by Ingo Althöfer, The Computer-go Archives, March 03, 2017
 * Ryzen and BMI2: Strange behavior and high latencies by DonnieTinyHands, Reddit, March 20, 2017 » AMD, BMI2
 * Is anyone here already using a Ryzen 1800X processor ? by Aloisio Ponti, CCC, March 26, 2017 » AMD
 * Intel CPU performance-loss by security-patch?!? by Stefan Pohl, CCC, January 03, 2018
 * Re: Komodo 11.3 by Mark Lefler, CCC, March 04, 2018 » AMD, BMI2 PEXT, Komodo 11.3
 * Some x64 assembler for the curious by Michael Sherwin, CCC, March 22, 2019 » Assembly
 * Ryzen problems - AGAIN! by noobpwnftw, CCC, October 22, 2019

2020 ...

 * Intel AMX with TMUL on Xeon Sapphire Rapids (2021?) by Srdja Matovic, CCC, July 05, 2020 » AMX

=External Links=
 * x86-64 from Wikipedia
 * x86-64 calling conventions from Wikipedia
 * x86 Addressing modes from Wikipedia
 * X32 ABI from Wikipedia
 * Stack frame layout on x86-64 from Eli Bendersky's website, September 06, 2011 » Stack
 * Introduction to x64 Assembly by Chris Lomont, March 2012

AMD

 * AMD K8 from Wikipedia
 * Athlon 64
 * Athlon 64 FX
 * Opteron
 * Athlon 64 X2 dual-core
 * Turion 64 X2 dual-core
 * Inside AMD's Hammer: the 64-bit architecture behind the Opteron and Athlon 64 by Jon Stokes, ars technica, February 01, 2005
 * Understanding the detailed Architecture of AMD's 64 bit Core by Hans de Vries, September 21, 2003
 * AMD K8 from 7-Zip LZMA Benchmark
 * AMD K9 from Wikipedia
 * AMD 10h from Wikipedia
 * AMD K10 (Phenom) from 7-Zip LZMA Benchmark
 * Phenom triple-core, quad-core
 * Bobcat (microarchitecture) from Wikipedia
 * Bulldozer (microarchitecture) from Wikipedia
 * Piledriver (microarchitecture) from Wikipedia
 * Steamroller (microarchitecture) from Wikipedia
 * Excavator (microarchitecture) from Wikipedia
 * Zen (microarchitecture) from Wikipedia

Intel

 * EM64T from Wikipedia
 * Tick-Tock model from Wikipedia
 * Intel Core (microarchitecture) from Wikipedia
 * Intel Atom from Wikipedia
 * Nehalem (microarchitecture) from Wikipedia
 * Sandy Bridge (microarchitecture) from Wikipedia
 * Intel Sandy Bridge from 7-Zip LZMA Benchmark
 * Ivy Bridge (microarchitecture) from Wikipedia
 * Intel Ivy Bridge from 7-Zip LZMA Benchmark
 * Haswell (microarchitecture) from Wikipedia
 * Intel Haswell from 7-Zip LZMA Benchmark
 * Intel's Haswell CPU Microarchitecture by David Kanter, November 13, 2012
 * Broadwell (microarchitecture) from Wikipedia
 * Skylake (microarchitecture) from Wikipedia
 * Kaby Lake from Wikipedia
 * Xeon Phi from Wikipedia

Instruction Sets

 * x87 from Wikipedia
 * MMX from Wikipedia
 * 3DNow! from Wikipedia
 * Streaming SIMD Extensions from Wikipedia
 * SSE2 from Wikipedia » SSE2
 * SSE3 from Wikipedia » SSE3
 * SSSE3 from Wikipedia » SSSE3
 * SSE4 from Wikipedia » SSE4
 * SSE4a from Wikipedia
 * SSE5 from Wikipedia » SSE5
 * XOP instruction set from Wikipedia » XOP
 * Advanced Vector Extensions (AVX) from Wikipedia » AVX
 * AVX-512 from Wikipedia » AVX-512


 * Transactional Synchronization Extensions (TSX) from Wikipedia (Haswell)
 * Intel Intrinsics Guide
 * Advanced Matrix Extension (AMX) - x86 - WikiChip

Security Vulnerability

 * Meltdown (security vulnerability) from Wikipedia
 * Spectre (security vulnerability) from Wikipedia
 * Project Zero: Reading privileged memory with a side-channel by Jann Horn, Project Zero, January 03, 2018
