GPU

==Integer Instruction Throughput==
* INT8
: Some architectures like AMD [https://en.wikipedia.org/wiki/AMD_RX_Vega_series Vega] or Intel [https://en.wikipedia.org/wiki/Intel_Xe Xe] offer higher throughput with lower precision. They double the [https://en.wikipedia.org/wiki/FP16 FP16] and quadruple the [https://en.wikipedia.org/wiki/Integer_(computer_science)#Common_integral_data_types INT8] throughput.<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next#fifth Vega (GCN 5th generation) from Wikipedia]</ref><ref>[https://www.servethehome.com/intel-xe-sg1-hp-and-dg1-at-architecture-day-2020/intel-architecture-day-2020-xe-lp-int8-increase/ Xe-LP INT8 from ServeTheHome]</ref>
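: Packed INT8 math of this kind is exposed as dot-product instructions. Below is a minimal CUDA sketch of the idea, using Nvidia's <code>__dp4a</code> intrinsic (SM 6.1 and later); AMD and Intel expose comparable packed dot-product instructions through their own toolchains. Kernel and variable names are illustrative, not from a particular library:
<pre>
// Dot product of two INT8 vectors, four elements packed into each 32-bit word.
// __dp4a(a, b, c) returns c plus the 4-way signed INT8 dot product of a and b.
__global__ void dotInt8(const int *a, const int *b, int *result, int n32) {
    int acc = 0;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n32;
         i += gridDim.x * blockDim.x)
        acc = __dp4a(a[i], b[i], acc);  // 4 INT8 multiply-accumulates per instruction
    atomicAdd(result, acc);             // combine per-thread partial sums
}
</pre>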
==Floating Point Instruction Throughput==
==Tensors==
===Nvidia TensorCores===
: TensorCores were introduced with the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series. They offer FP16xFP16+FP32 matrix-multiply-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8 and INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TensorFloat-32 (TF32), FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref>
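: A minimal CUDA sketch of driving TensorCores through the WMMA API (<code>mma.h</code>, SM 7.0 and later): one warp computes a 16x16x16 FP16xFP16+FP32 tile, matching the unit described above. The tile sizes, leading dimensions and layouts here are illustrative assumptions:
<pre>
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// One warp multiplies a 16x16 FP16 tile A with a 16x16 FP16 tile B
// and accumulates into a 16x16 FP32 tile C (FP16xFP16+FP32).
__global__ void wmma16x16x16(const half *a, const half *b, float *c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> aFrag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> bFrag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> cFrag;

    wmma::fill_fragment(cFrag, 0.0f);            // C = 0
    wmma::load_matrix_sync(aFrag, a, 16);        // leading dimension 16
    wmma::load_matrix_sync(bFrag, b, 16);
    wmma::mma_sync(cFrag, aFrag, bFrag, cFrag);  // C += A x B on TensorCores
    wmma::store_matrix_sync(c, cFrag, 16, wmma::mem_row_major);
}
</pre>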
===AMD Matrix Cores===
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores, which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32.
===Intel XMX Cores===
: Intel added XMX, Xe Matrix eXtensions, units to some of its Xe GPU series, which accelerate matrix operations on lower-precision data types such as INT8 and FP16.