Changes

GPU

731 bytes added, 15:52, 16 April 2021

→‎Architectures and Physical Hardware: added Intel and ARM

= Architectures and Physical Hardware =

The market is split into two categories: integrated and discrete GPUs. The first is the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for 3D CAD/CGI programs, and server brands for big-data and number-crunching workloads. Each brand offers a different feature set in drivers, VRAM, or computation abilities.

== ARM Mali ==

The Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. OpenCL support has been offered since Midgard (2012), which introduced the unified shader model.

=== Bifrost (2016) and Valhall (2019) ===

* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]

=== Midgard (2012) ===

* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]

== AMD ==

AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional, and Radeon Instinct for server.

=== CDNA ===

The CDNA architecture, used in the MI100 HPC GPU with Matrix Cores, was unveiled in November 2020.

* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]

=== Navi 2X RDNA 2.0 ===

* [https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA 2 from Wikipedia]
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]

RDNA 2.0 cards were unveiled on October 28, 2020.

* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series Radeon RX 6000 series from Wikipedia]

=== Navi RDNA 1.0 ===

* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]
* [https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA (microarchitecture) from Wikipedia]

RDNA cards were first released in 2019. RDNA is a major change for AMD cards: the underlying hardware supports both Wave32 and Wave64 gangs of threads. Compute Units have 2x32 wide SIMD units, each of which executes 32 threads per clock tick. A Wave64 workgroup will execute on a single SIMD unit, but over two clock ticks. Note that a Wave32 still has 5 cycles of latency before registers can be reused, so a Wave64 executing over two clock ticks will have fewer stalls than a Wave32.
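The Wave32 versus Wave64 trade-off described for RDNA can be illustrated with a small Python sketch. It is a deliberately simplified model, not AMD's scheduler: it assumes the idle time between two back-to-back dependent instructions is the 5-cycle register latency minus the ticks spent issuing the instruction, and the names <code>issue_ticks</code> and <code>stall_ticks</code> are hypothetical helpers introduced only for this illustration.

```python
import math

SIMD_WIDTH = 32        # threads executed per clock tick by one RDNA SIMD unit (from the text)
REGISTER_LATENCY = 5   # cycles before a register can be reused (from the text)


def issue_ticks(wave_size: int, simd_width: int = SIMD_WIDTH) -> int:
    """Clock ticks needed to push one instruction through for a whole wave."""
    return math.ceil(wave_size / simd_width)


def stall_ticks(wave_size: int, latency: int = REGISTER_LATENCY) -> int:
    """Idle ticks between two dependent instructions of the same wave.

    Assumption: the next dependent instruction can start once `latency`
    ticks have passed since the previous one began issuing.
    """
    return max(0, latency - issue_ticks(wave_size))


if __name__ == "__main__":
    for wave in (32, 64):
        print(f"Wave{wave}: issue={issue_ticks(wave)} tick(s), "
              f"stall={stall_ticks(wave)} tick(s)")
```

Under these assumptions a Wave64 hides one more cycle of the latency behind its own second issue tick, which is the effect the paragraph above describes. Real hardware also hides latency by interleaving other waves, which this sketch ignores.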
* [https://en.wikipedia.org/wiki/Radeon_RX_5000_series Radeon RX 5000 series from Wikipedia]
* Radeon 5700 XT
* Radeon 5700

=== Vega GCN 5th gen ===

[https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]

Vega cards were first released in 2017. Vega is the last in the line of the GCN Architecture: 64 threads per wavefront. Each compute unit contains 4x SIMD units, supporting a total of 40 wavefronts per compute unit (a queue of 10 wavefronts per SIMD unit). Each SIMD unit contains 16 vALUs for general compute + 1 sALU for branching and constant logic. Each SIMD unit executes the same instruction over four clock ticks (16 vALUs x 4 clock ticks == 64 threads per wavefront).

Vega specifically added packed FP16 instructions, such as dot-product, packed add and packed multiply. From a programming perspective, these packed FP16 instructions are SIMD-within-SIMD: each SIMD thread can execute its own SIMD FP16 instruction, akin to AVX or SSE on the x64 architecture.

* Radeon VII
* Vega64
* Vega56

=== Polaris GCN 4th gen ===

Polaris cards were first released in 2016 under the AMD Radeon 400 series name.

[https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]

* RX 580
* RX 570
* RX 560

== Intel ==

=== Intel Xe 'Gen12' ===

The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).

* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Xe 'Gen12' GPUs on Wikipedia]
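Looking back at the Vega GCN 5th gen figures in the AMD section, the wavefront arithmetic and the "SIMD-within-SIMD" idea of packed FP16 can be sketched in Python. This is only an illustration under stated assumptions: the constants are taken from the Vega paragraph, while <code>pack_half2</code>, <code>unpack_half2</code> and <code>packed_add</code> are hypothetical helpers that emulate two FP16 values sharing one 32-bit register. They are not Vega instructions, and rounding is done by Python rather than by GPU hardware.

```python
import struct

# Figures quoted in the Vega paragraph
VALUS_PER_SIMD = 16
TICKS_PER_INSTRUCTION = 4
SIMDS_PER_CU = 4
WAVEFRONTS_PER_SIMD = 10

threads_per_wavefront = VALUS_PER_SIMD * TICKS_PER_INSTRUCTION   # 16 x 4
wavefronts_per_cu = SIMDS_PER_CU * WAVEFRONTS_PER_SIMD           # 4 x 10
threads_in_flight_per_cu = threads_per_wavefront * wavefronts_per_cu


def pack_half2(lo: float, hi: float) -> int:
    """Pack two FP16 values into one 32-bit word (low half first)."""
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]


def unpack_half2(word: int) -> tuple:
    """Split a 32-bit word back into its two FP16 values."""
    return struct.unpack("<2e", struct.pack("<I", word))


def packed_add(a: int, b: int) -> int:
    """Emulate a packed FP16 add: one operation, two independent lanes.

    The sums are computed in Python floats and rounded to FP16 on packing.
    """
    a_lo, a_hi = unpack_half2(a)
    b_lo, b_hi = unpack_half2(b)
    return pack_half2(a_lo + b_lo, a_hi + b_hi)


if __name__ == "__main__":
    print(threads_per_wavefront, wavefronts_per_cu, threads_in_flight_per_cu)
    print(unpack_half2(packed_add(pack_half2(1.0, 2.0), pack_half2(0.5, 0.25))))
```

Each of the 64 threads of a wavefront would carry such a packed word, so one packed instruction processes two FP16 values per thread.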

==Nvidia==

Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server.

=== Ampere Architecture ===

* GTX 1030

=== Maxwell Architecture ===

[https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Architecture Whitepaper on archive.org]

[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.

=Instruction Throughput=

Smatovic

422

edits
