Changes

GPU

731 bytes added, 15:52, 16 April 2021
Architectures and Physical Hardware: added Intel and ARM
= Architectures and Physical Hardware =
The market is split into two categories: integrated and discrete GPUs. The first is the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for 3D CAD/CGI programs, and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in drivers, VRAM, or computation abilities.
== ARM Mali ==
The Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. Since Midgard (2012), which uses a unified shader model, OpenCL support has been offered.
=== Bifrost (2016) and Valhall (2019) ===

* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]

=== Midgard (2012) ===

* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]

== AMD ==
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server.

=== CDNA ===

The CDNA architecture, used in the MI100 HPC GPU with Matrix Cores, was unveiled in November 2020.

* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]

=== Navi 2X RDNA 2.0 ===

* [https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA 2 from Wikipedia]
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]

RDNA 2.0 cards were unveiled on October 28, 2020.

* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series Radeon RX 6000 series from Wikipedia]

=== Navi RDNA 1.0 ===

* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]
* [https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA (microarchitecture) from Wikipedia]

RDNA cards were first released in 2019. RDNA is a major change for AMD cards: the underlying hardware supports both Wave32 and Wave64 gangs of threads. Compute Units have 2x32 wide SIMD units, each of which executes 32 threads per clock tick. A Wave64 wavefront executes on a single SIMD unit, but over two clock ticks. Note that a Wave32 still has 5 cycles of latency before registers can be reused, so a Wave64 executing over two clock ticks will have fewer stalls than a Wave32.
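
The Wave32/Wave64 timing above can be illustrated with a small back-of-the-envelope model. This is only a sketch: the helper names <code>issue_ticks</code> and <code>stall_ticks</code> are hypothetical, and the stall formula (register-reuse latency minus issue ticks) is a simplifying assumption that ignores the scheduler switching to other wavefronts.

```python
import math

# Figures taken from the text above; the model itself is an assumption.
RDNA_SIMD_WIDTH = 32      # threads executed per clock tick by one SIMD unit
REUSE_LATENCY = 5         # cycles before a register can be reused


def issue_ticks(wave_size: int, simd_width: int) -> int:
    """Clock ticks one SIMD unit needs to issue an instruction for a whole wave."""
    return math.ceil(wave_size / simd_width)


def stall_ticks(wave_size: int, simd_width: int,
                reuse_latency: int = REUSE_LATENCY) -> int:
    """Idle ticks before a dependent instruction of the same wave can issue.

    Toy model: the wave keeps the SIMD busy for issue_ticks() cycles,
    and the remainder of the reuse latency is counted as a stall.
    """
    return max(0, reuse_latency - issue_ticks(wave_size, simd_width))


if __name__ == "__main__":
    for wave in (32, 64):
        print(f"Wave{wave}: issue={issue_ticks(wave, RDNA_SIMD_WIDTH)} "
              f"stall={stall_ticks(wave, RDNA_SIMD_WIDTH)}")
```

Under these assumptions a Wave64 spends two ticks issuing and therefore idles for fewer cycles per dependent instruction than a Wave32, which matches the qualitative claim in the paragraph. Real hardware hides much of this latency by interleaving other wavefronts, so the numbers should not be read as measured stall counts.
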
* [https://en.wikipedia.org/wiki/Radeon_RX_5000_series Radeon RX 5000 series from Wikipedia]
* Radeon 5700 XT
* Radeon 5700

=== Vega GCN 5th gen ===

[https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]

Vega cards were first released in 2017. Vega is the last in the line of the GCN Architecture: 64 threads per wavefront. Each compute unit contains 4x SIMD units, supporting a total of 40 wavefronts per compute unit (a queue of 10 wavefronts per SIMD unit). Each SIMD unit contains 16 vALUs for general compute + 1 sALU for branching and constant logic. Each SIMD unit executes the same instruction over four clock ticks (16 vALUs x 4 clock ticks == 64 threads per wavefront).

Vega specifically added packed FP16 instructions, such as dot-product, packed add and packed multiply. From a programming level, these packed FP16 instructions are SIMD-within-SIMD: each SIMD thread can execute its own SIMD FP16 instruction, akin to AVX or SSE on the x64 architecture.

* Radeon VII
* Vega64
* Vega56

=== Polaris GCN 4th gen ===

Polaris cards were first released in 2016 under the AMD Radeon 400 series name.

[https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]

* RX 580
* RX 570
* RX 560

== Intel ==

=== Intel Xe 'Gen12' ===

The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).

* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Xe 'Gen12' GPUs on Wikipedia]

==Nvidia==
Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server.
=== Ampere Architecture ===
* GTX 1030
=== Maxwell Architecture ===

[https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Architecture Whitepaper on archive.org]

* [https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell (microarchitecture) from Wikipedia]

Maxwell cards were first released in 2014.

=Instruction Throughput=