Difference between revisions of "Talk:GPU"

From Chessprogramming wiki
m (Nvidia architectures)
(CPW GPU article: new section)
(23 intermediate revisions by the same user not shown)

Revision as of 21:29, 27 April 2021

AMD architectures

My own conclusions are:

  • TeraScale has a VLIW design.
  • GCN has 16 wide SIMD, executing a Wavefront of 64 threads over 4 cycles.
  • RDNA has 32 wide SIMD, executing a Wavefront:32 over 1 cycle and a Wavefront:64 over 2 cycles.
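
The cycle counts in the list above follow from dividing the Wavefront size by the SIMD width. A minimal sketch of that arithmetic, assuming one lane processes one thread per cycle; the function name `cycles_per_wave_instruction` and the `amd_configs` table are made up for illustration and do not come from any AMD document:

```python
def cycles_per_wave_instruction(wave_size: int, simd_width: int) -> int:
    """Cycles to push one instruction for a whole wave through the SIMD.

    Assumption: each SIMD lane handles exactly one thread per cycle.
    """
    if wave_size <= 0 or simd_width <= 0:
        raise ValueError("wave_size and simd_width must be positive")
    # Integer ceiling division: a partially filled last pass still costs a cycle.
    return -(-wave_size // simd_width)


# (wave size, SIMD width) pairs taken from the list above.
amd_configs = {
    "GCN, Wavefront:64": (64, 16),
    "RDNA, Wavefront:32": (32, 32),
    "RDNA, Wavefront:64": (64, 32),
}

for name, (wave_size, simd_width) in amd_configs.items():
    cycles = cycles_per_wave_instruction(wave_size, simd_width)
    print(f"{name}: {cycles} cycle(s)")
```

TeraScale is left out on purpose, since a VLIW design does not map onto this simple width/size model.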

Smatovic (talk) 10:16, 22 April 2021 (CEST)

Nvidia architectures

Afaik Nvidia never officially mentioned SIMD as a hardware architecture in their papers; since Tesla they have only referred to it as SIMT.

Nevertheless, my own conclusions are:

  • Tesla has 8 wide SIMD, executing a Warp of 32 threads over 4 cycles.
  • Fermi has 16 wide SIMD, executing a Warp of 32 threads over 2 cycles.
  • Kepler is somewhat odd; I am not sure how the compute units are partitioned.
  • Maxwell and Pascal have 32 wide SIMD, executing a Warp of 32 threads over 1 cycle.
  • Volta and Turing seem to have 16 wide FPU SIMDs, but my own experiments show 32 wide VALU.
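
The list above can also be read in the other direction: given how many cycles one Warp instruction occupies a SIMD, the implied SIMD width is the Warp size divided by that cycle count. A rough sketch under that assumption; `implied_simd_width` and the `listed_cycles` table are hypothetical names for illustration, and this is not a description of how the experiments mentioned above were done:

```python
WARP_SIZE = 32  # threads per Warp, as in the list above


def implied_simd_width(cycles_per_warp_instruction: int,
                       warp_size: int = WARP_SIZE) -> int:
    """SIMD width implied by the cycles one Warp instruction takes.

    Assumption: the Warp is split into equal slices, one slice per cycle.
    """
    if cycles_per_warp_instruction <= 0:
        raise ValueError("cycle count must be positive")
    if warp_size % cycles_per_warp_instruction != 0:
        raise ValueError("warp size is not evenly divisible by the cycle count")
    return warp_size // cycles_per_warp_instruction


# Cycle counts per Warp instruction, taken from the list above.
listed_cycles = {"Tesla": 4, "Fermi": 2, "Maxwell/Pascal": 1}

for arch, cycles in listed_cycles.items():
    print(f"{arch}: {implied_simd_width(cycles)} wide SIMD")
```

Kepler, Volta and Turing are omitted because the list gives no firm cycle count for them.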

Smatovic (talk) 10:17, 22 April 2021 (CEST)

SIMD + Scalar Unit

It seems that on GPU architectures every SIMD unit has one scalar unit, which executes things like branch conditions or special functions that the SIMD ALUs are not capable of.

Smatovic (talk) 20:21, 22 April 2021 (CEST)

embedded CPU controller

It is not documented in the whitepapers, but it seems that every discrete GPU has an embedded CPU controller (e.g. Nvidia Falcon) which (speculation) launches the kernels.

Smatovic (talk) 10:36, 22 April 2021 (CEST)

CPW GPU article

A suggestion of mine: keep this GPU article as a generalized overview of GPUs, with incremental updates for different frameworks and architectures. GPUs and GPGPU are a moving target, with different platforms offering new feature sets, so it is better to open separate articles for things like GPGPU, SIMT, CUDA, ROCm, oneAPI and Metal, or to simply link to Wikipedia, which contains the newest specs and info.

Smatovic (talk) 21:29, 27 April 2021 (CEST)