Difference between revisions of "Talk:GPU"

From Chessprogramming wiki
== SIMT ==
 
 
SIMT is not only about running multiple threads in a Warp (Nvidia) or Wavefront (AMD); it is more about running multiple waves of Warps or Wavefronts on the same SIMD unit to hide memory latencies.
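The latency-hiding idea can be illustrated with a rough back-of-the-envelope model. This is only a sketch under stated assumptions: each warp alternates between a fixed number of issue cycles and a fixed memory stall, and the scheduler switches warps for free. The function names and the cycle counts are hypothetical and not taken from any vendor documentation.

```python
import math


def warps_to_hide_latency(compute_cycles: int, memory_latency_cycles: int) -> int:
    """Toy model (assumption): a warp issues for compute_cycles, then stalls
    for memory_latency_cycles. Returns how many resident warps are needed so
    that some warp is always ready to issue on the SIMD unit."""
    if compute_cycles <= 0 or memory_latency_cycles < 0:
        raise ValueError("compute_cycles must be > 0 and latency >= 0")
    # One warp is stalled; the others must cover its stall with their own work.
    return 1 + math.ceil(memory_latency_cycles / compute_cycles)


def simd_utilization(resident_warps: int, compute_cycles: int,
                     memory_latency_cycles: int) -> float:
    """Fraction of cycles in which the SIMD unit has work, in the same toy model."""
    period = compute_cycles + memory_latency_cycles
    return min(1.0, resident_warps * compute_cycles / period)


if __name__ == "__main__":
    # Hypothetical numbers: 20 issue cycles between loads, 400 cycles latency.
    print(warps_to_hide_latency(20, 400))
    print(simd_utilization(1, 20, 400))
    print(simd_utilization(21, 20, 400))
```

In this model a single warp leaves the SIMD unit idle most of the time, while enough resident warps keep it busy. Real hardware is also limited by registers and shared memory per warp, which the sketch ignores.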
 
 
 
== Nvidia architectures ==
 
  

Revision as of 11:43, 18 April 2021

Nvidia architectures

Afaik Nvidia never officially described their hardware architecture as SIMD in their papers; with Tesla they only referred to it as SIMT.

Nevertheless, my own conclusions are:

Tesla has 8 wide SIMD, executing a Warp of 32 threads over 4 cycles.

Fermi has 16 wide SIMD, executing a Warp of 32 threads over 2 cycles.

Kepler is somewhat odd; I am not sure how the compute units are partitioned.

Maxwell and Pascal have 32 wide SIMD, executing a Warp of 32 threads over 1 cycle.

Volta and Turing seem to have 16 wide FPU SIMDs, but my own experiments show 32 wide VALU.
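The cycle counts above follow from dividing the Warp size by the SIMD width. A small sketch of that arithmetic, using the widths from my own conclusions above (they are not official Nvidia figures); Kepler, Volta and Turing are left out because their width is unclear, and the function name is made up for illustration:

```python
WARP_SIZE = 32  # threads per Warp on Nvidia GPUs

# SIMD widths as concluded above (assumptions, not official figures).
SIMD_WIDTH = {
    "Tesla": 8,
    "Fermi": 16,
    "Maxwell": 32,
    "Pascal": 32,
}


def cycles_per_warp(simd_width: int, warp_size: int = WARP_SIZE) -> int:
    """Cycles needed to issue one Warp instruction on a SIMD unit of the given width."""
    if simd_width <= 0 or warp_size % simd_width != 0:
        raise ValueError("warp size must be a positive multiple of the SIMD width")
    return warp_size // simd_width


if __name__ == "__main__":
    for arch, width in SIMD_WIDTH.items():
        print(f"{arch}: {width} wide SIMD, {cycles_per_warp(width)} cycle(s) per Warp")
```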

SIMD + Scalar Unit

According to AMD papers, every SIMD unit has one scalar unit; Nvidia seems to have one SFU (special function unit) per SIMD.

Embedded CPU controller

It is not documented in the white papers, but it seems that every discrete GPU has an embedded CPU controller (e.g. Nvidia Falcon) which (speculation) launches the kernels.