Talk:GPU
== AMD architectures ==
My own conclusions are:
* TeraScale has a VLIW design.
* GCN has a 16-wide SIMD, executing a Wavefront of 64 threads over 4 cycles.
* RDNA has a 32-wide SIMD, executing a Wavefront:32 over 1 cycle and a Wavefront:64 over 2 cycles.
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 10:16, 22 April 2021 (CEST)
== Nvidia architectures ==
Nevertheless, my own conclusions are:
* Tesla has an 8-wide SIMD, executing a Warp of 32 threads over 4 cycles.
* Fermi has a 16-wide SIMD, executing a Warp of 32 threads over 2 cycles.
* Kepler is somewhat odd; I am not sure how its compute units are partitioned.
* Maxwell and Pascal have a 32-wide SIMD, executing a Warp of 32 threads over 1 cycle.
* Volta and Turing seem to have 16-wide FPU SIMDs, but my own experiments show a 32-wide VALU.
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 10:17, 22 April 2021 (CEST)
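The cycle counts in the AMD and Nvidia lists above are consistent with a simple rule: threads per Warp or Wavefront divided by SIMD width, rounded up. Below is a minimal Python sketch of that arithmetic. The function name <code>issue_cycles</code> and the <code>CLAIMS</code> table are hypothetical names chosen for illustration, and the model is an assumption that ignores pipeline latency, dual issue, and scheduling. It only restates the numbers from this thread and is not a measurement.

```python
def issue_cycles(group_size: int, simd_width: int) -> int:
    """Cycles to issue one Warp/Wavefront on a SIMD of the given width.

    Assumed model: one SIMD-width slice of the group is issued per cycle.
    """
    if group_size <= 0 or simd_width <= 0:
        raise ValueError("group_size and simd_width must be positive")
    # Ceiling division without floating point.
    return -(-group_size // simd_width)


# (architecture, threads per Warp/Wavefront, SIMD width) as listed in this thread.
CLAIMS = [
    ("GCN", 64, 16),
    ("RDNA Wavefront:32", 32, 32),
    ("RDNA Wavefront:64", 64, 32),
    ("Tesla", 32, 8),
    ("Fermi", 32, 16),
    ("Maxwell/Pascal", 32, 32),
]

if __name__ == "__main__":
    for name, threads, width in CLAIMS:
        print(f"{name}: {issue_cycles(threads, width)} cycle(s)")
```

Kepler, Volta and Turing are left out on purpose, since the lists above do not state a settled SIMD width for them.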
== SIMD + Scalar Unit ==