Changes


Talk:GPU

2,527 bytes added, 15:09, 14 November 2022 (minor edit, summary: Legacy GPGPU)
Heyho, just a minor thing, the notation of Nvidia should be unified. The official name is Nvidia Corporation; on their webpage they refer to themselves as NVIDIA, and the logo is styled nVIDIA, formerly nVidia.

Bests,
Srdja

[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 10:16, 22 April 2021 (CEST)

== AMD architectures ==

My own conclusions are:
* TeraScale has a VLIW design.
* GCN has 16 wide SIMD, executing a Wavefront of 64 threads over 4 cycles.
* RDNA has 32 wide SIMD, executing a Wavefront:32 over 1 cycle and a Wavefront:64 over 2 cycles.

== SIMT ==

SIMT is not only about running multiple threads in a Warp resp. Wavefront; it is more about running multiple waves of Warps resp. Wavefronts on the same SIMD unit to hide memory latencies.
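
A minimal sketch of the arithmetic behind the two points above, assuming a deliberately simplified model: one instruction of a Wavefront takes (wave size / SIMD width) cycles to issue, and a wave stalls for a fixed number of cycles on each memory access. The function names and the latency and instruction counts used in the usage note are hypothetical and not taken from any vendor documentation.

```python
import math


def cycles_per_wave(wave_size: int, simd_width: int) -> int:
    """Cycles one SIMD unit needs to issue a single instruction for a whole Warp/Wavefront."""
    return math.ceil(wave_size / simd_width)


def waves_to_hide_latency(latency_cycles: int, wave_size: int, simd_width: int,
                          alu_instructions_per_load: int) -> int:
    """Resident waves needed so the SIMD unit stays busy while one wave waits on memory.

    Simplified model (an assumption, not a hardware spec): each wave issues
    `alu_instructions_per_load` ALU instructions, then stalls for `latency_cycles`.
    The other waves must supply at least `latency_cycles` of work in the meantime.
    """
    busy = cycles_per_wave(wave_size, simd_width) * alu_instructions_per_load
    return math.ceil(latency_cycles / busy) + 1


# Width/cycle figures as listed in the AMD section above.
print("GCN  Wavefront:64 on 16 wide SIMD:", cycles_per_wave(64, 16), "cycles")
print("RDNA Wavefront:32 on 32 wide SIMD:", cycles_per_wave(32, 32), "cycles")
print("RDNA Wavefront:64 on 32 wide SIMD:", cycles_per_wave(64, 32), "cycles")
```

As a usage example under made-up numbers, `waves_to_hide_latency(400, 64, 16, 10)` asks how many waves a GCN-style SIMD would need in flight if a memory access cost 400 cycles and each wave did 10 ALU instructions per load. Real hardware limits the number of resident waves, so this is only meant to illustrate why several waves per SIMD unit matter.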
== Nvidia architectures ==
My own conclusions are:
* Tesla has 8 wide SIMD, executing a Warp of 32 threads over 4 cycles.
* Fermi has 16 wide SIMD, executing a Warp of 32 threads over 2 cycles.
* Kepler is somehow odd, not sure how the compute units are partitioned.
* Maxwell and Pascal have 32 wide SIMD, executing a Warp of 32 threads over 1 cycle.
* Volta and Turing seem to have 16 wide FPU SIMDs, but my own experiments show 32 wide VALU.

[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 10:17, 22 April 2021 (CEST)

== SIMD + Scalar Unit ==

It seems every SIMD unit has one scalar unit on GPU architectures, executing things like branch conditions or special functions the SIMD ALUs are not capable of. According to AMD papers every SIMD unit has one scalar unit; Nvidia seems to have one SFU (special function unit) per SIMD.

[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 20:21, 22 April 2021 (CEST)

== embedded CPU controller ==

It is not documented in the whitepapers, but it seems that every discrete GPU has an embedded CPU controller (e.g. Nvidia Falcon) which (speculation) launches the kernels.

[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 10:36, 22 April 2021 (CEST)

== GPUs and Duncan's taxonomy ==

It is not clear to me how different vendors realize the underlying hardware of GPU SIMD units in architectures with a unified shader architecture. There is the concept of bit-sliced ALUs, the concept of pipelined vector processors, and the concept of SIMD units with fixed bit-width ALUs. The white papers from different vendors leave room for speculation, as do the different instruction throughputs for higher and lower precision. What is left to the programmer is to do microbenchmarking and draw conclusions on their own.

https://en.wikipedia.org/wiki/Duncan%27s_taxonomy

https://en.wikipedia.org/wiki/Flynn%27s_taxonomy

[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 13:58, 16 December 2021 (CET)

== CPW GPU article ==

A suggestion of mine: keep this GPU article as a generalized overview of GPUs, with incremental updates for different frameworks and architectures. GPUs and GPGPU are a moving target, with different platforms offering new feature sets. It is better to open separate articles for things like GPGPU, SIMT, CUDA, ROCm, oneAPI, Metal, or simply link to Wikipedia containing the newest specs and infos.

[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 21:29, 27 April 2021 (CEST)

== GPGPU architectures ==

Regarding GPGPU architectures or frameworks, a link to the architecture white paper, instruction set architecture, programming guide, and a link to Wikipedia with a list of the concrete models with specs would be nice, if available.

[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 09:21, 25 October 2021 (CEST)

== Legacy GPGPU ==

This article does not cover legacy, pre 2007, GPGPU methods, i.e. how to use pixel, vertex, geometry, tessellation and compute shaders via OpenGL or DirectX for GPGPU. I can imagine it is possible to backport a neural network Lc0 backend to a certain DirectX/OpenGL API, but I doubt it has real contemporary relevance (running Lc0 on an SGI Indy or alike).

[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 14:09, 14 November 2022 (CET)