=Memory=
The memory hierarchy of a GPU consists mainly of private memory (registers accessed by a single thread, respectively work-item), local memory (shared by the threads of a block, respectively the work-items of a work-group), constant memory, different types of cache, and global memory. Size, latency and bandwidth vary between vendors and architectures.
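As a minimal sketch (not taken from the referenced guides), a CUDA kernel can touch each of these memory spaces explicitly; the kernel name <code>scale</code>, the tile size of 256 and the coefficient table are purely illustrative:
<pre>
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative read-only table placed in constant memory (cached on chip).
__constant__ float coeffs[4] = {1.0f, 2.0f, 3.0f, 4.0f};

// Hypothetical kernel touching each memory space described above.
__global__ void scale(const float *in, float *out, int n) {
    __shared__ float tile[256];                      // local/shared memory of this block
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // private (register) variable
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;      // global -> shared
    __syncthreads();
    if (i < n)
        out[i] = tile[threadIdx.x] * coeffs[i % 4];  // shared * constant -> global
}

int main() {
    const int n = 1024;
    float host[n];
    for (int i = 0; i < n; ++i) host[i] = float(i);

    float *in, *out;                                 // buffers in global memory
    cudaMalloc(&in,  n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    cudaMemcpy(in, host, n * sizeof(float), cudaMemcpyHostToDevice);

    scale<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaMemcpy(host, out, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("out[42] = %f\n", host[42]);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
</pre>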
 
Here is the data for the Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) as an example: <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref>
* 128 KiB private memory per compute unit
* 48 KiB (16 KiB) local memory per compute unit (configurable)
* 64 KiB constant memory
* 8 KiB constant cache per compute unit
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)
* 768 KiB L2 cache
* 1.5 GiB to 3 GiB global memory
Here is the data for the AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) as an example: <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref>
* 256 KiB private memory per compute unit
* 64 KiB local memory per compute unit
* 64 KiB constant memory
* 16 KiB constant cache per four compute units
* 16 KiB L1 cache per compute unit
* 768 KiB L2 cache
* 3 GiB to 6 GiB global memory
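Since these sizes differ between devices, they can also be queried at run time. The following sketch uses the CUDA runtime API (an OpenCL program would use clGetDeviceInfo analogously); device index 0 is assumed:
<pre>
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);               // query device 0

    printf("%s\n", prop.name);
    printf("registers per multiprocessor:     %d\n",      prop.regsPerMultiprocessor);
    printf("shared memory per multiprocessor: %zu KiB\n", prop.sharedMemPerMultiprocessor / 1024);
    printf("constant memory:                  %zu KiB\n", prop.totalConstMem / 1024);
    printf("L2 cache:                         %d KiB\n",  prop.l2CacheSize / 1024);
    printf("global memory:                    %zu MiB\n", prop.totalGlobalMem / (1024 * 1024));
    return 0;
}
</pre>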
=Instruction Throughput=
