Changes


GPU

260 bytes added, 19:38, 9 August 2019
Grids and NDRange
Grids and NDRanges can be 1-dimensional, 2-dimensional, or 3-dimensional. 2-dimensional grids are common for screen-space operations such as pixel shaders, while 3-dimensional grids are useful for specifying many operations per pixel (such as a raytracer, which may launch 5000 rays per pixel).
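As an illustration of the 2-dimensional case, the following is a minimal CUDA sketch that maps one thread to each pixel of a grayscale image. The kernel name `invert_pixels`, the host helper `invert_on_gpu`, and the 16x16 block size are assumptions chosen for this example, not something prescribed by CUDA or by this page.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread handles one pixel; the 2-D grid mirrors the image layout.
__global__ void invert_pixels(unsigned char* img, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {   // the grid may overshoot the image edge
        int i = y * width + x;
        img[i] = static_cast<unsigned char>(255 - img[i]);
    }
}

// Hypothetical host helper: inverts a width*height grayscale image in place.
// Returns false if a CUDA call fails.
bool invert_on_gpu(std::vector<unsigned char>& img, int width, int height) {
    unsigned char* d_img = nullptr;
    size_t bytes = img.size();
    if (cudaMalloc(&d_img, bytes) != cudaSuccess) return false;
    if (cudaMemcpy(d_img, img.data(), bytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(d_img);
        return false;
    }

    dim3 block(16, 16);  // 256 threads per block (an assumed, commonly used size)
    dim3 grid((width + block.x - 1) / block.x,    // round up so every pixel
              (height + block.y - 1) / block.y);  // is covered by some thread
    invert_pixels<<<grid, block>>>(d_img, width, height);

    // The blocking copy on the default stream waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(img.data(), d_img, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_img);
    return err == cudaSuccess;
}
```

A 3-dimensional launch follows the same pattern: `dim3` also takes a `z` component, and the kernel would read `blockIdx.z` and `threadIdx.z` to pick, for example, which ray of a pixel to trace.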
The most important note is that the threads of a Grid or NDRange are not guaranteed to execute concurrently with each other. Some degree of sequential processing may happen. As such, communication across a Grid or NDRange is difficult to achieve: if thread #0 takes a spinlock or mutex and waits for thread #1000000 to communicate with it, modern hardware will probably never have the two threads executing concurrently with each other, and the code would deadlock.

In practice, the easiest mechanism for Grid or NDRange sized synchronization is to wait for the kernel to finish executing, and to have the CPU wait and process the results in between Grids or NDRanges. For example, LeelaZero will schedule a Grid for each CNN evaluation. The CPU will traverse the MCTS tree and mark off other CNN evaluation locations (each their own Grid). When the GPU finishes a Grid, the CUDA API provides asynchronous or synchronous APIs to inform the CPU of the Grid completion.
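The pattern of waiting for a Grid to finish and letting the CPU process the results can be sketched in CUDA as follows. This is a minimal example under stated assumptions: the kernel `block_sums`, the helper `sum_on_gpu`, and its `poll` flag are hypothetical names invented for illustration, and the summation task stands in for any per-Grid workload. It uses `cudaEventSynchronize` for the synchronous wait and `cudaEventQuery` for asynchronous polling; error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each block reduces its slice to one partial sum. Threads only synchronize
// within their own block; no thread waits on a thread from another block.
// Assumes blockDim.x == 256 (a power of two matching the shared array).
__global__ void block_sums(const int* in, int n, long long* partial) {
    __shared__ long long acc[256];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    acc[threadIdx.x] = (i < n) ? in[i] : 0;
    __syncthreads();
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride) acc[threadIdx.x] += acc[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0) partial[blockIdx.x] = acc[0];
}

// Hypothetical helper: the Grid produces per-block partial sums, and the CPU
// combines them only after the Grid has been reported complete.
long long sum_on_gpu(const std::vector<int>& data, bool poll) {
    if (data.empty()) return 0;
    const int threads = 256;
    int n = static_cast<int>(data.size());
    int blocks = (n + threads - 1) / threads;
    std::vector<long long> partial(blocks);

    int* d_in = nullptr;
    long long* d_partial = nullptr;
    cudaStream_t stream;
    cudaEvent_t done;
    cudaStreamCreate(&stream);
    cudaEventCreate(&done);
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_partial, blocks * sizeof(long long));

    cudaMemcpyAsync(d_in, data.data(), n * sizeof(int),
                    cudaMemcpyHostToDevice, stream);
    block_sums<<<blocks, threads, 0, stream>>>(d_in, n, d_partial);
    cudaMemcpyAsync(partial.data(), d_partial, blocks * sizeof(long long),
                    cudaMemcpyDeviceToHost, stream);
    cudaEventRecord(done, stream);  // marks the end of this Grid's work

    if (poll) {
        // Asynchronous style: the CPU could do other work inside this loop.
        while (cudaEventQuery(done) == cudaErrorNotReady) { }
    } else {
        // Synchronous style: block the CPU thread until the event completes.
        cudaEventSynchronize(done);
    }

    long long total = 0;  // CPU-side processing between Grids
    for (long long p : partial) total += p;

    cudaFree(d_in);
    cudaFree(d_partial);
    cudaEventDestroy(done);
    cudaStreamDestroy(stream);
    return total;
}
```

The final combination step runs on the CPU because the end of the kernel is the only point at which every block is known to have finished; a second kernel launched on the same stream would give the same guarantee.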
= Architectures and Physical Hardware =
