==Floating-Point Instruction Throughput==
* FP32
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused multiply-add) throughput; a peak-throughput sketch follows this list.
* FP64
: Consumer GPUs generally have a lower double-precision (64-bit) floating-point throughput ratio (FP64:FP32) than server-grade GPUs.
* FP16
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput at an FP32:FP16 ratio of 1:2. Some architectures offer still higher throughput with lower precision, quadrupling the INT8 or octupling the INT4 throughput; see the packed-instruction sketch after this list.
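
As a rough illustration of how such peak numbers are derived: one FMA counts as two floating-point operations, so peak FLOPS = cores × clock × 2, scaled by the precision ratio. The following host-side CUDA sketch uses hypothetical SM counts, clock, and ratios; the numbers are placeholders, not vendor specifications.

<pre>
// Theoretical peak FMA throughput from chip parameters.
// All numbers below are hypothetical placeholders, not vendor data.
#include <cstdio>

int main() {
    const double sm_count    = 46;    // streaming multiprocessors (placeholder)
    const double fp32_per_sm = 128;   // FP32 lanes per SM (placeholder)
    const double clock_ghz   = 1.7;   // boost clock in GHz (placeholder)

    // One FMA = 2 floating-point operations (a multiply and an add).
    const double fp32_tflops = sm_count * fp32_per_sm * clock_ghz * 2.0 / 1000.0;

    // Consumer parts often run FP64 at a small fraction of the FP32 rate
    // (e.g. 1/32), server parts at a larger one (e.g. 1/2); both ratios
    // here are illustrative.
    const double fp64_consumer = fp32_tflops / 32.0;
    const double fp64_server   = fp32_tflops / 2.0;

    // A 1:2 FP32:FP16 architecture doubles the FP32 rate at half precision.
    const double fp16_tflops = fp32_tflops * 2.0;

    printf("FP32 peak: %6.2f TFLOPS\n", fp32_tflops);
    printf("FP64 peak: %6.2f (1/32 ratio) / %6.2f (1/2 ratio) TFLOPS\n",
           fp64_consumer, fp64_server);
    printf("FP16 peak: %6.2f TFLOPS (1:2 architecture)\n", fp16_tflops);
    return 0;
}
</pre>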
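
The higher low-precision rates usually come from packed instructions that process several narrow values at once. Below is a minimal device-side sketch, assuming a GPU with native FP16 arithmetic and DP4A support (compute capability 6.1 or later); it uses the CUDA intrinsics __hfma2 (two half-precision FMAs in one instruction) and __dp4a (a four-way INT8 dot product with accumulate). The kernel name and problem size are illustrative.

<pre>
// compile: nvcc -arch=sm_61 packed_ops.cu
#include <cstdio>
#include <cuda_fp16.h>

__global__ void packed_ops(const half2* a, const half2* b, half2* fp16_out,
                           const int* x, const int* y, int* int8_out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Two FP16 FMAs in one instruction: element-wise a*b + c.
        fp16_out[i] = __hfma2(a[i], b[i], fp16_out[i]);
        // Four INT8 multiplies plus accumulate in one instruction:
        // each int carries four packed signed bytes.
        int8_out[i] = __dp4a(x[i], y[i], int8_out[i]);
    }
}

int main() {
    const int n = 1024;
    half2 *a, *b, *f;
    int *x, *y, *s;
    cudaMallocManaged(&a, n * sizeof(half2));
    cudaMallocManaged(&b, n * sizeof(half2));
    cudaMallocManaged(&f, n * sizeof(half2));
    cudaMallocManaged(&x, n * sizeof(int));
    cudaMallocManaged(&y, n * sizeof(int));
    cudaMallocManaged(&s, n * sizeof(int));
    for (int i = 0; i < n; ++i) {
        a[i] = __floats2half2_rn(1.5f, 2.5f);   // packed pair (low, high)
        b[i] = __floats2half2_rn(2.0f, 2.0f);
        f[i] = __floats2half2_rn(0.0f, 0.0f);
        x[i] = 0x01010101;                      // four packed INT8 ones
        y[i] = 0x02020202;                      // four packed INT8 twos
        s[i] = 0;
    }
    packed_ops<<<(n + 255) / 256, 256>>>(a, b, f, x, y, s, n);
    cudaDeviceSynchronize();
    printf("fp16_out[0] = (%.1f, %.1f), int8_out[0] = %d\n",
           __half2float(__low2half(f[0])), __half2float(__high2half(f[0])), s[0]);
    return 0;
}
</pre>

Expected output: fp16_out[0] = (3.0, 5.0) and int8_out[0] = 8 (four 1×2 byte products accumulated into 0).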
==Tensors==