| Intel Atom (Bonnell, Saltwell, Silvermont and Goldmont) |  SSE3 (64-bit) |  2 |  4 |  0 |  
 | Intel Core (Merom, Penryn); Intel Nehalem (Nehalem, Westmere) |  SSE4 (128-bit) |  4 |  8 |  0 |  
 | Intel Sandy Bridge (Sandy Bridge, Ivy Bridge) |  AVX (256-bit) |  8 |  16 |  0 |  
 | Intel Haswell (Haswell, Devil's Canyon, Broadwell); Intel Skylake (Skylake, Kaby Lake, Coffee Lake, Whiskey Lake, Amber Lake) |  AVX2 & FMA (256-bit) |  16 |  32 |  0 |  
 | Intel Xeon Phi (Knights Corner) |  IMCI (512-bit) |  16 |  32 |  0 |  
 | Intel Skylake-X; Intel Xeon Phi (Knights Landing, Knights Mill) |  AVX-512 & FMA (512-bit) |  32 |  64 |  0 |  
 | AMD Bobcat |  AMD64 (64-bit) |  2 |  4 |  0 |  
 | AMD Jaguar; AMD Puma |  AVX (128-bit) |  4 |  8 |  0 |  
 | AMD K10 |  SSE4/4a (128-bit) |  4 |  8 |  0 |  
 | AMD Bulldozer (Piledriver, Steamroller, Excavator) |  AVX (128-bit, Bulldozer to Steamroller); AVX2 (128-bit, Excavator); FMA4 (Bulldozer); FMA3/4 (Piledriver to Excavator) |  4 |  8 |  0 |  
 | AMD Zen (Ryzen 1000 series, Threadripper 1000 series, Epyc Naples); AMD Zen+ (Ryzen 2000 series, Threadripper 2000 series) |  AVX2 & FMA (128-bit, 256-bit decoding) |  8 |  16 |  0 |  
 | AMD Zen 2 (Ryzen 3000 series, Threadripper 3000 series, Epyc Rome); AMD Zen 3 (Ryzen 5000 series) |  AVX2 & FMA (256-bit) |  16 |  32 |  0 |  
 | ARM Cortex-A7, A9, A15 |  ARMv7 |  1 |  8 |  0 |  
 | ARM Cortex-A32, A35, A53, A55, A72, A73, A75 |  ARMv8 |  2 |  8 |  0 |  
 | ARM Cortex-A57 |  ARMv8 |  4 |  8 |  0 |  
 | ARM Cortex-A76, A77 |  ARMv8 |  8 |  16 |  0 |  
 | Qualcomm Krait |  ARMv7 |  1 |  8 |  0 |  
 | Qualcomm Kryo (1xx - 3xx) |  ARMv8 |  2 |  8 |  0 |  
 | Qualcomm Kryo (4xx - 5xx) |  ARMv8 |  8 |  16 |  0 |  
 | Samsung Exynos M1 and M2 |  ARMv8 |  2 |  8 |  0 |  
 | Samsung Exynos M3 and M4 |  ARMv8 |  3 |  12 |  0 |  
 | IBM PowerPC A2 (Blue Gene/Q) |  ? |  8 |  8 (as FP64) |  0 |  
 | Hitachi SH-4 |  SH-4 |  1 |  7 |  0 |  
 | Nvidia Fermi (only GeForce GTX 465–480, 560 Ti, 570-590) |  PTX |  1/4 (locked by driver, 1 in hardware) |  2 |  0 |  
 | Nvidia Fermi (only Quadro 600-2000) |  PTX |  1/8 |  2 |  0 |  
 | Nvidia Fermi (only Quadro 4000–7000, Tesla) |  PTX |  1 |  2 |  0 |  
 | Nvidia Kepler (GeForce (except Titan and Titan Black), Quadro (except K6000), Tesla K10) |  PTX |  1/12 (for GK110: locked by driver, 2/3 in hardware) |  2 |  0 |  
 | Nvidia Kepler (GeForce GTX Titan and Titan Black, Quadro K6000, Tesla (except K10)) |  PTX |  2/3 |  2 |  0 |  
 | Nvidia Maxwell; Nvidia Pascal (all except Quadro GP100 and Tesla P100) |  PTX |  1/16 |  2 |  1/32 |  
 | Nvidia Pascal (only Quadro GP100 and Tesla P100) |  PTX |  1 |  2 |  4 |  
 | Nvidia Volta |  PTX |  1 |  2 (FP32) + 2 (INT32) |  16 |  
 | Nvidia Turing (only GeForce 16XX) |  PTX |  1/16 |  2 (FP32) + 2 (INT32) |  4 |  
 | Nvidia Turing (all except GeForce 16XX) |  PTX |  1/16 |  2 (FP32) + 2 (INT32) |  16 |  
 | Nvidia Ampere (only A100) |  PTX |  2 |  2 (FP32) + 2 (INT32) |  32 |  
 | Nvidia Ampere (only GeForce) |  PTX |  1/32 |  2 (FP32) + 0 (INT32) or 1 (FP32) + 1 (INT32) |  16 |  
 | AMD GCN (only Radeon Pro WX 2100-7100) |  GCN |  1/8 |  2 |  2 |  
 | AMD GCN (all except Radeon VII, Instinct MI50 and MI60, Radeon Pro WX 2100-7100) |  GCN |  1/8 |  2 |  4 |  
 | AMD GCN Vega 20 (only Radeon VII) |  GCN |  1/2 (locked by driver, 1 in hardware) |  2 |  4 |  
 | AMD GCN Vega 20 (only Radeon Instinct MI50 / MI60 and Radeon Pro VII) |  GCN |  1 |  2 |  4 |  
 | AMD RDNA; AMD RDNA 2 |  RDNA |  1/8 |  2 |  4 |  
 | AMD CDNA |  CDNA |  1 |  4 (FP32) |  16 |  
 | Graphcore Colossus GC2 |  ? |  0 |  18 |  72 |  
 | Graphcore Colossus GC200 Mk2 |  ? |  0 |  18 |  144 |