Intel Atom (Bonnell, Saltwell, Silvermont and Goldmont) | SSE3 (64-bit) | 2 | 4 | 0 |
Intel Core (Merom, Penryn); Intel Nehalem (Nehalem, Westmere) | SSE4 (128-bit) | 4 | 8 | 0 |
Intel Sandy Bridge (Sandy Bridge, Ivy Bridge) | AVX (256-bit) | 8 | 16 | 0 |
Intel Haswell (Haswell, Devil's Canyon, Broadwell); Intel Skylake (Skylake, Kaby Lake, Coffee Lake, Whiskey Lake, Amber Lake) | AVX2 & FMA (256-bit) | 16 | 32 | 0 |
Intel Xeon Phi (Knights Corner) | IMCI (512-bit) | 16 | 32 | 0 |
Intel Skylake-X; Intel Xeon Phi (Knights Landing, Knights Mill) | AVX-512 & FMA (512-bit) | 32 | 64 | 0 |
AMD Bobcat | AMD64 (64-bit) | 2 | 4 | 0 |
AMD Jaguar; AMD Puma | AVX (128-bit) | 4 | 8 | 0 |
AMD K10 | SSE4/4a (128-bit) | 4 | 8 | 0 |
AMD Bulldozer (Piledriver, Steamroller, Excavator) | AVX (128-bit; Bulldozer-Steamroller), AVX2 (128-bit; Excavator), FMA3 (Bulldozer), FMA3/4 (Piledriver-Excavator) | 4 | 8 | 0 |
AMD Zen (Ryzen 1000 series, Threadripper 1000 series, Epyc Naples); AMD Zen+ (Ryzen 2000 series, Threadripper 2000 series) | AVX2 & FMA (128-bit, 256-bit decoding) | 8 | 16 | 0 |
AMD Zen 2 (Ryzen 3000 series, Threadripper 3000 series, Epyc Rome); AMD Zen 3 (Ryzen 5000 series) | AVX2 & FMA (256-bit) | 16 | 32 | 0 |
ARM Cortex-A7, A9, A15 | ARMv7 | 1 | 8 | 0 |
ARM Cortex-A32, A35, A53, A55, A72, A73, A75 | ARMv8 | 2 | 8 | 0 |
ARM Cortex-A57 | ARMv8 | 4 | 8 | 0 |
ARM Cortex-A76, A77 | ARMv8 | 8 | 16 | 0 |
Qualcomm Krait | ARMv7 | 1 | 8 | 0 |
Qualcomm Kryo (1xx - 3xx) | ARMv8 | 2 | 8 | 0 |
Qualcomm Kryo (4xx - 5xx) | ARMv8 | 8 | 16 | 0 |
Samsung Exynos M1 and M2 | ARMv8 | 2 | 8 | 0 |
Samsung Exynos M3 and M4 | ARMv8 | 3 | 12 | 0 |
IBM PowerPC A2 (Blue Gene/Q) | ? | 8 | 8 (as FP64) | 0 |
Hitachi SH-4 | SH-4 | 1 | 7 | 0 |
Nvidia Fermi (only GeForce GTX 465–480, 560 Ti, 570-590) | PTX | 1/4 (locked by driver, 1 in hardware) | 2 | 0 |
Nvidia Fermi (only Quadro 600-2000) | PTX | 1/8 | 2 | 0 |
Nvidia Fermi (only Quadro 4000–7000, Tesla) | PTX | 1 | 2 | 0 |
Nvidia Kepler (GeForce (except Titan and Titan Black), Quadro (except K6000), Tesla K10) | PTX | 1/12 (for GK110: locked by driver, 2/3 in hardware) | 2 | 0 |
Nvidia Kepler (GeForce GTX Titan and Titan Black, Quadro K6000, Tesla (except K10)) | PTX | 2/3 | 2 | 0 |
Nvidia Maxwell; Nvidia Pascal (all except Quadro GP100 and Tesla P100) | PTX | 1/16 | 2 | 1/32 |
Nvidia Pascal (only Quadro GP100 and Tesla P100) | PTX | 1 | 2 | 4 |
Nvidia Volta | PTX | 1 | 2 (FP32) + 2 (INT32) | 16 |
Nvidia Turing (only GeForce 16XX) | PTX | 1/16 | 2 (FP32) + 2 (INT32) | 4 |
Nvidia Turing (all except GeForce 16XX) | PTX | 1/16 | 2 (FP32) + 2 (INT32) | 16 |
Nvidia Ampere (only A100) | PTX | 2 | 2 (FP32) + 2 (INT32) | 32 |
Nvidia Ampere (only GeForce) | PTX | 1/32 | 2 (FP32) + 0 (INT32) or 1 (FP32) + 1 (INT32) | 16 |
AMD GCN (only Radeon Pro WX 2100-7100) | GCN | 1/8 | 2 | 2 |
AMD GCN (all except Radeon VII, Instinct MI50 and MI60, Radeon Pro WX 2100-7100) | GCN | 1/8 | 2 | 4 |
AMD GCN Vega 20 (only Radeon VII) | GCN | 1/2 (locked by driver, 1 in hardware) | 2 | 4 |
AMD GCN Vega 20 (only Radeon Instinct MI50 / MI60 and Radeon Pro VII) | GCN | 1 | 2 | 4 |
AMD RDNA; AMD RDNA 2 | RDNA | 1/8 | 2 | 4 |
AMD CDNA | CDNA | 1 | 4 (FP32) | 16 |
Graphcore Colossus GC2 | ? | 0 | 18 | 72 |
Graphcore Colossus GC200 Mk2 | ? | 0 | 18 | 144 |