§ 01 / FEATURE

The quiet revolution
in how machines think.

For decades the CPU sat at the center of computing. A second processor, born for graphics, has shifted that center of gravity. This is the field guide.

Read the explainer →

Latest news

~10³×

speedup over CPU on dense FP16 matrix workloads

20 PFLOPS

FP4 sparse, in a flagship 2025 accelerator card

$115B

datacenter revenue at the top vendor, fiscal 2025

§ 03

Demo / CPU vs GPU

One core, in a hurry.
Many cores, in unison.

Both processors are computing the same 2048×2048 matrix multiplication. The CPU walks the grid one tile at a time. The GPU lights every tile at once. Press play.

CPU

8 cores · 4.2 GHz

0%

tiles processed: 0 / 256

GPU

16,896 cores · 1.8 GHz

0%

tiles processed: 0 / 256
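For readers who want to see the tile idea in code, here is a minimal sketch in plain Python. It is not the demo's actual implementation, which this page does not show. The function names `tile_count` and `matmul_tiled` are hypothetical, and the 128×128 tile size is an assumption inferred from the 2048×2048 matrix and the 256-tile counter, taking the tiles to be square.

```python
def tile_count(n, tile):
    """Number of square output tiles in an n x n result (assumes tile divides n)."""
    per_side = n // tile
    return per_side * per_side


def matmul_tiled(a, b, tile):
    """Multiply square matrices a and b, filling the result one output tile at a time.

    Returns the product and the number of tiles processed.
    """
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    tiles_done = 0
    for i0 in range(0, n, tile):          # tile row
        for j0 in range(0, n, tile):      # tile column
            # Each (i0, j0) tile reads only rows of a and columns of b and
            # writes only its own block of c, so no tile depends on another.
            for i in range(i0, min(i0 + tile, n)):
                for j in range(j0, min(j0 + tile, n)):
                    acc = 0.0
                    for k in range(n):
                        acc += a[i][k] * b[k][j]
                    c[i][j] = acc
            tiles_done += 1
    return c, tiles_done


# Assumed tile size of 128 reproduces the demo's counter: 16 x 16 tiles.
print(tile_count(2048, 128))  # 256
```

The two outer loops are the sequential walk the CPU panel animates. Because no tile depends on another, a GPU can hand each one to a different group of cores and work on many of them at the same time.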

§ 02

The field guide

Seven sections. Pick where to start.

Each section is its own deep dive. The explainer is the gentlest entry point; the hardware taxonomy and benchmarks chart are the most opinionated.

§ 02

What is accelerated computing?

Four questions, answered honestly: what it is, why a CPU is not enough, whether it is just GPUs, and when it stops being worth it.

Read the explainer →

§ 04

Anatomy of an accelerator

An interactive diagram of host CPU, PCIe, the on-device scheduler, streaming multiprocessors, HBM, and L2 cache.

See the diagram →

§ 05

Where it shows up

Six fields that look different now than they did a decade ago: AI training, simulation, graphics, genomics, finance, and edge inference.

Browse the use cases →

§ 06

GPU vs TPU vs NPU vs FPGA

A side-by-side taxonomy of the four accelerator families, ranked on flexibility, throughput, and power.

Compare the hardware →

§ 07

The lines that diverge

Eighteen years of FP16 throughput on CPUs and GPUs. The CPU has roughly tripled. The accelerator has improved by three orders of magnitude.

See the chart →

§ 08

Twenty-five years, in seven beats

From programmable shaders to NPUs in every laptop and phone: the history of how the second processor took center stage.

Read the timeline →

§ 10

The terms

The vocabulary an engineer uses to talk about accelerated systems, kept short and operational. CUDA, HBM, MFU, tensor cores, systolic arrays, and the rest.

Open the glossary →

§ 09

The wire

News this week

All news →

Hardware

Intel details Crescent Island data center GPU with 480 GB of LPDDR5X

Intel's Xe3P inference GPU carries up to 480 GB of LPDDR5X in a 350 W air-cooled PCIe card and adds native FP4 and MXFP4 support, with sampling in H2 2026.

Intel · 2 days ago · 4 min →

Edge

NVIDIA unveils RTX Spark, a Grace-plus-Blackwell superchip with 128 GB unified memory

NVIDIA's RTX Spark pairs a 20-core Grace CPU with a 6,144-core Blackwell GPU over NVLink-C2C, claiming 1 petaflop of AI compute for on-device agents.

NVIDIA · 3 days ago · 4 min →

Hardware

Phoronix benchmarks NVIDIA Vera CPU against Intel and AMD x86 flagships

First independent Vera review shows a 1.5x geomean lead over a 128-core x86 processor and a 1.6x gain over Grace, with 1.2 TB/s of LPDDR5X bandwidth.

NVIDIA · 1 week ago · 4 min →

Hardware

AMD launches Instinct MI350P PCIe with 144 GB HBM3E

The MI350P drops AMD's MI350-class compute into a dual-slot PCIe card aimed at on-prem inference, claiming up to 4,600 TFLOPS at MXFP4.

AMD · 4 weeks ago · 4 min →

Hardware

AMD previews Instinct MI430X with 200+ FP64 TFLOPS

AMD projects its upcoming HPC accelerator at over 200 FP64 TFLOPS, more than six times the figure it cites for NVIDIA Rubin.