Pretraining Large Language Models with NVFP4

Pretrain in 4-bit floating point — micro-block scaling and Hadamard transforms match FP8 over 10T tokens.

NVIDIA · arXiv 2025 · Foundations

A free, interactive, animated visual explainer of Pretraining Large Language Models with NVFP4 — every exhibit computed from the real formulas, with verbatim quotes from the source.
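
As a rough illustration of the micro-block scaling named in the tagline, the sketch below quantizes a flat list of numbers in small blocks, giving each block its own scale before rounding onto a 4-bit grid. It is a minimal sketch under stated assumptions, not the paper's training recipe: the block size of 16 and the E2M1 value grid are assumed, the scale is kept as an ordinary Python float rather than stored in a low-precision format, rounding is plain round-to-nearest, and the function names are hypothetical.

```python
# Magnitudes representable by a 4-bit E2M1 float (assumed grid; sign handled separately).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK = 16  # assumed micro-block size


def quantize_block(block):
    """Return (scale, codes) such that scale * code approximates each input value."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Choose the scale so the block's largest magnitude lands on the top grid value.
    scale = amax / E2M1_GRID[-1]
    codes = []
    for x in block:
        # Round the scaled magnitude to the nearest grid point (not stochastic rounding).
        mag = min(E2M1_GRID, key=lambda g: abs(abs(x) / scale - g))
        codes.append(-mag if x < 0 else mag)
    return scale, codes


def fake_quantize(values, block=BLOCK):
    """Quantize then dequantize a flat list, one block at a time."""
    out = []
    for i in range(0, len(values), block):
        scale, codes = quantize_block(values[i:i + block])
        out.extend(scale * c for c in codes)
    return out


# One block of small values followed by one block containing a large outlier.
values = [0.01 * i for i in range(16)] + [100.0] + [0.0] * 15
per_block = fake_quantize(values)              # each 16-value block has its own scale
one_scale = fake_quantize(values, block=32)    # a single scale shared by all 32 values
```

With per-block scales, the outlier in the second block does not affect the first block, so its small values keep distinct nonzero levels. With one shared scale, the outlier sets the range for everything and the small values round to zero.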

Questions

What is Pretraining Large Language Models with NVFP4?
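A 2025 paper from NVIDIA on pretraining large language models in the 4-bit floating-point format NVFP4. It reports that, with micro-block scaling and Hadamard transforms, NVFP4 pretraining matches FP8 over 10T tokens.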
Who published Pretraining Large Language Models with NVFP4, and where?
NVIDIA — arXiv 2025 (arXiv:2509.25149).
Where can I find a visual explainer of Pretraining Large Language Models with NVFP4?
Right here — a free, interactive, animated walkthrough of the whole paper, with exhibits computed from the real formulas and verbatim quotes from the source.
