Higgs Audio v2

Naturalness at scale via DualFFN — a Llama-3.2-3B with a forked audio FFN emits interleaved text + delay-patterned RVQ codes, pretrained on 10M hours with no post-training.

Boson AI · 2025 · Speech / TTS.

A free, interactive, animated visual explainer of Higgs Audio v2 — every exhibit computed from the real formulas, with verbatim quotes from the source.
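
The "delay-patterned RVQ codes" in the summary above can be made concrete with a small sketch. It shows a generic delay pattern for multi-codebook audio tokens, in which codebook k is shifted right by k steps so that each step can emit one code per codebook. The codebook count, the `PAD` filler value, and the function names are illustrative assumptions, not taken from the Higgs Audio v2 release, and the exact offsets the model uses may differ.

```python
from typing import List

PAD = -1  # hypothetical filler id for slots that hold no code


def apply_delay_pattern(codes: List[List[int]], pad: int = PAD) -> List[List[int]]:
    """Shift codebook k right by k steps.

    `codes` is laid out as [num_codebooks][num_frames]. The result has
    num_frames + num_codebooks - 1 columns, padded where no code exists.
    """
    num_codebooks = len(codes)
    return [
        [pad] * k + list(row) + [pad] * (num_codebooks - 1 - k)
        for k, row in enumerate(codes)
    ]


def revert_delay_pattern(delayed: List[List[int]]) -> List[List[int]]:
    """Undo apply_delay_pattern and recover the aligned [codebook][frame] grid."""
    num_codebooks = len(delayed)
    num_frames = len(delayed[0]) - (num_codebooks - 1)
    return [row[k:k + num_frames] for k, row in enumerate(delayed)]


# Three codebooks, three frames; the tens digit marks the codebook.
codes = [
    [11, 12, 13],
    [21, 22, 23],
    [31, 32, 33],
]
delayed = apply_delay_pattern(codes)
for row in delayed:
    print(row)
```

Reading the delayed grid column by column gives the order in which codes would be generated: the first codebook of a frame comes out first, and the finer codebooks of that frame follow on later steps, conditioned on what was already emitted.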

Questions

What is Higgs Audio v2?
Higgs Audio v2 is a speech and TTS model from Boson AI that targets naturalness at scale via DualFFN: a Llama-3.2-3B with a forked audio FFN emits interleaved text and delay-patterned RVQ codes, pretrained on 10M hours with no post-training.
Who published Higgs Audio v2, and where?
Boson AI released it in 2025 (its official release).
Where can I find a visual explainer of Higgs Audio v2?
Right here — a free, interactive, animated walkthrough of the whole paper, with exhibits computed from the real formulas and verbatim quotes from the source.

Related explainers