Chatterbox

A single scalar dials emotion: one number becomes one conditioning token in a 0.5B Llama LM. Around it sit classifier-free guidance at the autoregressive (AR) token stage, an S3Gen flow-matching codec, and a watermark on every clip.
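The two mechanisms named above can be sketched in a few lines. This is a minimal illustration, not Chatterbox's actual code: it assumes the scalar is mapped to a token by a bias-free linear projection and that guidance uses the standard classifier-free form on the logits. The names scalar_to_token, cfg_logits, d_model, and scale are hypothetical, and the numbers are toy values.

```python
import random


def scalar_to_token(value, weights):
    """Project one scalar onto a d-dimensional conditioning vector.

    Assumption: a bias-free linear map, so value = 0 yields the zero vector.
    """
    return [value * w for w in weights]


def cfg_logits(cond, uncond, scale):
    """Standard classifier-free guidance applied to next-token logits.

    scale = 0 returns the conditional logits unchanged; larger values
    push the result further away from the unconditional prediction.
    """
    return [c + scale * (c - u) for c, u in zip(cond, uncond)]


# Toy projection weights for an 8-dimensional model (hypothetical size).
rng = random.Random(0)
d_model = 8
weights = [rng.gauss(0.0, 1.0) for _ in range(d_model)]

# One emotion scalar becomes exactly one conditioning token.
token = scalar_to_token(0.5, weights)

# Toy logits over a 3-entry vocabulary, with and without conditioning.
cond = [2.0, 1.0, 0.0]
uncond = [1.0, 1.0, 1.0]
guided = cfg_logits(cond, uncond, 0.5)
print(guided)  # [2.5, 1.0, -0.5]
```

In this sketch guidance is applied once per decoding step, before sampling the next speech token, which is why it costs a second forward pass for the unconditional logits.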

Resemble AI · 2025 · Speech / TTS

A free, interactive, animated visual explainer of Chatterbox — every exhibit computed from the real formulas, with verbatim quotes from the source.

Questions

What is Chatterbox?
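Chatterbox is a text-to-speech system from Resemble AI built around a 0.5B Llama LM. A single scalar controls emotion: that one number becomes one conditioning token. The system also applies classifier-free guidance at the autoregressive (AR) token stage, uses an S3Gen flow-matching codec, and watermarks every clip.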
Who published Chatterbox, and where?
Resemble AI published it in 2025, as the model's official release.
Where can I find a visual explainer of Chatterbox?
Right here — a free, interactive, animated walkthrough of the whole paper, with exhibits computed from the real formulas and verbatim quotes from the source.
