Who published Kokoro, and where?

hexgrad — 2025 (its official release).

Where can I find a visual explainer of Kokoro?

Right here — a free, interactive, animated walkthrough of the whole paper, with exhibits computed from the real formulas and verbatim quotes from the source.

Kokoro

TTS without the usual heavy machinery: an 82M-parameter non-autoregressive model built on StyleTTS2 and ISTFTNet turns phonemes and a split style vector into 24 kHz audio in one feed-forward pass, with no LLM, no diffusion, no codec, and no tokens.

hexgrad · 2025 · Speech / TTS

Related explainers

Orpheus TTS
Fish Audio S2
IndexTTS2
CosyVoice 2
Higgs Audio v2
Chatterbox
Spark-TTS