CosyVoice 2

Streaming and offline speech synthesis in one model — an FSQ semantic tokenizer, a Qwen2.5-0.5B text-speech LM, and a chunk-aware causal flow-matching mel decoder.

Du et al. · 2024 · Speech / TTS · arXiv:2412.10117

Questions

What is CosyVoice 2?

A single model for both streaming and offline speech synthesis. It combines an FSQ (finite scalar quantization) semantic tokenizer, a Qwen2.5-0.5B text-speech LM, and a chunk-aware causal flow-matching mel decoder.

Who published CosyVoice 2, and where?

Du et al., 2024 (arXiv:2412.10117).

Where can I find a visual explainer of CosyVoice 2?

Right here: a free, interactive, animated walkthrough of the whole paper, with exhibits computed from the real formulas and verbatim quotes from the source.

Related explainers

Orpheus TTS
Fish Audio S2
IndexTTS2
Higgs Audio v2
Chatterbox
Spark-TTS
Kokoro