Search results

  • ...including a wide range of speaking styles for more natural and spontaneous speech generation. ...216k hours of speech, making it one of the largest openly-available speech datasets. Emilia-Large combines the original 101k-hour Emilia dataset (licensed unde ...
    4 KB (554 words) - 03:43, 19 September 2025
  • ...is a neural audio codec developed by [[Neuphonic]], designed for efficient speech tokenization and high-quality audio compression at relatively low bitrates. ...ces a single quantized vector output, making it well-suited for downstream Speech Language Model (SpeechLM) training. ...
    1 KB (176 words) - 02:33, 23 December 2025
  • .../|website=https://indextts2.org}}'''IndexTTS2''' is an open-source text-to-speech model developed by Bilibili's AI Platform Department loosely based on [[Tor ...nally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech."<ref>https://arxiv.org/abs/2506.21619</ref> ...
    8 KB (986 words) - 20:46, 21 September 2025