Search results

Orpheus TTS
...ma-3.2-3B]] architecture, it takes a novel approach, using large language models with audio tokens instead of traditional TTS-specific architectures. ...onventional text-to-speech systems by using a modified version of Meta's Llama-3.2-3B language model as its foundation. It takes in a text prompt and generates audio toke ...

4 KB (486 words) - 16:06, 20 September 2025
VibeVoice
...oice-community/VibeVoice; https://huggingface.co/aoi-ot/VibeVoice-Large ... VibeVoice uses a hybrid architecture combining large language models with diffusion-based audio generation. The system uses two specialized toke ...

7 KB (847 words) - 02:53, 23 September 2025
X-Codec
..."Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model," published at AAAI 2025. ...ce Quantization (FSQ) for stability and compatibility with causal language models. ...

4 KB (533 words) - 02:33, 23 December 2025
