Ttswikiadmin: Created page with "'''NeuCodec''' is a neural audio codec developed by Neuphonic, designed for efficient speech tokenization and high-quality audio compression at relatively low bitrates. === Technical Specifications === * '''Bitrate:''' 0.8 kbps * '''Output sample rate:''' 24 kHz * '''Frame rate:''' 50 Hz * '''Quantization:''' Finite Scalar Quantization (FSQ) with a single codebook === Architecture === NeuCodec is largely based on extending the work of X-Codec 2.0. It e..."

2025-12-23T02:33:27Z

'''NeuCodec''' is a neural audio codec developed by [[Neuphonic]], designed for efficient speech tokenization and high-quality audio compression at relatively low bitrates.

=== Technical Specifications ===

* '''Bitrate:''' 0.8 kbps
* '''Output sample rate:''' 24 kHz
* '''Frame rate:''' 50 Hz
* '''Quantization:''' Finite Scalar Quantization (FSQ) with a single codebook
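The figures above are consistent with one another. As a rough check, dividing the bitrate by the frame rate gives the number of bits carried per frame, and with a single codebook that also fixes how many distinct tokens the codebook can hold. The sketch below only rearranges the listed numbers; the codebook size it prints is derived from them and is not a figure stated on this page, and the helper names are illustrative.

```python
import math

BITRATE_BPS = 800    # 0.8 kbps, from the specification list
FRAME_RATE_HZ = 50   # frames (tokens) per second, from the specification list


def bits_per_frame(bitrate_bps: int, frame_rate_hz: int) -> float:
    """Bits available to describe one frame."""
    return bitrate_bps / frame_rate_hz


def implied_codebook_size(bits: float) -> int:
    """Number of distinct tokens a single codebook with this many bits can index."""
    return 2 ** round(bits)


def tokens_for_duration(seconds: float, frame_rate_hz: int = FRAME_RATE_HZ) -> int:
    """Tokens needed to cover a clip of the given length (one token per frame)."""
    return math.ceil(seconds * frame_rate_hz)


bits = bits_per_frame(BITRATE_BPS, FRAME_RATE_HZ)
print(bits)                         # 16.0
print(implied_codebook_size(bits))  # 65536
print(tokens_for_duration(10))      # 500
```

Under these numbers, ten seconds of speech corresponds to 500 tokens, which is the sequence length a downstream language model would see.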

=== Architecture ===
NeuCodec largely extends [[X-Codec|X-Codec 2.0]]. It employs a dual-encoder approach, combining an audio encoder ([[BigCodec]]) with a semantic encoder (Wav2Vec2-BERT). Because FSQ is used with a single codebook, each frame is represented by a single quantized vector, which makes the codec well-suited for downstream Speech Language Model (SpeechLM) training.
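To illustrate how finite scalar quantization can turn one latent vector into one token, the following is a minimal, simplified sketch in pure Python. It is not NeuCodec's implementation: the level configuration (<code>[4] * 8</code>, giving 65,536 codes) is an assumption chosen to match a 16-bit token, the rounding scheme omits the half-step offset that the published FSQ method uses for even level counts, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def fsq_codes(z: Sequence[float], levels: Sequence[int]) -> List[int]:
    """Quantize each latent dimension independently to an integer in [0, L-1]."""
    if len(z) != len(levels):
        raise ValueError("latent and levels must have the same length")
    codes = []
    for value, n_levels in zip(z, levels):
        bounded = math.tanh(value)                        # squash into [-1, 1]
        scaled = (bounded + 1.0) / 2.0 * (n_levels - 1)   # map to [0, L-1]
        codes.append(int(math.floor(scaled + 0.5)))       # round half up
    return codes


def codes_to_index(codes: Sequence[int], levels: Sequence[int]) -> int:
    """Pack per-dimension codes into one token id (mixed-radix encoding)."""
    index, base = 0, 1
    for code, n_levels in zip(codes, levels):
        index += code * base
        base *= n_levels
    return index


def index_to_codes(index: int, levels: Sequence[int]) -> List[int]:
    """Inverse of codes_to_index."""
    codes = []
    for n_levels in levels:
        codes.append(index % n_levels)
        index //= n_levels
    return codes


LEVELS = [4] * 8  # assumed configuration: 4**8 = 65536 possible tokens
latent = [0.3, -1.2, 2.0, 0.0, -0.4, 1.1, -2.5, 0.7]  # made-up encoder output

codes = fsq_codes(latent, LEVELS)
token = codes_to_index(codes, LEVELS)
print(codes, token)
```

The point of the design is that no learned codebook lookup is required: the "codebook" is the implicit grid of all level combinations, and a frame maps to exactly one integer token.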

=== Features ===

* Compresses and reconstructs audio with near-inaudible reconstruction loss
* Takes 16 kHz input audio and reconstructs it at 24 kHz (built-in upsampling)
* Commercial use permitted
* Pre-encoded datasets available (Emilia-YODAS compressed from 1.7 TB to 41 GB)
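The dataset figures in the last item can be put in perspective with a short calculation. The sketch below assumes decimal units (1 TB = 10<sup>12</sup> bytes) and, for the duration estimate, assumes that the 41 GB consists purely of codec tokens at 0.8 kbps with no container or metadata overhead; both are assumptions, so the results are order-of-magnitude estimates only.

```python
ORIGINAL_BYTES = 1.7e12   # 1.7 TB, decimal units assumed
ENCODED_BYTES = 41e9      # 41 GB, decimal units assumed
BITRATE_BPS = 800         # 0.8 kbps

# Size reduction achieved by storing tokens instead of the original audio files.
compression_ratio = ORIGINAL_BYTES / ENCODED_BYTES

# Hours of audio that 41 GB of tokens would represent at 0.8 kbps,
# assuming the files contain nothing but token payload.
implied_hours = ENCODED_BYTES * 8 / BITRATE_BPS / 3600

print(f"compression ratio: about {compression_ratio:.1f}x")
print(f"implied duration: about {implied_hours:,.0f} hours")
```

Under these assumptions the pre-encoded version is roughly 41 times smaller than the source files.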

=== Applications ===
NeuCodec serves as the audio codec for [[NeuTTS Air]], Neuphonic's on-device text-to-speech model with voice cloning capabilities. It is intended for researchers and developers who are building text-to-speech systems and need efficient speech tokenization without developing their own codec.

=== Availability ===
NeuCodec is available on Hugging Face and GitHub under the <code>neuphonic/neucodec</code> repository and can be installed via pip.

[[Category:Neural audio codecs]]
