Voxtral TTS
Mistral's 4B-parameter multilingual text-to-speech model, with 20 expressive voice presets across 9 languages. Based on mistralai/Voxtral-4B-TTS-2603.
Model Variants
| Model | Format | HuggingFace |
|---|---|---|
| mlx-community/Voxtral-4B-TTS-2603-mlx-bf16 | bfloat16 | Model Card |
Usage
Streaming
Voxtral TTS supports chunked streaming output for lower-latency playback.
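As a rough illustration of how chunked output can be consumed, the sketch below handles each audio chunk as soon as it arrives instead of waiting for the full utterance. It does not call the model: `fake_tts_stream` is a hypothetical stand-in that yields sine-wave chunks, `write_stream_to_wav` is an illustrative helper, and the 24 kHz sample rate is an assumption to verify against the model card.

```python
import math
import struct
import wave
from typing import Iterable, Iterator, List

SAMPLE_RATE = 24_000  # assumption: output sample rate in Hz; verify against the model card


def fake_tts_stream(duration_s: float = 1.0, chunk_s: float = 0.25) -> Iterator[List[float]]:
    """Hypothetical stand-in for a streaming TTS call.

    Yields chunks of float samples in [-1, 1] (a 440 Hz tone), one chunk at a time.
    """
    total = int(duration_s * SAMPLE_RATE)
    step = int(chunk_s * SAMPLE_RATE)
    for start in range(0, total, step):
        end = min(start + step, total)
        yield [0.3 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(start, end)]


def write_stream_to_wav(chunks: Iterable[List[float]], path: str,
                        sample_rate: int = SAMPLE_RATE) -> int:
    """Write each chunk to a 16-bit mono WAV file as it arrives; return frames written."""
    frames = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        for chunk in chunks:
            # Clip to [-1, 1], then scale to the signed 16-bit range.
            ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in chunk]
            wav.writeframes(struct.pack(f"<{len(ints)}h", *ints))
            frames += len(ints)
    return frames


if __name__ == "__main__":
    written = write_stream_to_wav(fake_tts_stream(), "stream_demo.wav")
    print(f"wrote {written} frames")
```

In a real pipeline, the generator would be replaced by the model's streaming output and the WAV writer by an audio output device, so playback can start after the first chunk.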
Available Voices
English
| Voice | Style |
|---|---|
| casual_male | Casual |
| casual_female | Casual |
| cheerful_female | Cheerful |
| neutral_male | Neutral |
| neutral_female | Neutral |
Multilingual
| Voice | Language |
|---|---|
| fr_male, fr_female | French |
| es_male, es_female | Spanish |
| de_male, de_female | German |
| it_male, it_female | Italian |
| pt_male, pt_female | Portuguese |
| nl_male, nl_female | Dutch |
| ar_male | Arabic |
| hi_male, hi_female | Hindi |
Supported Languages
English, French, Spanish, German, Italian, Portuguese, Dutch, Arabic, Hindi.
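The voice presets and languages listed above can be captured in a small lookup table, which is handy for validating a voice name before passing it to the model. This is only a sketch: `VOICE_LANGUAGE`, `voices_for`, and `validate_voice` are hypothetical names for illustration and are not part of any library.

```python
from typing import Dict, List

# Voice presets and their languages, transcribed from the tables on this page.
VOICE_LANGUAGE: Dict[str, str] = {
    "casual_male": "English",
    "casual_female": "English",
    "cheerful_female": "English",
    "neutral_male": "English",
    "neutral_female": "English",
    "fr_male": "French",
    "fr_female": "French",
    "es_male": "Spanish",
    "es_female": "Spanish",
    "de_male": "German",
    "de_female": "German",
    "it_male": "Italian",
    "it_female": "Italian",
    "pt_male": "Portuguese",
    "pt_female": "Portuguese",
    "nl_male": "Dutch",
    "nl_female": "Dutch",
    "ar_male": "Arabic",
    "hi_male": "Hindi",
    "hi_female": "Hindi",
}


def voices_for(language: str) -> List[str]:
    """Return the preset names for a language (case-insensitive), in table order."""
    wanted = language.strip().lower()
    return [voice for voice, lang in VOICE_LANGUAGE.items() if lang.lower() == wanted]


def validate_voice(voice: str) -> str:
    """Return the voice name unchanged if it is a known preset, else raise ValueError."""
    if voice not in VOICE_LANGUAGE:
        known = ", ".join(sorted(VOICE_LANGUAGE))
        raise ValueError(f"Unknown voice {voice!r}. Known presets: {known}")
    return voice


if __name__ == "__main__":
    print(voices_for("German"))
    print(validate_voice("casual_male"))
```

Note that Arabic has a single preset (`ar_male`), so code that picks a voice by language and gender needs a fallback for that case.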
License
Voxtral TTS weights are released under CC-BY-NC (non-commercial use). Check the model card for full licensing details.