elevenlabs/turbo-v2.5
High quality, low latency text-to-speech in 32 languages
🎯 Overview
Eleven Turbo v2.5 is ElevenLabs’ low-latency text-to-speech (TTS) model, built to balance audio quality against generation speed.
- Latency: ~250-300ms – ideal for real-time applications.
- Max Length: 40,000 characters per request (~40 minutes of audio).
- Performance: 3x faster than Multilingual v2 in most languages; 25% faster in English.
- New Languages: Adds Vietnamese, Hungarian, and Norwegian for the first time.
Perfect for voice agents, chatbots, interactive apps, and any scenario needing non-English support with premium quality.
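For inputs longer than the 40,000-character limit, the text has to be split across several requests. The snippet below is a minimal sketch of one way to do that, preferring sentence boundaries; `split_text` and `MAX_CHARS` are hypothetical names used for illustration and are not part of the model's API.

```python
MAX_CHARS = 40_000  # per-request limit stated in this README


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Breaks after sentence-ending punctuation when possible, then at the
    last space, and only as a last resort in the middle of a word.
    """
    chunks: list[str] = []
    remaining = text.strip()
    while len(remaining) > limit:
        window = remaining[:limit]
        # Index of the last sentence-ending punctuation followed by a space.
        cut = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
        if cut == -1:
            cut = window.rfind(" ")  # fall back to the last space
        if cut <= 0:
            cut = limit - 1  # no break point found: hard cut at the limit
        chunks.append(remaining[: cut + 1].strip())
        remaining = remaining[cut + 1 :].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each returned chunk can then be sent as the `text` input of a separate request and the resulting audio concatenated in order.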
🌍 Supported Languages (32 total)
| Language Code | Language |
|---|---|
| `en` | English |
| `ja` | Japanese |
| `zh` | Mandarin Chinese |
| `de` | German |
| `hi` | Hindi |
| `fr` | French |
| `ko` | Korean |
| `pt` | Portuguese |
| `it` | Italian |
| `es` | Spanish |
| `id` | Indonesian |
| `nl` | Dutch |
| `tr` | Turkish |
| `fil` | Filipino |
| `pl` | Polish |
| `sv` | Swedish |
| `bg` | Bulgarian |
| `ro` | Romanian |
| `ar` | Arabic |
| `cs` | Czech |
| `el` | Greek |
| `fi` | Finnish |
| `hr` | Croatian |
| `ms` | Malay |
| `sk` | Slovak |
| `da` | Danish |
| `ta` | Tamil |
| `uk` | Ukrainian |
| `ru` | Russian |
| `hu` | Hungarian |
| `no` | Norwegian |
| `vi` | Vietnamese |
🔧 Inputs
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `text` | string | ✅ Yes | - | The text to convert to speech. Max 40,000 characters. |
| `voice` | string | ✅ Yes | `pNInz6obpgDQGcFmaJgB` (Adam) | Voice ID or name (e.g., “Adam”, “Rachel”). Full voice library. |
| `stability` | number | No | 0.5 | Controls voice consistency (0.0 = variable, 1.0 = consistent). Range: 0-1. |
| `similarity_boost` | number | No | 0.75 | Boosts similarity to the original voice (0-1). |
| `style` | number | No | 0 | Exaggerates the described style (0-1). |
| `use_speaker_boost` | boolean | No | true | Applies speaker boost for better clarity. |
| `optimize_streaming_latency` | number | No | 0 | Optimize for streaming (0-4; higher = lower latency). |
| `output_format` | string | No | `mp3_44100_128` | Output audio format (e.g., `mp3_44100_128`, `pcm_16000`, `wav`). |
| `seed` | integer | No | - | Random seed for reproducibility. |
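As a rough illustration of how these parameters fit together, the sketch below assembles an input dictionary from the defaults in the table and checks the documented ranges before a request is made. `build_input` and `DEFAULTS` are hypothetical helpers, not part of any official client, and `output_format` is passed through unchecked because the full list of accepted values is not given here.

```python
from typing import Any

# Defaults copied from the inputs table above.
DEFAULTS: dict[str, Any] = {
    "voice": "pNInz6obpgDQGcFmaJgB",  # Adam
    "stability": 0.5,
    "similarity_boost": 0.75,
    "style": 0,
    "use_speaker_boost": True,
    "optimize_streaming_latency": 0,
    "output_format": "mp3_44100_128",
}


def build_input(text: str, **overrides: Any) -> dict[str, Any]:
    """Merge overrides into the defaults and validate the documented ranges."""
    if not text:
        raise ValueError("text is required")
    if len(text) > 40_000:
        raise ValueError("text exceeds the 40,000-character limit")

    # Reject parameter names that do not appear in the table ("seed" has no default).
    unknown = set(overrides) - set(DEFAULTS) - {"seed"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    params = {**DEFAULTS, **overrides, "text": text}

    for name in ("stability", "similarity_boost", "style"):
        if not 0.0 <= params[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if params["optimize_streaming_latency"] not in range(5):
        raise ValueError("optimize_streaming_latency must be an integer from 0 to 4")
    return params
```

For example, `build_input("Hello there", stability=0.3, seed=42)` returns a dictionary with the overridden values plus every remaining default, ready to hand to whichever client you use to call the model.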
🎵 Outputs
- Audio: Base64-encoded audio, MP3 by default or the format set in `output_format`.
- Content-Type: `audio/mpeg`, or the type matching the specified format.
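Since the audio comes back Base64-encoded, it needs decoding before it can be played or saved. A minimal sketch, assuming the response gives you the audio as a plain Base64 string; `decode_audio` and `save_audio` are hypothetical helper names:

```python
import base64
from pathlib import Path


def decode_audio(audio_b64: str) -> bytes:
    """Decode a Base64 string into raw audio bytes.

    validate=True makes malformed input raise instead of being silently skipped.
    """
    return base64.b64decode(audio_b64, validate=True)


def save_audio(audio_b64: str, path: str) -> int:
    """Decode the audio and write it to `path`; returns the number of bytes written."""
    data = decode_audio(audio_b64)
    Path(path).write_bytes(data)
    return len(data)
```

With the default `output_format`, saving to a path ending in `.mp3` is the natural choice; pick an extension that matches the format if you change it.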
🚀 Use Cases
- Conversational AI: Real-time voice responses in apps and agents.
- Multilingual Content: Narration, dubbing, and localization.
- Interactive Media: Games, virtual assistants, audiobooks.
- Enterprise: High-volume, low-latency production.
Pro Tip: Turbo v2.5 sits between Flash v2.5 (ultra-low latency) and Multilingual v2 (highest quality), which makes it the sweet spot when you need both speed and quality.