elevenlabs/flash-v2.5

ElevenLabs's fastest speech synthesis model

ElevenLabs Flash v2.5 is the fastest speech synthesis model from ElevenLabs, designed for real-time applications and conversational AI. It delivers high-quality speech with ultra-low latency (~75ms) across 32 languages.

Flash v2.5 balances speed and quality, making it ideal for interactive applications while maintaining natural-sounding output.

Key features

  • Ultra-low latency: ~75ms, perfect for real-time voice agents and chatbots
  • 32 languages: All languages from Multilingual v2 plus Hungarian, Norwegian, and Vietnamese
  • 40,000-character limit: Generate up to ~40 minutes of audio in a single request
  • 50% lower price per character compared to Multilingual v2
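
Text longer than the 40,000-character limit listed above has to be sent as several requests. The following is a minimal sketch of one way to split it, assuming that breaking at whitespace is acceptable. The function name `split_for_requests` and the greedy packing strategy are illustrative choices, not part of the model's API.

```python
MAX_CHARS = 40_000  # per-request character limit stated in the feature list


def split_for_requests(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Greedily pack whitespace-separated words into chunks of at most `limit` characters."""
    if limit < 1:
        raise ValueError("limit must be positive")
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is hard-split into limit-sized pieces.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this word would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Note that this sketch collapses runs of whitespace (including newlines) into single spaces. If paragraph breaks matter for prosody, a splitter that works on sentence or paragraph boundaries may be a better fit.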

Supported languages (32)

Code  Language          Code  Language
en    English           pl    Polish
ja    Japanese          sv    Swedish
zh    Mandarin Chinese  bg    Bulgarian
de    German            ro    Romanian
hi    Hindi             ar    Arabic
fr    French            cs    Czech
ko    Korean            el    Greek
pt    Portuguese        fi    Finnish
it    Italian           hr    Croatian
es    Spanish           ms    Malay
id    Indonesian        sk    Slovak
nl    Dutch             da    Danish
tr    Turkish           ta    Tamil
fil   Filipino          uk    Ukrainian
ru    Russian           hu    Hungarian
no    Norwegian         vi    Vietnamese

Inputs

Parameter         Type    Default  Description
prompt            string  -        The text to convert to speech
voice             string  Rachel   Voice choice for speech generation
language_code     string  en       Language code (e.g., en, es, fr)
stability         number  0.5      Voice consistency (0.0–1.0)
similarity_boost  number  0.75     Similarity to the original voice (0.0–1.0)
style             number  0        Style exaggeration (0.0–1.0)
speed             number  1        Speed of speech (0.7–1.2)
previous_text     string  -        Previous text for context
next_text         string  -        Next text for context
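
As an illustration of the parameters above, here is a minimal sketch that assembles an input dictionary using the documented defaults and checks values against the documented ranges and language codes before a request is sent. The helper `build_input` and the client-side validation are assumptions made for this example; the table does not say how the service itself treats out-of-range values.

```python
# The 32 language codes from the supported-languages table.
SUPPORTED_LANGUAGES = frozenset(
    "en ja zh de hi fr ko pt it es id nl tr fil ru no "
    "pl sv bg ro ar cs el fi hr ms sk da ta uk hu vi".split()
)

# Documented (min, max) ranges for the numeric parameters.
NUMERIC_RANGES = {
    "stability": (0.0, 1.0),
    "similarity_boost": (0.0, 1.0),
    "style": (0.0, 1.0),
    "speed": (0.7, 1.2),
}


def build_input(prompt, voice="Rachel", language_code="en", stability=0.5,
                similarity_boost=0.75, style=0.0, speed=1.0,
                previous_text=None, next_text=None):
    """Return an input dict for the model, raising ValueError on invalid values."""
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    if language_code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language_code: {language_code!r}")
    payload = {
        "prompt": prompt,
        "voice": voice,
        "language_code": language_code,
        "stability": stability,
        "similarity_boost": similarity_boost,
        "style": style,
        "speed": speed,
    }
    for name, (low, high) in NUMERIC_RANGES.items():
        if not low <= payload[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    # The context fields have no default, so include them only when provided.
    if previous_text is not None:
        payload["previous_text"] = previous_text
    if next_text is not None:
        payload["next_text"] = next_text
    return payload
```

For example, `build_input("Hola, ¿qué tal?", language_code="es", speed=1.1)` returns a dictionary that can be passed as the input of whichever client is used to call the model. When long text is sent in several requests, `previous_text` and `next_text` can carry the neighbouring passages as context.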

Use cases

  • Voice agents and chatbots: Ultra-low latency makes it perfect for conversational AI
  • Interactive apps: Games and applications that need immediate audio response
  • Large-scale processing: Efficient for bulk text-to-speech conversion
  • Multilingual content: Narration, dubbing, and localization across 32 languages

Choosing between ElevenLabs models

  • Flash v2.5: Fastest (~75ms), best for real-time and cost-sensitive use cases
  • Turbo v2.5: Balanced quality and speed (~250ms), same language and character support
  • Multilingual v2: Highest quality, best for professional content and audiobooks
  • v3: Most expressive, with 70+ languages and multi-speaker dialogue support