Generate speech
Generate natural-sounding speech from text with these models. Clone your own voice or choose from a range of languages and speaking styles.
Featured models

minimax / speech-02-hd
Text-to-Audio (T2A) that offers voice synthesis, emotional expression, and multilingual capabilities. Optimized for high-fidelity applications like voiceovers and audiobooks.
Updated 1 month, 1 week ago

jaaari / kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
Updated 4 months, 2 weeks ago

lucataco / xtts-v2
Coqui XTTS-v2: Multilingual Text To Speech Voice Cloning
Updated 1 year, 6 months ago
Recommended models

minimax / speech-02-turbo
Text-to-Audio (T2A) that offers voice synthesis, emotional expression, and multilingual capabilities. Designed for real-time applications with low latency.
Updated 1 month, 1 week ago

minimax / voice-cloning
Clone voices to use with MiniMax's speech-02-hd and speech-02-turbo.
Updated 1 month, 1 week ago

lucataco / csm-1b
CSM (Conversational Speech Model) is a speech generation model from Sesame that generates RVQ audio codes from text and audio inputs
Updated 2 months, 3 weeks ago

lucataco / orpheus-3b-0.1-ft
Orpheus 3B - high-quality, emotive text-to-speech
Updated 2 months, 3 weeks ago

cjwbw / voicecraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Updated 2 months, 4 weeks ago

fermatresearch / spanish-f5-tts
An F5-TTS model fine-tuned for Spanish
Updated 7 months ago

x-lance / f5-tts
F5-TTS, the new state-of-the-art in open source voice cloning
Updated 7 months, 4 weeks ago

platform-kit / mars5-tts
A novel speech model for insane prosody.
Updated 11 months, 2 weeks ago

chenxwh / openvoice
Updated to OpenVoice v2: Versatile Instant Voice Cloning
Updated 1 year ago

cjwbw / parler-tts
Lightweight text-to-speech (TTS) model, trained on 10.5K hours of audio data
Updated 1 year, 1 month ago

camenduru / metavoice
MetaVoice-1B: 1.2B-parameter base model trained on 100K hours of speech
Updated 1 year, 4 months ago

adirik / styletts2
Generates speech from text
Updated 1 year, 4 months ago

lucataco / pheme
Pheme generates a variety of conversational voices in 16 kHz for phone-call applications
Updated 1 year, 5 months ago

zsxkib / realistic-voice-cloning
Create song covers with any RVC v2 trained AI voice from audio files.
Updated 1 year, 6 months ago

cjwbw / seamless_communication
SeamlessM4T—Massively Multilingual & Multimodal Machine Translation
Updated 1 year, 9 months ago

awerks / neon-tts
NeonAI Coqui AI TTS Plugin.
Updated 1 year, 10 months ago

suno-ai / bark
🔊 Text-Prompted Generative Audio Model
Updated 2 years, 1 month ago

afiaka87 / tortoise-tts
Generate speech from text, clone voices from mp3 files. From James Betker AKA "neonbjb".
Updated 2 years, 10 months ago