Generate speech

Generate natural-sounding speech from text with these powerful models. Clone your own voice or pick from a variety of languages and speaking styles.

Recommended models

minimax / speech-02-turbo

A Text-to-Audio (T2A) model that offers voice synthesis, emotional expression, and multilingual capabilities. Designed for real-time applications with low latency.

Updated 1 month, 1 week ago

52.7K runs

minimax / voice-cloning

Clone voices to use with Minimax's speech-02-hd and speech-02-turbo

Updated 1 month, 1 week ago

4.1K runs

lucataco / csm-1b

CSM (Conversational Speech Model) is a speech generation model from Sesame that generates RVQ audio codes from text and audio inputs

Updated 2 months, 3 weeks ago

437 runs

lucataco / orpheus-3b-0.1-ft

Orpheus 3B: high-quality, emotive text-to-speech

Updated 2 months, 3 weeks ago

16.3K runs

cjwbw / voicecraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Updated 2 months, 4 weeks ago

10.3K runs

fermatresearch / spanish-f5-tts

An F5-TTS model fine-tuned for Spanish

Updated 7 months ago

530 runs

x-lance / f5-tts

F5-TTS, the new state of the art in open-source voice cloning

Updated 7 months, 4 weeks ago

25K runs

platform-kit / mars5-tts

A novel speech model for insane prosody.

Updated 11 months, 2 weeks ago

479 runs

chenxwh / openvoice

Updated to OpenVoice v2: Versatile Instant Voice Cloning

Updated 1 year ago

60.8K runs

cjwbw / parler-tts

A lightweight text-to-speech (TTS) model trained on 10.5K hours of audio data

Updated 1 year, 1 month ago

2.5K runs

camenduru / metavoice

MetaVoice-1B: 1.2B parameter base model trained on 100K hours of speech

Updated 1 year, 4 months ago

12.2K runs

adirik / styletts2

Generates speech from text

Updated 1 year, 4 months ago

131.1K runs

lucataco / pheme

Pheme generates a variety of conversational voices in 16 kHz for phone-call applications

Updated 1 year, 5 months ago

525 runs

zsxkib / realistic-voice-cloning

Create song covers with any RVC v2 trained AI voice from audio files.

Updated 1 year, 6 months ago

786.5K runs

cjwbw / seamless_communication

SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

Updated 1 year, 9 months ago

83.6K runs

awerks / neon-tts

NeonAI Coqui AI TTS Plugin.

Updated 1 year, 10 months ago

128.7K runs

suno-ai / bark

🔊 Text-Prompted Generative Audio Model

Updated 2 years, 1 month ago

299.1K runs

afiaka87 / tortoise-tts

Generate speech from text, clone voices from mp3 files. From James Betker AKA "neonbjb".

Updated 2 years, 10 months ago

170.8K runs