Transcribe audio to text in multiple languages.
For most needs, use vaibhavs10/incredibly-fast-whisper. It really is fast (10x quicker than the original Whisper), cheap, and accurate, and it supports many languages.
Need to label speakers or get word-level timestamps? victor-upmeet/whisperx has you covered. It is slightly more expensive than incredibly-fast-whisper but still very fast.
To translate speech between languages, cjwbw/seamless_communication is your friend.
This unified model handles multiple tasks without relying on several separate models.
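The guidance above can be sketched as a small selection helper. This is only an illustration: the `Needs` class and `pick_model` function are hypothetical names, not part of any library, and the rule that translation takes priority over speaker labels or timestamps is an assumption rather than something this page states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Needs:
    """Hypothetical description of what a transcription job requires."""
    translate: bool = False        # speech translated into another language
    speaker_labels: bool = False   # diarization: who spoke when
    word_timestamps: bool = False  # per-word timing


def pick_model(needs: Needs) -> str:
    """Return the model name suggested by the guide for the given needs."""
    # Assumption: translation wins when several needs are set at once.
    if needs.translate:
        return "cjwbw/seamless_communication"
    if needs.speaker_labels or needs.word_timestamps:
        return "victor-upmeet/whisperx"
    # Default for most needs: fast, cheap, multilingual transcription.
    return "vaibhavs10/incredibly-fast-whisper"
```

For example, `pick_model(Needs(speaker_labels=True))` returns the whisperx model name, which could then be passed to whatever client you use to run models.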
Featured models
openai/gpt-4o-transcribe
A speech-to-text model that uses GPT-4o to transcribe audio
Updated 1 day, 2 hours ago
12.7K runs
victor-upmeet/whisperx
Accelerated transcription, word-level timestamps and diarization with whisperX large-v3
Updated 1 year, 1 month ago
4.4M runs
vaibhavs10/incredibly-fast-whisper
whisper-large-v3, incredibly fast, powered by Hugging Face Transformers! 🤗
Updated 1 year, 7 months ago
17.4M runs
Recommended Models
openai/gpt-4o-mini-transcribe
A speech-to-text model that uses GPT-4o mini to transcribe audio
Updated 1 day, 2 hours ago
1.4K runs
thomasmol/whisper-diarization
⚡️ Blazing fast audio transcription with speaker diarization | Whisper Large V3 Turbo | word & sentence level timestamps | prompt
Updated 7 months, 3 weeks ago
3.1M runs
openai/whisper
Convert speech in audio to text
Updated 10 months, 2 weeks ago
133.4M runs
nvidia/parakeet-rnnt-1.1b
🗣️ Nvidia + Suno.ai's speech-to-text conversion with high accuracy and efficiency 📝
Updated 1 year, 9 months ago
18.1K runs
adidoes/whisperx-video-transcribe
ASR from video URL based on whisperx using large-v2 model
Updated 2 years ago
19.6K runs
cjwbw/seamless_communication
SeamlessM4T—Massively Multilingual & Multimodal Machine Translation
Updated 2 years, 1 month ago
88K runs
daanelson/whisperx
Accelerated transcription of audio using WhisperX
Updated 2 years, 3 months ago
88.9K runs
m1guelpf/whisper-subtitles
Generate subtitles from an audio file, using OpenAI's Whisper model.
Updated 3 years ago
73.8K runs