Transcribe speech
Transcribe audio to text in multiple languages.
Our pick: incredibly-fast-whisper
For most needs, use incredibly-fast-whisper. It really is fast (about 10x quicker than the original Whisper), and it's cheap, accurate, and supports a wide range of languages.
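Here is a rough sketch of what calling the model from Python might look like. It assumes the `replicate` client package is installed and a `REPLICATE_API_TOKEN` environment variable is set. It also assumes the model takes its audio file under an input key named `audio`. The helper names `build_input` and `transcribe` are made up for illustration, so check the model page for the actual input schema and for whether the model reference needs a version suffix.

```python
from typing import Any

# Model reference taken from the list on this page. Depending on the client
# version, a ":<version-id>" suffix may be required (assumption).
MODEL = "vaibhavs10/incredibly-fast-whisper"


def build_input(audio_url: str, **extra: Any) -> dict[str, Any]:
    """Build the input payload; the "audio" key is an assumed input name."""
    if not audio_url:
        raise ValueError("audio_url must not be empty")
    # Any extra keyword arguments are passed through untouched.
    return {"audio": audio_url, **extra}


def transcribe(audio_url: str, model: str = MODEL) -> Any:
    """Run the model on Replicate and return whatever output it produces."""
    # Imported lazily so build_input() works without the package installed.
    import replicate  # third-party: pip install replicate

    return replicate.run(model, input=build_input(audio_url))
```

With a token configured, `transcribe("https://example.com/sample.mp3")` (a placeholder URL) would send the request. The shape of the returned output depends on the model, so inspect it before relying on specific fields.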
For speaker labels: whisper-diarization
Need to label speakers or get word-level timestamps? whisper-diarization has you covered. It's pricier than incredibly-fast-whisper, but worth it for the extra features.
For translation: seamless_communication
To translate speech between languages, seamless_communication is your friend. Go from Spanish audio to German text or French speech with ease.
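The three recommendations above boil down to a small decision rule, sketched here in Python. The function name `pick_model` and its flags are hypothetical. The order of the checks (translation first, then speaker labels or word-level timestamps) is an assumption for the case where several needs apply at once, which the guide doesn't address.

```python
def pick_model(
    speaker_labels: bool = False,
    word_timestamps: bool = False,
    translation: bool = False,
) -> str:
    """Return the model this guide recommends for the given needs."""
    if translation:
        # Speech-to-text or speech-to-speech across languages.
        return "cjwbw/seamless_communication"
    if speaker_labels or word_timestamps:
        # Pricier, but adds speaker labels and word-level timestamps.
        return "thomasmol/whisper-diarization"
    # Default pick for plain transcription.
    return "vaibhavs10/incredibly-fast-whisper"
```

For example, `pick_model(speaker_labels=True)` returns `"thomasmol/whisper-diarization"`, while calling it with no arguments falls back to the default pick.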
Recommended models
openai / whisper
Convert speech in audio to text
vaibhavs10 / incredibly-fast-whisper
whisper-large-v3, incredibly fast, powered by Hugging Face Transformers! 🤗
thomasmol / whisper-diarization
⚡️ Fast audio transcription | whisper large-v3 | speaker diarization | word & sentence level timestamps | prompt | hotwords
victor-upmeet / whisperx
Accelerated transcription, word-level timestamps and diarization with whisperX large-v3
cjwbw / seamless_communication
SeamlessM4T—Massively Multilingual & Multimodal Machine Translation
nvlabs / parakeet-rnnt-1.1b
🗣️ Nvidia + Suno.ai's speech-to-text conversion with high accuracy and efficiency 📝