stayallive / whisper-subtitles

Generate subtitles (.srt and .vtt) from audio files using OpenAI's Whisper models.


Run time and cost

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 5 minutes, but prediction time varies significantly with the inputs.

Readme

This model uses faster-whisper, a reimplementation of OpenAI’s Whisper model built on CTranslate2, a fast inference engine for Transformer models.
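As a rough illustration of what transcription with faster-whisper looks like, here is a minimal sketch. It is not this model's actual prediction code: the helper names `load_model` and `transcribe_to_cues`, the default model size, and the device settings are assumptions chosen for the example. Only `WhisperModel` and its `transcribe` method come from the faster-whisper package.

```python
from typing import List, Tuple

# (start seconds, end seconds, text) for one subtitle cue
Cue = Tuple[float, float, str]


def load_model(name: str = "small", device: str = "cuda",
               compute_type: str = "float16"):
    """Load a faster-whisper model (hypothetical helper, assumed defaults)."""
    # Imported lazily so the rest of the sketch works without the package.
    from faster_whisper import WhisperModel
    return WhisperModel(name, device=device, compute_type=compute_type)


def transcribe_to_cues(model, audio_path: str) -> List[Cue]:
    """Run transcription and collect the timed segments as plain tuples."""
    # transcribe() returns a lazy generator of segments plus an info object;
    # iterating the generator is what actually runs the model.
    segments, _info = model.transcribe(audio_path)
    return [(seg.start, seg.end, seg.text.strip()) for seg in segments]
```

With the package installed and a GPU available, `transcribe_to_cues(load_model(), "audio.mp3")` would return a list of timed cues ready to be written out as subtitles.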

This is a fork of m1guelpf/whisper-subtitles that adds support for voice activity detection (VAD), selecting a language, using the language-specific models, and downloading the .vtt/.srt files directly from the result.
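The features listed above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the fork's implementation: `pick_model`, `transcribe_options`, `to_srt`, and `to_vtt` are hypothetical names, and the rule for choosing an English-only model is a guess at how language-specific selection might work. The `language` and `vad_filter` keyword arguments are real parameters of faster-whisper's `transcribe` method.

```python
from typing import Dict, Iterable, Optional, Tuple

Cue = Tuple[float, float, str]  # (start seconds, end seconds, text)

# Whisper publishes English-only ".en" variants for these sizes.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def pick_model(size: str, language: Optional[str]) -> str:
    """Prefer the English-only variant when one exists (assumed rule)."""
    if language == "en" and size in ENGLISH_ONLY_SIZES:
        return f"{size}.en"
    return size


def transcribe_options(language: Optional[str], use_vad: bool) -> Dict[str, object]:
    """Keyword arguments to pass on to faster-whisper's transcribe()."""
    return {"language": language, "vad_filter": use_vad}


def _timestamp(seconds: float, sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm (sep is ',' for SRT, '.' for VTT)."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def to_srt(cues: Iterable[Cue]) -> str:
    """Render cues as SubRip text: numbered blocks separated by blank lines."""
    blocks = [
        f"{i}\n{_timestamp(start, ',')} --> {_timestamp(end, ',')}\n{text}\n"
        for i, (start, end, text) in enumerate(cues, 1)
    ]
    return "\n".join(blocks)


def to_vtt(cues: Iterable[Cue]) -> str:
    """Render cues as WebVTT text: a WEBVTT header followed by cue blocks."""
    blocks = [
        f"{_timestamp(start, '.')} --> {_timestamp(end, '.')}\n{text}\n"
        for start, end, text in cues
    ]
    return "WEBVTT\n\n" + "\n".join(blocks)
```

The two formats differ mainly in the header line, the cue numbering, and the decimal separator in timestamps, which is why a single timestamp helper can serve both writers.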