nateraw / whisper-large-v3

Whisper is a general-purpose speech recognition model.

  • Public
  • 3.1K runs
  • GitHub
  • Paper
  • License

Input

Output

Run time and cost

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 1 seconds.

Readme

Model details

Whisper is a Transformer based encoder-decoder model, also referred to as a sequence-to-sequence model. It was trained on 1 million hours of weakly labeled audio and 4 million hours of pseudolabeled audio collected using Whisper large-v2.

See the full model card here