vaibhavs10 / incredibly-fast-whisper

whisper-large-v3, incredibly fast, powered by Hugging Face Transformers! 🤗

  • Public
  • 4.9M runs
  • L40S
  • GitHub
  • License

Input

Run this model in Node.js with one line of code:

npx create-replicate --model=vaibhavs10/incredibly-fast-whisper
Or set up a project from scratch:

npm install replicate

Then set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:
import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run vaibhavs10/incredibly-fast-whisper using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "vaibhavs10/incredibly-fast-whisper:3ab86df6c8f54c11309d4d1f930ac292bad43ace52d10c80d87eb258b3c9f79c",
  {
    input: {
      task: "transcribe",
      audio: "https://replicate.delivery/pbxt/Js2Fgx9MSOCzdTnzHQLJXj7abLp3JLIG3iqdsYXV24tHIdk8/OSR_uk_000_0050_8k.wav",
      language: "None",
      timestamp: "chunk",
      batch_size: 64,
      diarise_audio: false
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Output

{
  "text": " the little tales they tell are false the door was barred locked and bolted as well ripe pears are fit hours fly by much too soon. The room was crowded with a mild wab. The room was crowded with a wild mob. This strong arm shall shield your honour. She blushed when he gave her a white orchid The beetle droned in the hot June sun",
  "chunks": [
    {
      "text": " the little tales they tell are false the door was barred locked and bolted as well ripe pears are fit hours fly by much too soon. The room was crowded",
      "timestamp": [0, 29.72]
    },
    {
      "text": " with a mild wab. The room was crowded with a wild mob. This strong arm shall shield your",
      "timestamp": [29.72, 38.98]
    },
    {
      "text": " honour. She blushed when he gave her a white orchid The beetle droned in the hot June sun",
      "timestamp": [38.98, 48.52]
    }
  ]
}

This output was created using a different version of the model, vaibhavs10/incredibly-fast-whisper:37dfc0d6.
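The `chunks` array in the output above maps directly onto subtitle formats. A minimal sketch (plain Node.js, no dependencies; the `text` and `timestamp` field names match the sample output, and `chunksToSrt` is a hypothetical helper name) that turns chunks of this shape into SRT cues:

```javascript
// Convert fractional seconds into an SRT timecode (HH:MM:SS,mmm).
function toTimecode(seconds) {
  const ms = Math.round(seconds * 1000);
  const h = String(Math.floor(ms / 3600000)).padStart(2, "0");
  const m = String(Math.floor((ms % 3600000) / 60000)).padStart(2, "0");
  const s = String(Math.floor((ms % 60000) / 1000)).padStart(2, "0");
  const frac = String(ms % 1000).padStart(3, "0");
  return `${h}:${m}:${s},${frac}`;
}

// Turn an array of { text, timestamp: [start, end] } chunks into SRT text.
function chunksToSrt(chunks) {
  return chunks
    .map((chunk, i) => {
      const [start, end] = chunk.timestamp;
      return `${i + 1}\n${toTimecode(start)} --> ${toTimecode(end)}\n${chunk.text.trim()}\n`;
    })
    .join("\n");
}
```

Feeding `output.chunks` from the example above into `chunksToSrt` would yield a subtitle file you can write to disk with `fs.writeFileSync("out.srt", srt)`.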

Run time and cost

This model costs approximately $0.011 per run on Replicate (about 90 runs per $1), though this varies with your inputs. It is also open source, so you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 12 seconds. The predict time for this model varies significantly based on the inputs.
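A quick sanity check on how the figures above fit together (a sketch using only the approximate numbers quoted on this page):

```javascript
// Back-of-envelope check of the pricing quoted above.
const costPerRun = 0.011;                          // USD per run (approximate)
const runsPerDollar = Math.floor(1 / costPerRun);  // ~90 runs per $1
const typicalSeconds = 12;                         // typical prediction time
// Rough per-second hardware rate these numbers imply (not an official price):
const impliedCostPerSecond = costPerRun / typicalSeconds;
```

Since your own predict times will vary with input length and parameters, treat these as order-of-magnitude estimates rather than a billing formula.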

Readme

Incredibly Fast Whisper

Powered by 🤗 Transformers, Optimum & flash-attn

TL;DR: transcribe 150 minutes of audio in about 100 seconds with OpenAI's Whisper Large v3. Blazingly fast transcription is now a reality! ⚡️

| Optimisation type | Time to transcribe (150 min of audio) |
| --- | --- |
| Transformers (fp32) | ~31 min (31 min 1 s) |
| Transformers (fp16 + batching [24] + bettertransformer) | ~5 min (5 min 2 s) |
| Transformers (fp16 + batching [24] + Flash Attention 2) | ~2 min (1 min 38 s) |
| distil-whisper (fp16 + batching [24] + bettertransformer) | ~3 min (3 min 16 s) |
| distil-whisper (fp16 + batching [24] + Flash Attention 2) | ~1 min (1 min 18 s) |
| Faster Whisper (fp16 + beam_size [1]) | ~9 min (9 min 23 s) |
| Faster Whisper (8-bit + beam_size [1]) | ~8 min (8 min 15 s) |
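Another way to read the table is as a real-time factor: minutes of audio transcribed per minute of compute. A small sketch computing it for two of the rows above (the `realTimeFactor` helper is illustrative, not part of any API):

```javascript
// Real-time factor for 150 minutes of audio given a transcription time.
const audioMinutes = 150;

function realTimeFactor(transcribeSeconds) {
  return (audioMinutes * 60) / transcribeSeconds;
}

// fp32 baseline: 31 min 1 s
const fp32 = realTimeFactor(31 * 60 + 1);   // ≈ 4.8x real time
// fp16 + batching + Flash Attention 2: 1 min 38 s
const flash = realTimeFactor(1 * 60 + 38);  // ≈ 91.8x real time
```

In other words, the Flash Attention 2 configuration in the table is roughly 19x faster than the fp32 baseline for the same model.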