Whisper transcription plus speaker diarization
{ "audio": "https://replicate.delivery/pbxt/IZruuPAVCQh1lI25MIihRwFHN4MvjH7xcBTgnbXUDM1CAY7m/lex-levin-4min.mp3" }
Install Replicate's Node.js client library:

npm install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run meronym/speaker-transcription using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "meronym/speaker-transcription:9950ee297f0fdad8736adf74ada54f63cc5b5bdfd5b2187366910ed5baf1a7a1",
  {
    input: {
      audio: "https://replicate.delivery/pbxt/IZruuPAVCQh1lI25MIihRwFHN4MvjH7xcBTgnbXUDM1CAY7m/lex-levin-4min.mp3"
    }
  }
);
console.log(output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate's Python client library:

pip install replicate

Import the client:

import replicate
output = replicate.run(
    "meronym/speaker-transcription:9950ee297f0fdad8736adf74ada54f63cc5b5bdfd5b2187366910ed5baf1a7a1",
    input={
        "audio": "https://replicate.delivery/pbxt/IZruuPAVCQh1lI25MIihRwFHN4MvjH7xcBTgnbXUDM1CAY7m/lex-levin-4min.mp3"
    }
)
print(output)
To learn more, take a look at the guide on getting started with Python.
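For this model, the output is a link to a JSON file (output.json) containing the diarized transcript, as the example prediction below shows. The following is a minimal sketch of downloading and inspecting that file; it assumes the client returns the output as a plain URL string (some client versions may return a file-like object instead, in which case read it directly), and the variable names are illustrative.

import json
import urllib.request

import replicate

# Run the model; the output points at a JSON file (output.json) with the
# diarized transcript.
output = replicate.run(
    "meronym/speaker-transcription:9950ee297f0fdad8736adf74ada54f63cc5b5bdfd5b2187366910ed5baf1a7a1",
    input={
        "audio": "https://replicate.delivery/pbxt/IZruuPAVCQh1lI25MIihRwFHN4MvjH7xcBTgnbXUDM1CAY7m/lex-levin-4min.mp3"
    }
)

# Assumption: `output` is a plain URL string. If your client version returns a
# file-like object, read from it instead of fetching the URL.
with urllib.request.urlopen(output) as response:
    transcript = json.load(response)

# Pretty-print whatever structure the model produced; see the model's schema
# for the exact shape of output.json.
print(json.dumps(transcript, indent=2))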
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "9950ee297f0fdad8736adf74ada54f63cc5b5bdfd5b2187366910ed5baf1a7a1",
    "input": {
      "audio": "https://replicate.delivery/pbxt/IZruuPAVCQh1lI25MIihRwFHN4MvjH7xcBTgnbXUDM1CAY7m/lex-levin-4min.mp3"
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
{ "completed_at": "2023-04-01T19:22:04.444124Z", "created_at": "2023-04-01T19:20:58.876052Z", "data_removed": false, "error": null, "id": "tuaf755pz5cdjixeyhnb5znqja", "input": { "audio": "https://replicate.delivery/pbxt/IZruuPAVCQh1lI25MIihRwFHN4MvjH7xcBTgnbXUDM1CAY7m/lex-levin-4min.mp3" }, "logs": "pre-processing audio file...\ndiarizing audio file...\npost-processing diarization...\ntranscribing segments...\ntranscribing segment 0:00:00.497812 to 0:00:09.779063\ntranscribing segment 0:00:09.863438 to 0:03:34.962188", "metrics": { "predict_time": 65.262201, "total_time": 65.568072 }, "output": "https://replicate.delivery/pbxt/bQnqvRJCBRo8DNia8APcYivhjFSbel0DWZGlvjCiG5OusxWIA/output.json", "started_at": "2023-04-01T19:20:59.181923Z", "status": "succeeded", "urls": { "get": "https://api.replicate.com/v1/predictions/tuaf755pz5cdjixeyhnb5znqja", "cancel": "https://api.replicate.com/v1/predictions/tuaf755pz5cdjixeyhnb5znqja/cancel" }, "version": "12483517b558629508e03bed77e9c46c6f6e5756715af2b3a4f741f70be45575" }
Logs from this prediction:

pre-processing audio file...
diarizing audio file...
post-processing diarization...
transcribing segments...
transcribing segment 0:00:00.497812 to 0:00:09.779063
transcribing segment 0:00:09.863438 to 0:03:34.962188
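Because the response includes a prediction id, you can also fetch a prediction again later and pull its output file. Here is a minimal sketch using the Python client; it uses the id from the example response above purely for illustration, and assumes the prediction succeeded and produced a URL in its output field.

import json
import urllib.request

import replicate

# Look up a prediction by id (the id below comes from the example response
# above and is only illustrative; substitute your own).
prediction = replicate.predictions.get("tuaf755pz5cdjixeyhnb5znqja")

if prediction.status == "succeeded":
    # For this model, the prediction's output field is the URL of output.json.
    with urllib.request.urlopen(prediction.output) as response:
        transcript = json.load(response)
    print(json.dumps(transcript, indent=2))
else:
    print(f"prediction status: {prediction.status}")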
This model is cold. You'll get a fast response if the model is warm and already running, and a slower response if the model is cold and starting up.