turian / whisply

Transcribe, translate, annotate and subtitle audio and video files with OpenAI's Whisper ... fast!

  • Public
  • 27 runs
  • T4
  • GitHub
  • License

Run turian/whisply with an API

Use one of our client libraries to get started quickly. Clicking a library takes you to the Playground tab, where you can tweak inputs, see the results, and copy the corresponding code into your own project.
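For example, with the Replicate Python client (a minimal sketch: the input file and values are placeholders, and the client reads your REPLICATE_API_TOKEN from the environment):

```python
import replicate

# Minimal example: transcribe a local file with the default model.
# "interview.mp3" is a placeholder; any audio or video file works.
output = replicate.run(
    "turian/whisply",
    input={
        "audio_file": open("interview.mp3", "rb"),
        "model": "distil-large-v3",
        "language": "en",
    },
)

print(output)  # a URI pointing at the result (see the output schema below)
```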

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| audio_file | string | — | Audio file to transcribe |
| language | string | — | Language code (e.g., 'en', 'fr', 'de') |
| model | string (enum) | distil-large-v3 | Whisper model to use (see options below) |
| subtitle | boolean | False | Generate subtitles (.srt, .vtt) |
| sub_length | integer | 5 | Subtitle segment length in words (min: 1) |
| translate | boolean | False | Translate to English |
| annotate | boolean | False | Enable speaker annotation (requires HF token) |
| num_speakers | integer | — | Number of speakers to annotate (min: 2; auto-detection if None) |
| hf_token | string | — | HuggingFace access token for speaker annotation |
| verbose | boolean | False | Print text chunks during transcription |
| post_correction | string | — | Path to YAML file for post-correction |

Options for model: tiny, tiny-en, base, base-en, small, small-en, distil-small-en, medium, medium-en, distil-medium-en, large, large-v1, large-v2, distil-large-v2, large-v3, distil-large-v3, large-v3-turbo
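To illustrate how these fields combine, a subtitling-plus-diarization request might look like this (illustrative values only; the URL and token are placeholders):

```python
# Illustrative input payload for subtitle generation with speaker annotation.
input = {
    "audio_file": "https://example.com/panel-discussion.mp4",  # placeholder
    "model": "large-v3-turbo",
    "subtitle": True,       # write .srt and .vtt files
    "sub_length": 8,        # up to 8 words per subtitle segment (min: 1)
    "annotate": True,       # speaker annotation requires an HF token
    "num_speakers": 2,      # omit to auto-detect the speaker count
    "hf_token": "hf_...",   # placeholder HuggingFace access token
}
```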

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
```json
{
  "type": "string",
  "title": "Output",
  "format": "uri"
}
```
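Because the output is a single URI string, fetching the result needs no special handling. A sketch using only the standard library (the local filename, and the assumption that the output stringifies to a plain URL, are mine):

```python
import urllib.request

# 'output' is the URI returned by replicate.run in the example above.
# The remote file's type depends on the options you chose (transcript,
# .srt/.vtt subtitles, etc.), so the local name here is generic.
urllib.request.urlretrieve(str(output), "whisply_output")
```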