turian/whisply
Transcribe, translate, annotate and subtitle audio and video files with OpenAI's Whisper ... fast!
Run turian/whisply with an API
Use one of our client libraries to get started quickly. Clicking on a library will take you to the Playground tab where you can tweak different inputs, see the results, and copy the corresponding code to use in your own project.
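For instance, with the Python client (`pip install replicate`), a minimal transcription request looks like the sketch below. The version hash after the colon is a placeholder; copy the current one from the model's API tab. The client reads your API token from the `REPLICATE_API_TOKEN` environment variable.

```python
# Minimal sketch using the official Python client (pip install replicate).
# Assumes REPLICATE_API_TOKEN is set; the version hash after the colon is a
# placeholder -- copy the current one from the model's API tab.
import replicate

output = replicate.run(
    "turian/whisply:<version-hash>",
    input={
        "audio_file": "https://example.com/interview.mp3",  # a URL, or open("file.mp3", "rb")
        "model": "distil-large-v3",
        "language": "en",
    },
)
print(output)  # URI of the result file (see the output schema below)
```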
Input schema
The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.
Field | Type | Default value | Description |
---|---|---|---|
audio_file | string | | Audio file to transcribe |
language | string | | Language code (e.g., 'en', 'fr', 'de') |
model | string (enum) | distil-large-v3 | Whisper model to use. Options: tiny, tiny-en, base, base-en, small, small-en, distil-small-en, medium, medium-en, distil-medium-en, large, large-v1, large-v2, distil-large-v2, large-v3, distil-large-v3, large-v3-turbo |
subtitle | boolean | False | Generate subtitles (.srt, .vtt) |
sub_length | integer (min: 1) | 5 | Subtitle segment length in words |
translate | boolean | False | Translate to English |
annotate | boolean | False | Enable speaker annotation (requires HF token) |
num_speakers | integer (min: 2) | | Number of speakers to annotate (auto-detection if None) |
hf_token | string | | HuggingFace access token for speaker annotation |
verbose | boolean | False | Print text chunks during transcription |
post_correction | string | | Path to YAML file for post-correction |
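Putting the table together, a payload that asks for subtitles plus speaker annotation could look like the following sketch; the audio URL and Hugging Face token are placeholders, not real values.

```python
# Sketch of a fuller input payload: subtitles plus speaker annotation.
# The audio URL and hf_token value are placeholders.
input_payload = {
    "audio_file": "https://example.com/panel.wav",
    "model": "large-v3-turbo",
    "subtitle": True,       # also produce .srt/.vtt files
    "sub_length": 7,        # words per subtitle segment (minimum 1)
    "annotate": True,       # speaker annotation requires hf_token
    "num_speakers": 3,      # omit to let the model auto-detect (minimum 2)
    "hf_token": "hf_...",   # HuggingFace access token
}
```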
```json
{
"type": "object",
"title": "Input",
"required": [
"audio_file"
],
"properties": {
"model": {
"enum": [
"tiny",
"tiny-en",
"base",
"base-en",
"small",
"small-en",
"distil-small-en",
"medium",
"medium-en",
"distil-medium-en",
"large",
"large-v1",
"large-v2",
"distil-large-v2",
"large-v3",
"distil-large-v3",
"large-v3-turbo"
],
"type": "string",
"title": "model",
"description": "Whisper model to use",
"default": "distil-large-v3",
"x-order": 2
},
"verbose": {
"type": "boolean",
"title": "Verbose",
"default": false,
"x-order": 9,
"description": "Print text chunks during transcription"
},
"annotate": {
"type": "boolean",
"title": "Annotate",
"default": false,
"x-order": 6,
"description": "Enable speaker annotation (requires HF token)"
},
"hf_token": {
"type": "string",
"title": "Hf Token",
"x-order": 8,
"description": "HuggingFace Access token for speaker annotation"
},
"language": {
"type": "string",
"title": "Language",
"x-order": 1,
"description": "Language code (e.g., 'en', 'fr', 'de')"
},
"subtitle": {
"type": "boolean",
"title": "Subtitle",
"default": false,
"x-order": 3,
"description": "Generate subtitles (.srt, .vtt)"
},
"translate": {
"type": "boolean",
"title": "Translate",
"default": false,
"x-order": 5,
"description": "Translate to English"
},
"audio_file": {
"type": "string",
"title": "Audio File",
"format": "uri",
"x-order": 0,
"description": "Audio file to transcribe"
},
"sub_length": {
"type": "integer",
"title": "Sub Length",
"default": 5,
"minimum": 1,
"x-order": 4,
"description": "Subtitle segment length in words"
},
"num_speakers": {
"type": "integer",
"title": "Num Speakers",
"minimum": 2,
"x-order": 7,
"description": "Number of speakers to annotate (auto-detection if None)"
},
"post_correction": {
"type": "string",
"title": "Post Correction",
"format": "uri",
"x-order": 10,
"description": "Path to YAML file for post-correction"
}
}
}
```
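The block above is standard JSON Schema, so a payload can be checked locally before sending a request. A sketch with the `jsonschema` package, using a trimmed copy of the schema above; the payload values are illustrative only.

```python
# Sketch: validate a payload against (a trimmed copy of) the Input schema
# above before calling the API. Catches missing required fields and
# constraint violations such as sub_length < 1.
from jsonschema import ValidationError, validate

input_schema = {
    "type": "object",
    "required": ["audio_file"],
    "properties": {
        "audio_file": {"type": "string", "format": "uri"},
        "sub_length": {"type": "integer", "minimum": 1},
        "num_speakers": {"type": "integer", "minimum": 2},
    },
}

payload = {"audio_file": "https://example.com/talk.mp3", "sub_length": 0}

try:
    validate(instance=payload, schema=input_schema)
    print("payload is valid")
except ValidationError as err:
    print(f"invalid payload: {err.message}")  # "0 is less than the minimum of 1"
```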
Output schema
The shape of the response you’ll get when you run this model with an API.
```json
{
"type": "string",
"title": "Output",
"format": "uri"
}
```
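Because the output is a single URI, collecting the result is an ordinary HTTP download. A sketch with `requests` follows; the URL stands in for whatever your run returned, and since the file format depends on the options you chose, the filename is simply taken from the URL path.

```python
# Sketch: download the file behind the returned URI. The format depends on
# the chosen options (plain transcript, subtitles, ...), so the local
# filename is derived from the URL path.
import os
from urllib.parse import urlparse

import requests

output_url = "https://replicate.delivery/.../output"  # placeholder for the returned URI
resp = requests.get(output_url, timeout=60)
resp.raise_for_status()

filename = os.path.basename(urlparse(output_url).path) or "whisply_output"
with open(filename, "wb") as f:
    f.write(resp.content)
```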