audioscrape/whisperx:f2fa4577

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

| Field | Type | Default value (range) | Description |
|---|---|---|---|
| audio | string | | Audio file (supports up to 4 hours) |
| min_speakers | integer | (min 1, max 20) | Minimum number of speakers (None = auto-detect) |
| max_speakers | integer | (min 1, max 20) | Maximum number of speakers (None = auto-detect) |
| language | string | | Language code (e.g., 'en'). Leave empty for auto-detect. |
| huggingface_token | string | | HuggingFace token for speaker diarization (required) |
| batch_size | integer | 8 (min 1, max 32) | Batch size for transcription (lower for long audio) |
| enable_diarization | boolean | True | Enable speaker diarization |
| return_word_timestamps | boolean | True | Return word-level timestamps |
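
The fields above can be assembled and range-checked before a request is sent. The following is a minimal sketch, not part of the model's API: `build_input` and `INT_RANGES` are hypothetical names, `audio` is assumed to be passed as a string such as a URL, and the check that `min_speakers` does not exceed `max_speakers` is an assumption the schema does not state.

```python
from typing import Any, Optional

# Integer ranges copied from the input schema above.
INT_RANGES = {
    "min_speakers": (1, 20),
    "max_speakers": (1, 20),
    "batch_size": (1, 32),
}


def build_input(
    audio: str,
    huggingface_token: str,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
    language: Optional[str] = None,
    batch_size: int = 8,
    enable_diarization: bool = True,
    return_word_timestamps: bool = True,
) -> dict[str, Any]:
    """Build an input payload, omitting optional fields that were left unset."""
    payload: dict[str, Any] = {
        "audio": audio,
        "huggingface_token": huggingface_token,
        "batch_size": batch_size,
        "enable_diarization": enable_diarization,
        "return_word_timestamps": return_word_timestamps,
    }
    # Unset fields are left out so the service can auto-detect them.
    optional = {
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
        "language": language,
    }
    for name, value in optional.items():
        if value is not None:
            payload[name] = value

    # Enforce the Min/Max bounds listed in the schema.
    for name, (low, high) in INT_RANGES.items():
        value = payload.get(name)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")

    # Assumption (not stated in the schema): min must not exceed max.
    if (
        min_speakers is not None
        and max_speakers is not None
        and min_speakers > max_speakers
    ):
        raise ValueError("min_speakers must not exceed max_speakers")
    return payload
```

The returned dict is what would be passed as the input payload to whichever API client you use. For long recordings, the schema's own hint suggests passing a smaller `batch_size`.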

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{"additionalProperties": true, "title": "Output", "type": "object"}
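
Because the schema above only guarantees an object with arbitrary properties, client code cannot rely on any particular key being present. One cautious approach is to inspect the response before reading from it. The helper below is a hypothetical sketch (`describe_output` is not part of any API) that reports the top-level keys and their Python types.

```python
from typing import Any


def describe_output(output: Any) -> dict[str, str]:
    """Map each top-level key of the response to the name of its Python type."""
    if not isinstance(output, dict):
        raise TypeError(f"expected a JSON object, got {type(output).__name__}")
    return {key: type(value).__name__ for key, value in output.items()}
```

For example, `describe_output({"a": [1], "b": "x"})` returns `{"a": "list", "b": "str"}`. The keys here are placeholders, not the model's actual output fields.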