romanfurman6/whisperx-multi-chunk:d1ad6119
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
Field | Type | Default value | Description |
---|---|---|---|
audio_urls | array | | Array of public audio URLs to process |
total_duration_seconds | number | | Total duration of the complete audio in seconds |
chunk_size_seconds | number | | Duration of each chunk in seconds (used for timestamp calculation). The last chunk can be shorter; its duration is calculated from the total duration and the number of chunks. |
language | string | | ISO code of the language spoken in the audio; specify None to perform language detection |
language_detection_min_prob | number | 0.7 | Minimum probability for recursive language detection |
language_detection_max_tries | integer | 5 | Maximum retries for recursive language detection |
initial_prompt | string | | Optional text prompt for the first window |
batch_size | integer | 32 | Parallelization of input audio transcription |
temperature | number | 0.2 | Temperature to use for sampling |
vad_onset | number | 0.5 | VAD onset threshold |
vad_offset | number | 0.363 | VAD offset threshold |
align_output | boolean | False | Whether to align output for word-level timestamps |
diarization | boolean | False | Whether to perform diarization |
huggingface_access_token | string | | HuggingFace token for diarization |
min_speakers | integer | | Minimum number of speakers if diarization is activated |
max_speakers | integer | | Maximum number of speakers if diarization is activated |
debug | boolean | True | Print debug information |
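
The `chunk_size_seconds` description implies a simple calculation for where each chunk starts and how long the last one lasts. The following is a minimal sketch of that arithmetic, not the model's actual code. It assumes the chunks are contiguous and listed in order in `audio_urls`. The helper name `chunk_windows` and the example URLs are hypothetical.

```python
def chunk_windows(total_duration_seconds, chunk_size_seconds, num_chunks):
    """Return (start, duration) pairs in seconds, one per chunk.

    Assumption: chunks are contiguous and ordered, every chunk except the
    last lasts exactly chunk_size_seconds, and the last chunk covers
    whatever remains of the total duration.
    """
    if num_chunks < 1:
        raise ValueError("at least one chunk is required")
    windows = []
    for index in range(num_chunks):
        start = index * chunk_size_seconds
        if index < num_chunks - 1:
            duration = chunk_size_seconds
        else:
            # Last chunk: remainder of the total duration.
            duration = total_duration_seconds - start
        windows.append((start, duration))
    return windows


# Hypothetical input: 125 s of audio split into three chunks of up to 60 s.
model_input = {
    "audio_urls": [
        "https://example.com/part-0.mp3",  # placeholder URLs
        "https://example.com/part-1.mp3",
        "https://example.com/part-2.mp3",
    ],
    "total_duration_seconds": 125.0,
    "chunk_size_seconds": 60.0,
}

windows = chunk_windows(
    model_input["total_duration_seconds"],
    model_input["chunk_size_seconds"],
    num_chunks=len(model_input["audio_urls"]),
)
```

Under these assumptions, each chunk's start value would be the offset added to timestamps found inside that chunk. Fields omitted from `model_input` fall back to the defaults listed in the table.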
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'properties': {'detected_language': {'title': 'Detected Language',
                                      'type': 'string'},
                'processing_time': {'title': 'Processing Time',
                                    'type': 'number'},
                'segments': {'title': 'Segments'},
                'total_chunks': {'title': 'Total Chunks', 'type': 'integer'}},
 'required': ['segments',
              'detected_language',
              'total_chunks',
              'processing_time'],
 'title': 'Output',
 'type': 'object'}
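
A client may want to confirm that a response matches this schema before using it. The sketch below checks the four required fields and their declared types. The function name `check_output` and the sample values are illustrative assumptions. Because the schema declares no type for `segments`, the sketch accepts any value there.

```python
from typing import Any

# Required fields and the Python types matching the schema's declared types.
# The schema gives no type for "segments", so any value is accepted for it.
REQUIRED_FIELDS = {
    "segments": object,
    "detected_language": str,
    "total_chunks": int,
    "processing_time": (int, float),
}


def check_output(output: dict) -> dict:
    """Return output unchanged, or raise ValueError if it violates the schema."""
    for name, expected in REQUIRED_FIELDS.items():
        if name not in output:
            raise ValueError(f"missing required field: {name}")
        value: Any = output[name]
        # bool is a subclass of int in Python, so reject it for typed fields.
        if expected is not object and isinstance(value, bool):
            raise ValueError(f"field {name} must not be a boolean")
        if not isinstance(value, expected):
            raise ValueError(
                f"field {name} has unexpected type {type(value).__name__}"
            )
    return output


# Made-up sample response, used only to exercise the check.
sample = {
    "segments": [],
    "detected_language": "en",
    "total_chunks": 3,
    "processing_time": 12.5,
}
checked = check_output(sample)
```

The check leaves the structure of `segments` alone, since the schema does not describe it.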