Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.
Field | Type | Default value | Description |
---|---|---|---|
audio | string | | Audio file |
batch_size | integer | 16 | Parallelization of input audio transcription |
align_output | boolean | True | Enable if you need word-level timing and not just batched transcription |
only_text | boolean | False | Set if you only want the text returned; otherwise, segment metadata is returned as well |
need_separate_speaker | boolean | False | Set to True, and set min_speakers and max_speakers, if you want to separate speakers |
huggingface_token | string | | Set your Hugging Face access token if you want to separate speakers |
min_speakers | integer | 1 | Minimum number of speakers in the input audio |
max_speakers | integer | 3 | Maximum number of speakers in the input audio |
debug | boolean | False | Print out memory usage information |
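
The defaults above can be sketched as a small helper that builds an input payload. This is only an illustration: `build_input` and `DEFAULTS` are hypothetical names, the `audio` value is assumed to be a URL or path string, and the speaker checks are client-side assumptions drawn from the field descriptions, not documented server behavior.

```python
from typing import Any

# Defaults copied from the input schema table. Fields with no listed
# default (audio, huggingface_token) are intentionally left out.
DEFAULTS: dict[str, Any] = {
    "batch_size": 16,
    "align_output": True,
    "only_text": False,
    "need_separate_speaker": False,
    "min_speakers": 1,
    "max_speakers": 3,
    "debug": False,
}


def build_input(audio: str, **overrides: Any) -> dict[str, Any]:
    """Merge caller overrides onto the schema defaults and sanity-check them."""
    unknown = set(overrides) - set(DEFAULTS) - {"huggingface_token"}
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")

    payload = {"audio": audio, **DEFAULTS, **overrides}

    if payload["need_separate_speaker"]:
        # Assumption: the table asks for a token and min/max speakers
        # when speaker separation is requested, so check them up front.
        if not payload.get("huggingface_token"):
            raise ValueError("huggingface_token is needed to separate speakers")
        if payload["min_speakers"] > payload["max_speakers"]:
            raise ValueError("min_speakers must not exceed max_speakers")

    return payload
```

The returned dictionary can then be passed as the input of whichever API client you use to run the model; fields you do not override keep the defaults listed in the table.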
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'title': 'Output', 'type': 'string'}
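
The schema only states that the output is a string. A minimal sketch for handling it is shown below, under the assumption (not confirmed by the schema) that the string may carry JSON-encoded segment metadata when `only_text` is False; `parse_output` is a hypothetical helper name.

```python
import json
from typing import Any


def parse_output(output: str) -> Any:
    """Return decoded segments if the string is a JSON object or array,
    otherwise return the raw text unchanged."""
    try:
        decoded = json.loads(output)
    except json.JSONDecodeError:
        return output
    # Only treat structured JSON as metadata; a transcript that happens
    # to look like a bare number or literal stays a plain string.
    return decoded if isinstance(decoded, (dict, list)) else output
```

If the model returns metadata in some other encoding, the helper falls back to the raw string, so the caller never loses the original output.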