wordscenes/whisper-stable-ts:0161eba2
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
Field | Type | Default value | Description |
---|---|---|---|
audio_path | string | | Audio to transcribe |
language | string | en | Language to transcribe |
demucs | boolean | False | Whether to preprocess the audio track with Demucs to isolate vocals/remove noise. |
vad | boolean | True | Whether to use Silero VAD to generate timestamp suppression mask. |
beam_size | integer | 5 | Number of beams in beam search, only applicable when temperature is zero. |
best_of | integer | 5 | Number of candidates when sampling with non-zero temperature. |
regroup | boolean | True | Whether to regroup all words into segments with more natural boundaries. |
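The input fields above can be assembled into a request payload along the following lines. This is a minimal sketch, not part of the model's own code: `build_input`, `DEFAULTS`, and `EXPECTED_TYPES` are hypothetical names introduced here for illustration, and the example URL is a placeholder. The defaults and types are copied from the table; how `audio_path` is resolved (URL or uploaded file) is not stated on this page.

```python
from typing import Any, Dict

# Default values copied from the input schema table.
# audio_path has no default and must always be supplied.
DEFAULTS: Dict[str, Any] = {
    "language": "en",
    "demucs": False,
    "vad": True,
    "beam_size": 5,
    "best_of": 5,
    "regroup": True,
}

# Python types corresponding to the schema's string/boolean/integer columns.
EXPECTED_TYPES: Dict[str, type] = {
    "audio_path": str,
    "language": str,
    "demucs": bool,
    "vad": bool,
    "beam_size": int,
    "best_of": int,
    "regroup": bool,
}


def build_input(audio_path: str, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge overrides onto the defaults and type-check."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")

    payload = {"audio_path": audio_path, **DEFAULTS, **overrides}
    for name, value in payload.items():
        expected = EXPECTED_TYPES[name]
        # bool is a subclass of int in Python, so reject it for integer fields.
        if expected is int and isinstance(value, bool):
            raise TypeError(f"{name} must be an integer, got a boolean")
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be of type {expected.__name__}")
    return payload


# Placeholder URL; override only the fields that differ from the defaults.
payload = build_input("https://example.com/speech.mp3", language="de", demucs=True)
print(payload)
```

Assuming the `replicate` Python client is installed and an API token is configured, a payload like this could be passed as `replicate.run("wordscenes/whisper-stable-ts:<full version id>", input=payload)`. The full version id is not shown on this page (only the prefix `0161eba2`), so it is left as a placeholder.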
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'title': 'Output', 'type': 'string'}