victor-upmeet/whisperx:0e825f91

Input

*file

Audio file

string
ISO code of the language spoken in the audio. Specify None to perform language detection.

string
Optional text to provide as a prompt for the first window

integer

Parallelization of input audio transcription

Default: 64

number

Temperature to use for sampling

Default: 0

number

VAD onset

Default: 0.5

number

VAD offset

Default: 0.363

boolean

Aligns Whisper output to get accurate word-level timestamps

Default: false

boolean

Assign speaker ID labels

Default: false

string
To enable diarization, enter a HuggingFace token with read access. You must first accept the user agreement for the models specified in the README.

integer

Minimum number of speakers if diarization is activated (leave blank if unknown)

integer

Maximum number of speakers if diarization is activated (leave blank if unknown)

boolean

Print out compute/inference times and memory usage information

Default: false
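
The inputs above can be collected into a single request payload. The following is a minimal sketch, not the model's official client code: the field names (`audio_file`, `batch_size`, `vad_onset`, and so on) are assumptions, because this page lists only each input's type, description, and default. The helper `build_input` is hypothetical. It also assumes that omitting `language` is equivalent to specifying None, and that a token is required whenever diarization is switched on.

```python
from typing import Any, Dict, Optional

# Defaults copied from the form above. The key names are ASSUMED;
# the page shows only types, descriptions, and default values.
DEFAULTS: Dict[str, Any] = {
    "batch_size": 64,       # parallelization of input audio transcription
    "temperature": 0,       # temperature to use for sampling
    "vad_onset": 0.5,       # VAD onset
    "vad_offset": 0.363,    # VAD offset
    "align_output": False,  # word-level timestamp alignment
    "diarization": False,   # assign speaker ID labels
    "debug": False,         # print timing and memory usage information
}


def build_input(
    audio_file: str,
    language: Optional[str] = None,
    initial_prompt: Optional[str] = None,
    huggingface_access_token: Optional[str] = None,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: merge caller overrides with the listed defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")

    payload: Dict[str, Any] = {**DEFAULTS, **overrides, "audio_file": audio_file}

    # Inputs without a default are sent only when the caller supplies them.
    # Leaving out "language" is assumed to trigger language detection.
    optional = {
        "language": language,
        "initial_prompt": initial_prompt,
        "huggingface_access_token": huggingface_access_token,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})

    # Diarization needs a token, per the description of the token field.
    if payload["diarization"] and not huggingface_access_token:
        raise ValueError("diarization requires a HuggingFace read token")

    # Speaker bounds are optional, but must be consistent when both are given.
    if (
        min_speakers is not None
        and max_speakers is not None
        and min_speakers > max_speakers
    ):
        raise ValueError("min_speakers must not exceed max_speakers")

    return payload
```

Calling `build_input("meeting.wav", language="en", align_output=True)` would return the defaults with those two values set. The resulting dictionary is what a client for the hosting service would submit as the prediction input.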

Output
