
thomasmol/whisper-diarization:c558e6f7

Input

string

Either provide: a Base64-encoded audio file
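
A minimal sketch of producing such a Base64 string from a local audio file, using only the Python standard library. The helper names `encode_audio_bytes` and `encode_audio_file` are hypothetical and not part of the model's API; whether the model expects plain Base64 or a data-URI prefix is not stated on this page, so plain Base64 is assumed here.

```python
import base64
from pathlib import Path


def encode_audio_bytes(raw: bytes) -> str:
    """Return the Base64 representation of raw audio bytes as an ASCII string."""
    return base64.b64encode(raw).decode("ascii")


def encode_audio_file(path: str) -> str:
    """Read an audio file from disk and Base64-encode its contents.

    Hypothetical helper: the resulting string is what would be pasted into
    the Base64 input field described above.
    """
    return encode_audio_bytes(Path(path).read_bytes())
```

For large recordings the encoded string is roughly a third bigger than the file itself, so the URL or file-upload inputs may be more practical.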

string

Or provide: a direct audio file URL

file

Or an audio file

boolean

Group segments from the same speaker that are less than 2 seconds apart

Default: true
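
The grouping behavior described above can be sketched as follows. This is an illustration of the idea, not the model's actual implementation: the segment fields (`start`, `end`, `speaker`, `text`) and the function name `group_segments` are assumptions made for this example.

```python
from typing import Dict, List


def group_segments(segments: List[Dict], max_gap: float = 2.0) -> List[Dict]:
    """Merge consecutive segments of the same speaker separated by less than max_gap seconds.

    Assumes segments are sorted by start time and each one has the
    hypothetical keys 'start', 'end', 'speaker' and 'text'.
    """
    grouped: List[Dict] = []
    for seg in segments:
        prev = grouped[-1] if grouped else None
        if (
            prev is not None
            and prev["speaker"] == seg["speaker"]
            and seg["start"] - prev["end"] < max_gap
        ):
            # Same speaker and a short pause: extend the previous segment.
            prev["end"] = seg["end"]
            prev["text"] = prev["text"] + " " + seg["text"]
        else:
            # New speaker or a long pause: start a new segment (copy to avoid mutating input).
            grouped.append(dict(seg))
    return grouped
```

With grouping enabled, a transcript tends to contain fewer, longer turns per speaker; disabling it keeps the finer-grained segments.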

integer
(minimum: 1, maximum: 50)

Number of speakers. Leave empty to auto-detect.

string

Language of the spoken words, as a language code such as 'en'. Leave empty to auto-detect the language.

string

Prompt: provide names, acronyms, and loanwords as a list. Use punctuation for best accuracy.

Default: "AI, Thomas, Audiogest."

integer
(minimum: 0)

Offset in seconds, used for chunked inputs

Default: 0
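
One plausible reading of this offset, sketched below, is that when a long recording is split into chunks and each chunk is transcribed separately, the offset shifts that chunk's timestamps back onto the timeline of the full recording. This page does not confirm that behavior; the function name `apply_offset` and the segment keys `start` and `end` are assumptions for illustration.

```python
from typing import Dict, List


def apply_offset(segments: List[Dict], offset_seconds: int) -> List[Dict]:
    """Return copies of the segments with 'start' and 'end' shifted by offset_seconds.

    Hypothetical helper: mirrors the form's constraint that the offset is a
    non-negative integer number of seconds.
    """
    if offset_seconds < 0:
        raise ValueError("offset_seconds must be >= 0")
    return [
        {**seg, "start": seg["start"] + offset_seconds, "end": seg["end"] + offset_seconds}
        for seg in segments
    ]
```

For example, a chunk that begins 600 seconds into the original recording would be processed with an offset of 600, so a segment at 1.5 seconds within the chunk lands at 601.5 seconds overall.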

Output
