
nicknaskida /incredibly-fast-whisper:968947af

Input

file (required)

Audio file

string

Task to perform: transcribe the audio, or translate it to another language.

Default: "transcribe"

string

Language spoken in the audio; specify "None" to perform automatic language detection.

Default: "None"

integer

Number of parallel batches to compute. Reduce this value if you run into out-of-memory (OOM) errors.

Default: 24

string

Whisper supports both chunk-level and word-level timestamps.

Default: "chunk"

boolean

Use Pyannote.audio to diarise the audio clips. You must also provide an hf_token below.

Default: false

string
Provide a Hugging Face token (from hf.co/settings/token) so Pyannote.audio can diarise the audio clips. You must first accept the terms at 'https://huggingface.co/pyannote/speaker-diarization-3.1' and 'https://huggingface.co/pyannote/segmentation-3.0'.

integer
(minimum: 1)

Exact number of speakers present in the audio file. Useful when the number of participants in the conversation is known in advance. Must be at least 1. Cannot be used together with min_speakers or max_speakers. (default: None)

integer
(minimum: 1)

Minimum number of speakers the system should consider in the audio file. Must be at least 1 and must not exceed max_speakers. Cannot be used together with num_speakers. (default: None)

integer
(minimum: 1)

Maximum number of speakers the system should consider in the audio file. Must be at least 1 and must not be less than min_speakers. Cannot be used together with num_speakers. (default: None)
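The parameters above can be assembled into a prediction input for the Replicate Python client. A minimal sketch follows; note that this page lists the parameter types but not their field names, so the names used below ("audio", "task", "language", "batch_size", "timestamp", "diarise_audio", plus the hf_token and speaker-count fields named in the descriptions) are assumptions inferred from the descriptions, not confirmed by the page.

```python
def build_input(audio_url,
                task="transcribe",        # "transcribe" or "translate"
                language="None",          # "None" enables language detection
                batch_size=24,            # reduce on OOM errors
                timestamp="chunk",        # "chunk" or "word"
                diarise_audio=False,
                hf_token=None,
                num_speakers=None,
                min_speakers=None,
                max_speakers=None):
    """Assemble and validate the input payload for a prediction.

    Field names are assumed from the parameter descriptions on this page.
    """
    if diarise_audio and not hf_token:
        raise ValueError("diarisation requires an hf_token")
    if num_speakers is not None and (min_speakers is not None
                                     or max_speakers is not None):
        raise ValueError(
            "num_speakers cannot be combined with min_speakers/max_speakers")
    for name, value in (("num_speakers", num_speakers),
                        ("min_speakers", min_speakers),
                        ("max_speakers", max_speakers)):
        if value is not None and value < 1:
            raise ValueError(f"{name} must be at least 1")
    if (min_speakers is not None and max_speakers is not None
            and min_speakers > max_speakers):
        raise ValueError("min_speakers must not exceed max_speakers")

    payload = {
        "audio": audio_url,
        "task": task,
        "language": language,
        "batch_size": batch_size,
        "timestamp": timestamp,
        "diarise_audio": diarise_audio,
    }
    # Optional fields are omitted entirely when unset (default: None).
    for name, value in (("hf_token", hf_token),
                        ("num_speakers", num_speakers),
                        ("min_speakers", min_speakers),
                        ("max_speakers", max_speakers)):
        if value is not None:
            payload[name] = value
    return payload
```

With a payload built this way, the model could then be invoked via the Replicate client (requires REPLICATE_API_TOKEN in the environment), e.g. `replicate.run("nicknaskida/incredibly-fast-whisper:968947af", input=build_input("https://example.com/audio.mp3"))`.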

Output
