nicknaskida/incredibly-fast-whisper:9613f673 | Run with an API on Replicate


These are the fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

Field	Type	Default value	Description
audio	string		Audio file
task	None	transcribe	Task to perform: transcribe or translate to another language.
language	None	None	Language spoken in the audio. Specify 'None' to perform language detection.
batch_size	integer	24	Number of parallel batches you want to compute. Reduce this if you run into out-of-memory (OOM) errors.
timestamp	None	chunk	Whisper supports both chunk-level and word-level timestamps.
diarise_audio	boolean	False	Use Pyannote.audio to diarise the audio clips. You will need to provide hf_token below too.
num_speakers	integer	None (min: 1)	Exact number of speakers present in the audio file. Useful when the exact number of participants in the conversation is known. Must be at least 1. Cannot be used together with min_speakers or max_speakers.
min_speakers	integer	None (min: 1)	Minimum number of speakers the system should consider in the audio file. Must be at least 1. Cannot be used together with num_speakers, and must not be greater than max_speakers.
max_speakers	integer	None (min: 1)	Maximum number of speakers the system should consider in the audio file. Must be at least 1. Cannot be used together with num_speakers, and must not be less than min_speakers.
hf_token	string		Hugging Face token (see hf.co/settings/token) that Pyannote.audio needs to diarise the audio clips. You must first agree to the terms at 'https://huggingface.co/pyannote/speaker-diarization-3.1' and 'https://huggingface.co/pyannote/segmentation-3.0'.
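
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model or of any Replicate library: `build_input` is a hypothetical helper, the example audio URL and token are placeholders, and passing the audio as a URL string is an assumption. It only applies the rules stated in the field descriptions and leaves unset fields out so that the documented defaults apply.

```python
from typing import Any, Dict, Optional


def build_input(
    audio: str,
    task: str = "transcribe",
    language: Optional[str] = None,
    batch_size: int = 24,
    timestamp: str = "chunk",
    diarise_audio: bool = False,
    num_speakers: Optional[int] = None,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
    hf_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate the documented constraints and
    return a dict of input fields for the model."""
    # Diarisation needs a Hugging Face token, per the diarise_audio description.
    if diarise_audio and not hf_token:
        raise ValueError("diarise_audio=True requires hf_token")

    # Each speaker count, when given, must be at least 1.
    counts = {
        "num_speakers": num_speakers,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
    }
    for name, value in counts.items():
        if value is not None and value < 1:
            raise ValueError(f"{name} must be at least 1")

    # num_speakers excludes min_speakers and max_speakers.
    if num_speakers is not None and (
        min_speakers is not None or max_speakers is not None
    ):
        raise ValueError(
            "num_speakers cannot be combined with min_speakers or max_speakers"
        )

    # min_speakers must not exceed max_speakers.
    if (
        min_speakers is not None
        and max_speakers is not None
        and min_speakers > max_speakers
    ):
        raise ValueError("min_speakers cannot be greater than max_speakers")

    fields = {
        "audio": audio,
        "task": task,
        "language": language,
        "batch_size": batch_size,
        "timestamp": timestamp,
        "diarise_audio": diarise_audio,
        "hf_token": hf_token,
        **counts,
    }
    # Drop unset fields so the service falls back to its own defaults.
    return {key: value for key, value in fields.items() if value is not None}


# Placeholder values for illustration only.
payload = build_input(
    "https://example.com/meeting.wav",
    diarise_audio=True,
    hf_token="hf_placeholder",
    min_speakers=2,
    max_speakers=4,
)
```

The resulting dict is meant to be supplied as the input of an API call for this model version. The full version identifier is not shown on this page, so the call itself is left out.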

The shape of the response you’ll get when you run this model with an API.

Schema

{'title': 'Output'}