hnesk/whisper-wordtimestamps:4a60104c | Run with an API on Replicate

You're looking at a specific version of this model.

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.

Field	Type	Default value	Description
audio	string		Audio file
model	None	base	Choose a Whisper model.
language	None		language spoken in the audio; specify None to perform language detection
temperature	number	0	temperature to use for sampling
patience	number		optional patience value to use in beam decoding, as in https://arxiv.org/abs/2204.05424; the default (1.0) is equivalent to conventional beam search
suppress_tokens	string	-1	comma-separated list of token ids to suppress during sampling; '-1' will suppress most special characters except common punctuation
initial_prompt	string		optional text to provide as a prompt for the first window.
condition_on_previous_text	boolean	True	if True, provide the previous output of the model as a prompt for the next window; disabling may make the text inconsistent across windows, but the model becomes less prone to getting stuck in a failure loop
temperature_increment_on_fallback	number	0.2	amount by which the temperature is increased on fallback, when the decoding fails to meet either of the thresholds below
compression_ratio_threshold	number	2.4	if the gzip compression ratio is higher than this value, treat the decoding as failed
logprob_threshold	number	-1	if the average log probability is lower than this value, treat the decoding as failed
no_speech_threshold	number	0.6	if the probability of the <|nospeech|> token is higher than this value AND the decoding has failed due to `logprob_threshold`, consider the segment as silence
word_timestamps	boolean	False	Extract word-level timestamps using the cross-attention pattern and dynamic time warping, and include the timestamps for each word in each segment.
prepend_punctuations	string	"'“¿([{-	If word_timestamps is True, merge these punctuation symbols with the next word
append_punctuations	string	"'.。,，!！?？:：”)]}、	If word_timestamps is True, merge these punctuation symbols with the previous word
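
As a rough illustration of how the fields above combine into a request, the sketch below builds an input payload that contains only the audio reference plus any overrides, leaving every other field to its server-side default. The helper name `build_input`, the `KNOWN_FIELDS` set, and the example URL are hypothetical and not part of the model's API; the sketch also assumes `audio` is passed as a URL string.

```python
from typing import Any, Dict

# Field names copied from the input schema table (everything except "audio").
KNOWN_FIELDS = {
    "model", "language", "temperature", "patience", "suppress_tokens",
    "initial_prompt", "condition_on_previous_text",
    "temperature_increment_on_fallback", "compression_ratio_threshold",
    "logprob_threshold", "no_speech_threshold", "word_timestamps",
    "prepend_punctuations", "append_punctuations",
}


def build_input(audio: str, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: return an input payload for this model.

    Only explicitly overridden fields are included, so omitted fields
    fall back to the defaults listed in the table.
    """
    unknown = set(overrides) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    payload: Dict[str, Any] = {"audio": audio}
    payload.update(overrides)
    return payload


# Example: request word-level timestamps, keep all other defaults.
# The URL is a placeholder, not a real audio file.
payload = build_input("https://example.com/speech.wav", word_timestamps=True)
```

The resulting dictionary is the kind of value you would pass as the model input through whichever Replicate client you use; rejecting unknown field names up front catches typos such as `word_timestamp` before a request is sent.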

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema

{'properties': {'detected_language': {'title': 'Detected Language',
                                      'type': 'string'},
                'segments': {'title': 'Segments'},
                'transcription': {'title': 'Transcription', 'type': 'string'}},
 'required': ['detected_language', 'transcription'],
 'title': 'ModelOutput',
 'type': 'object'}
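
A minimal sketch of checking a response against the schema above: `detected_language` and `transcription` are required strings, while `segments` is optional and has no declared type, so it is passed through untouched. The function name `parse_output` is hypothetical, and the sketch assumes the response has already been decoded into a Python dictionary.

```python
from typing import Any, Dict

REQUIRED_STRING_FIELDS = ("detected_language", "transcription")


def parse_output(output: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: validate a ModelOutput-shaped dictionary."""
    missing = [key for key in REQUIRED_STRING_FIELDS if key not in output]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    for key in REQUIRED_STRING_FIELDS:
        if not isinstance(output[key], str):
            raise TypeError(f"{key} must be a string")
    return {
        "detected_language": output["detected_language"],
        "transcription": output["transcription"],
        # The schema does not declare a type for segments, so it is
        # returned as-is, or None when absent.
        "segments": output.get("segments"),
    }
```

Because the schema leaves `segments` untyped, any code that reads word-level timestamps from it should inspect an actual response first instead of assuming a particular structure.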