openai/whisper
Convert speech in audio to text
Run openai/whisper with an API
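For example, the model can be called from Python with the Replicate client library. The snippet below is a minimal sketch, assuming the `replicate` package is installed, a `REPLICATE_API_TOKEN` is set in the environment, and a local file named `speech.mp3`; the input keys mirror the schema listed below, and in practice you may need to pin a specific model version (`openai/whisper:<version>`).

```python
# Minimal sketch: run openai/whisper on Replicate from Python.
# Assumes the `replicate` package, a REPLICATE_API_TOKEN environment variable,
# and a local audio file "speech.mp3"; input keys follow the schema below.
import replicate

output = replicate.run(
    "openai/whisper",  # may need a pinned version: "openai/whisper:<version>"
    input={
        "audio": open("speech.mp3", "rb"),  # audio file to transcribe
        "transcription": "plain text",      # format for the transcription
        "translate": False,                 # translate the text to English
        "language": "auto",                 # automatic language detection
        "temperature": 0,                   # sampling temperature
    },
)
print(output)
```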
Input schema
audio: Audio file
language: Language spoken in the audio; specify 'auto' for automatic language detection
- Default
- "auto"
patience: Optional patience value to use in beam decoding, as in https://arxiv.org/abs/2204.05424; the default (1.0) is equivalent to conventional beam search
translate: Translate the text to English when set to True
temperature: Temperature to use for sampling
transcription: Choose the format for the transcription
- Default
- "plain text"
initial_prompt: Optional text to provide as a prompt for the first window
suppress_tokens: Comma-separated list of token IDs to suppress during sampling; '-1' will suppress most special characters except common punctuation
- Default
- "-1"
logprob_threshold: If the average log probability is lower than this value, treat the decoding as failed
- Default
- -1
no_speech_threshold: If the probability of the <|nospeech|> token is higher than this value AND the decoding has failed due to `logprob_threshold`, consider the segment as silence
- Default
- 0.6
condition_on_previous_text: If True, provide the previous output of the model as a prompt for the next window; disabling this may make the text inconsistent across windows, but the model becomes less prone to getting stuck in a failure loop
- Default
- true
compression_ratio_threshold: If the gzip compression ratio is higher than this value, treat the decoding as failed
- Default
- 2.4
temperature_increment_on_fallback: Temperature to increase by when falling back because the decoding fails to meet either of the thresholds above (`compression_ratio_threshold` or `logprob_threshold`); see the sketch below
- Default
- 0.2
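To make the interaction between the fallback parameters concrete, here is an illustrative sketch (not the model's actual source) of the retry loop they control: a window is re-decoded at progressively higher temperatures until the result passes both the compression-ratio and average-log-probability checks. The `decode_window` callable is a hypothetical stand-in for one decoding pass over an audio window, assumed to return the text, its average log probability, and its gzip compression ratio.

```python
# Illustrative sketch of temperature fallback during decoding.
# `decode_window` is a hypothetical helper: it decodes one audio window at a
# given temperature and returns (text, avg_logprob, compression_ratio).
from typing import Callable, Tuple


def decode_with_fallback(
    decode_window: Callable[[float], Tuple[str, float, float]],
    temperature: float = 0.0,
    temperature_increment_on_fallback: float = 0.2,
    compression_ratio_threshold: float = 2.4,
    logprob_threshold: float = -1.0,
) -> str:
    """Retry decoding at increasing temperatures until both thresholds pass."""
    text = ""
    t = temperature
    while t <= 1.0:
        text, avg_logprob, compression_ratio = decode_window(t)
        too_repetitive = compression_ratio > compression_ratio_threshold
        too_improbable = avg_logprob < logprob_threshold
        if not too_repetitive and not too_improbable:
            return text  # decoding succeeded at this temperature
        t += temperature_increment_on_fallback  # fall back to a higher temperature
    return text  # keep the last attempt if every temperature failed
```

`no_speech_threshold` acts separately from this loop: when the probability of the <|nospeech|> token exceeds it and the average log probability is also below `logprob_threshold`, the window is treated as silence rather than retried.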