
openai/whisper:91ee9c0c

Input

*file

Audio file

string

Choose a Whisper model.

Default: "large-v2"

string

Choose the format for the transcription.

Default: "plain text"

boolean

Translate the text to English when set to True.

Default: false

string

Language spoken in the audio. Specify None to perform language detection.

number

Temperature to use for sampling.

Default: 0

number

Optional patience value to use in beam decoding, as in https://arxiv.org/abs/2204.05424. The default (1.0) is equivalent to conventional beam search.
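As a rough illustration of what the patience value changes, the sketch below assumes the stopping rule used by the reference Whisper beam-search decoder: decoding ends once round(beam_size * patience) finished candidates have been collected. The function name and the beam_size argument are hypothetical and are not fields of this form.

```python
def max_finished_candidates(beam_size: int, patience: float = 1.0) -> int:
    """Number of finished hypotheses to collect before beam decoding stops.

    Assumption: the stopping rule is round(beam_size * patience).
    With patience = 1.0 this is plain beam search (stop after beam_size
    finished hypotheses); larger values keep the search running longer.
    """
    if beam_size < 1:
        raise ValueError("beam_size must be at least 1")
    if patience <= 0:
        raise ValueError("patience must be positive")
    return round(beam_size * patience)


if __name__ == "__main__":
    print(max_finished_candidates(5))        # conventional beam search
    print(max_finished_candidates(5, 2.0))   # more patient search
```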

string

Comma-separated list of token IDs to suppress during sampling. "-1" suppresses most special characters except common punctuation.

Default: "-1"
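Because this field is a string rather than a list, a caller may want to validate it before submitting. The helper below is a minimal sketch of one way to parse such a value; its name is hypothetical, and it does not expand the special "-1" entry into actual token IDs, which is left to the model.

```python
from typing import List


def parse_suppress_tokens(value: str) -> List[int]:
    """Parse a comma-separated string such as "-1" or "1, 2,3" into integers.

    Empty entries (for example from a trailing comma) are skipped.
    A non-numeric entry raises ValueError.
    """
    return [int(part) for part in value.split(",") if part.strip()]


if __name__ == "__main__":
    print(parse_suppress_tokens("-1"))
    print(parse_suppress_tokens("1, 2,3"))
```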

string

Optional text to provide as a prompt for the first window.

boolean

If True, provide the previous output of the model as a prompt for the next window. Disabling this may make the text inconsistent across windows, but the model becomes less prone to getting stuck in a failure loop.

Default: true

number

Amount by which to increase the temperature on fallback, when the decoding fails to meet either of the thresholds below.

Default: 0.2
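To make the fallback increment concrete, the sketch below lists the temperatures that would be tried in order. It assumes the schedule starts at the sampling temperature and stops at an upper bound of 1.0, as the reference Whisper command-line tool does; the function name and the maximum argument are hypothetical.

```python
from typing import List, Optional


def fallback_temperatures(
    start: float = 0.0,
    increment: Optional[float] = 0.2,
    maximum: float = 1.0,  # assumed upper bound, not a field of this form
) -> List[float]:
    """Temperatures to try in order, from `start` up to `maximum`."""
    if increment is None or increment <= 0:
        # No fallback: decode once at the starting temperature.
        return [start]
    temperatures = []
    step = 0
    while start + step * increment <= maximum + 1e-6:
        # Round to hide floating-point noise such as 0.6000000000000001.
        temperatures.append(round(start + step * increment, 6))
        step += 1
    return temperatures


if __name__ == "__main__":
    print(fallback_temperatures())
```

With the defaults shown on this page (temperature 0, increment 0.2), decoding would be retried at progressively higher temperatures until a result passes both thresholds or the schedule is exhausted.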

number

If the gzip compression ratio is higher than this value, treat the decoding as failed.

Default: 2.4

number

If the average log probability is lower than this value, treat the decoding as failed.

Default: -1
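The two failure checks above can be sketched as follows. This is an illustration under assumptions, not the model's actual code: it uses zlib's DEFLATE compression as a stand-in for the gzip ratio, and the function and argument names are hypothetical. Highly repetitive text compresses well, so a high ratio is a sign the decoder is stuck in a loop.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Ratio of raw UTF-8 size to compressed size; high for repetitive text."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def needs_fallback(
    text: str,
    avg_logprob: float,
    compression_ratio_threshold: float = 2.4,
    logprob_threshold: float = -1.0,
) -> bool:
    """True if the decoding fails either threshold and should be retried."""
    if compression_ratio(text) > compression_ratio_threshold:
        return True  # output is too repetitive
    if avg_logprob < logprob_threshold:
        return True  # model is not confident in its output
    return False


if __name__ == "__main__":
    print(needs_fallback("la " * 200, avg_logprob=-0.1))
    print(needs_fallback("The quick brown fox jumps over the lazy dog.", avg_logprob=-0.1))
```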

number

If the probability of the <|nospeech|> token is higher than this value AND the decoding has failed due to `logprob_threshold`, consider the segment as silence.

Default: 0.6
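The silence rule combines two conditions, which the minimal sketch below spells out. The function name and arguments are hypothetical; the default values are the ones listed on this page.

```python
def is_silence(
    no_speech_prob: float,
    avg_logprob: float,
    no_speech_threshold: float = 0.6,
    logprob_threshold: float = -1.0,
) -> bool:
    """True only if the no-speech probability is high AND the decoding
    also failed the average log probability check."""
    return no_speech_prob > no_speech_threshold and avg_logprob < logprob_threshold


if __name__ == "__main__":
    print(is_silence(0.9, -1.5))  # likely silence, low-confidence text
    print(is_silence(0.9, -0.2))  # confident text is kept despite high no-speech probability
```

Requiring both conditions means a segment with confident text is not discarded just because the no-speech probability happens to be high.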
