vm6eji6m4 /whisper-chinese-pro:002c2c3f
Input schema
The fields you can use to run this model with an API. If you don’t provide a value for a field, its default value is used.
| Field | Type | Default value | Description |
|---|---|---|---|
| audio | string | | Audio file (mp3/wav/m4a/mp4). Or use file_url / file_string instead. |
| file_url | string | | Audio file URL (alternative to `audio`). Public HTTP/HTTPS URL. |
| file_string | string | | Base64-encoded audio (alternative to `audio` / `file_url`). |
| language | None | | None |
| num_speakers | integer | 0 (Max: 10) | Number of speakers (1-10). Leave 0 for auto-detect. |
| prompt | string | | None |
| enable_diarization | boolean | True | Run speaker diarization (requires hf_token). Set false for ~30% speedup. |
| gap_threshold | number | 1.5 (Min: 0.1, Max: 5) | Merge adjacent same-speaker segments within this gap (seconds). |
| word_timestamps | boolean | False | Include per-word timestamps and per-word probability in output. |
| output_format | None | srt | Primary output format (JSON segments always included). |
| hf_token | string | None | |
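
As a rough illustration of how the fields above fit together, the following sketch builds an input payload and checks it against the documented limits before it is sent. The helper names `build_input` and `encode_audio_bytes` are hypothetical and not part of the model's API. Two rules are assumptions read out of the descriptions rather than documented behavior: that exactly one of `audio`, `file_url`, and `file_string` should be supplied, and that a missing `hf_token` should be rejected locally when diarization is enabled. The actual API call is left out because the full version ID is not shown on this page.

```python
import base64


def encode_audio_bytes(data: bytes) -> str:
    """Base64-encode raw audio bytes for the `file_string` field."""
    return base64.b64encode(data).decode("ascii")


def build_input(audio=None, file_url=None, file_string=None, *,
                num_speakers=0, enable_diarization=True, gap_threshold=1.5,
                word_timestamps=False, output_format="srt", hf_token=None):
    """Hypothetical helper: assemble and validate an input payload."""
    sources = {"audio": audio, "file_url": file_url, "file_string": file_string}
    given = {name: value for name, value in sources.items() if value is not None}
    # Assumption: the three audio fields are mutually exclusive alternatives.
    if len(given) != 1:
        raise ValueError("provide exactly one of audio, file_url, file_string")
    # Documented range: 1-10 speakers, or 0 for auto-detect.
    if not 0 <= num_speakers <= 10:
        raise ValueError("num_speakers must be between 0 and 10")
    # Documented range: Min 0.1, Max 5 (seconds).
    if not 0.1 <= gap_threshold <= 5:
        raise ValueError("gap_threshold must be between 0.1 and 5")
    # The table says diarization requires hf_token; failing early is our choice.
    if enable_diarization and not hf_token:
        raise ValueError("enable_diarization requires hf_token")

    payload = dict(given)
    payload.update(
        num_speakers=num_speakers,
        enable_diarization=enable_diarization,
        gap_threshold=gap_threshold,
        word_timestamps=word_timestamps,
        output_format=output_format,
    )
    if hf_token:
        payload["hf_token"] = hf_token
    return payload
```

For example, `build_input(file_url="https://example.com/meeting.mp3", enable_diarization=False)` returns a dictionary that an API client could pass as the model input, with the remaining fields set to the defaults from the table.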
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'title': 'Output', 'type': 'object'}
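
The schema above declares only an untyped object, so the layout of the returned segments is not documented here. To make the `gap_threshold` input more concrete, the sketch below shows one plausible reading of "merge adjacent same-speaker segments within this gap". The segment keys `start`, `end`, `speaker`, and `text` are hypothetical, as is the choice to treat a gap exactly equal to the threshold as mergeable and to join text without a separator (reasonable for Chinese). It is not the model's actual implementation.

```python
def merge_segments(segments, gap_threshold=1.5):
    """Merge consecutive segments from the same speaker when the silence
    between them is at most `gap_threshold` seconds (illustrative only)."""
    merged = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        prev = merged[-1] if merged else None
        same_speaker = prev is not None and prev["speaker"] == seg["speaker"]
        if same_speaker and seg["start"] - prev["end"] <= gap_threshold:
            # Extend the previous segment instead of starting a new one.
            prev["end"] = max(prev["end"], seg["end"])
            prev["text"] += seg["text"]
        else:
            # Copy so the caller's input dictionaries are left untouched.
            merged.append(dict(seg))
    return merged


segments = [
    {"start": 0.0, "end": 1.0, "speaker": "A", "text": "你好"},
    {"start": 1.5, "end": 2.0, "speaker": "A", "text": "世界"},
    {"start": 2.2, "end": 3.0, "speaker": "B", "text": "好的"},
    {"start": 6.0, "end": 7.0, "speaker": "A", "text": "再见"},
]
result = merge_segments(segments, gap_threshold=1.5)
```

Under these assumptions, the first two segments are joined because they share a speaker and are 0.5 seconds apart, while the last segment stays separate because a different speaker intervenes.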