vm6eji6m4/podcast-transcribe-zh:8d32e83a
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.
| Field | Type | Default value | Description |
|---|---|---|---|
| audio | string | | Audio file (mp3/wav/m4a/mp4); under 60 minutes recommended. |
| language | None | zh | Language code. zh/ja/ko use large-v3-turbo; en uses distil-large-v3. |
| hotwords | string | | Proper nouns to bias Whisper (e.g. '蔡康永 黃詹 OpenAI Anthropic'). |
| enable_diarization | boolean | True | Run speaker diarization (requires hf_token). Disable to save about 30% of the run time. |
| gap_threshold | number | 1.5 (min: 0.1, max: 5) | Merge adjacent same-speaker segments within this gap (seconds). |
| output_format | None | srt | Primary output format. JSON segments are always included. |
| hf_token | string | | None |
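As a rough illustration of how these fields combine, the sketch below builds an input dictionary in Python, fills in the defaults listed in the table, and checks the two constraints the table states (the `gap_threshold` range and the `hf_token` requirement for diarization). The helper `build_input` and the example URL are hypothetical, and treating `audio` as the only required field is an assumption based on its missing default.

```python
# Defaults copied from the input schema table. The helper itself is a
# hypothetical convenience, not part of the model's API.
DEFAULTS = {
    "language": "zh",
    "enable_diarization": True,
    "gap_threshold": 1.5,
    "output_format": "srt",
}
# Fields the table lists without a default value.
OPTIONAL_NO_DEFAULT = {"hotwords", "hf_token"}


def build_input(audio, **overrides):
    """Return an input dict, filling unspecified fields with the table defaults."""
    unknown = set(overrides) - (set(DEFAULTS) | OPTIONAL_NO_DEFAULT)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    payload = {"audio": audio, **DEFAULTS, **overrides}

    # The table gives Min: 0.1 and Max: 5 for gap_threshold.
    if not 0.1 <= payload["gap_threshold"] <= 5:
        raise ValueError("gap_threshold must be between 0.1 and 5")

    # The table states that diarization requires hf_token.
    if payload["enable_diarization"] and not payload.get("hf_token"):
        raise ValueError("enable_diarization requires hf_token")

    return payload


# Placeholder URL; diarization is switched off so no token is needed.
example = build_input(
    "https://example.com/episode.mp3",
    enable_diarization=False,
    hotwords="OpenAI Anthropic",
)
```

With the Replicate Python client, a dictionary like `example` would typically be passed as the `input` argument of `replicate.run`, together with the model's full version identifier, which is not reproduced here.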
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{"title": "Output", "type": "object"}
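The schema above only declares an object, so the layout of the JSON segments is not documented here. To illustrate what the `gap_threshold` input describes, the following Python sketch merges adjacent segments from the same speaker when the silence between them is at most the threshold. The `Segment` fields (`start`, `end`, `speaker`, `text`), the inclusive comparison, and joining text with a space are all assumptions made for the example; the model's actual merging logic may differ.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical segment shape; the real output fields are not documented.
    start: float   # seconds
    end: float     # seconds
    speaker: str
    text: str


def merge_segments(segments, gap_threshold=1.5):
    """Merge consecutive same-speaker segments separated by <= gap_threshold seconds."""
    merged = []
    for seg in sorted(segments, key=lambda s: s.start):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.speaker == seg.speaker
            and seg.start - prev.end <= gap_threshold
        ):
            # Extend the previous segment instead of starting a new one.
            prev.end = max(prev.end, seg.end)
            prev.text = f"{prev.text} {seg.text}"
        else:
            # Copy so the caller's segments are not mutated.
            merged.append(Segment(seg.start, seg.end, seg.speaker, seg.text))
    return merged
```

A larger threshold produces fewer, longer subtitle blocks per speaker; a smaller one keeps pauses visible as separate entries.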