You're looking at a specific version of this model.
vm6eji6m4/podcast-transcribe-zh:244f19ea
Input schema
The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.
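| Field | Type | Default value | Description |
|---|---|---|---|
| audio | string | | Audio file (mp3/wav/m4a; under 60 minutes recommended) |
| language | None | zh | Language code. zh/ja/ko use large-v3-turbo; en uses distil-large-v3 |
| hotwords | string | | Hotwords and proper nouns (optional). They are injected into Whisper's initial_prompt to improve accuracy on personal names and Taiwanese |
| enable_diarization | boolean | True | Whether to run multi-speaker diarization (turning it off saves ~30% of the run time) |
| gap_threshold | number | 1.5 (Min: 0.1, Max: 5) | Maximum gap, in seconds, across which adjacent segments from the same speaker are merged |
| output_format | None | srt | Primary output format (a detailed JSON result is also returned) |
| hf_token | string | | HuggingFace token (required for diarization). Create one at https://huggingface.co/settings/tokens (Read access is enough). Before first use, accept the terms of both models: https://hf.co/pyannote/speaker-diarization-3.1 and https://hf.co/pyannote/segmentation-3.0 |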
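As a rough illustration of how these fields fit together, the sketch below assembles an input payload from the field names, defaults, and the `gap_threshold` range listed in the table. The helper name `build_input` is hypothetical, and so are the client-side checks: the model may validate its inputs differently. The audio value is assumed to be a URL string.

```python
from typing import Any, Dict, Optional

# Range listed for gap_threshold in the input table.
GAP_MIN, GAP_MAX = 0.1, 5.0


def build_input(
    audio: str,
    language: str = "zh",
    hotwords: str = "",
    enable_diarization: bool = True,
    gap_threshold: float = 1.5,
    output_format: str = "srt",
    hf_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble an input dict using the table's field names and defaults.

    Hypothetical helper: the checks below are client-side assumptions,
    not behaviour confirmed by the model itself.
    """
    if not GAP_MIN <= gap_threshold <= GAP_MAX:
        raise ValueError(
            f"gap_threshold must be between {GAP_MIN} and {GAP_MAX}"
        )
    if enable_diarization and not hf_token:
        # The table states that diarization needs a HuggingFace token.
        raise ValueError("hf_token is required when enable_diarization is True")

    payload: Dict[str, Any] = {
        "audio": audio,
        "language": language,
        "enable_diarization": enable_diarization,
        "gap_threshold": gap_threshold,
        "output_format": output_format,
    }
    # Optional string fields are only sent when they carry a value.
    if hotwords:
        payload["hotwords"] = hotwords
    if hf_token:
        payload["hf_token"] = hf_token
    return payload


if __name__ == "__main__":
    # Placeholder URL and token, for illustration only.
    print(build_input("https://example.com/episode.mp3", hf_token="hf_placeholder"))
```

With the official `replicate` Python client, a payload like this would typically be passed as `replicate.run(<model reference>, input=payload)`. The version ID on this page is abbreviated, so the full model reference has to be taken from the model page itself.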
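To make the `gap_threshold` description more concrete, here is a minimal sketch of one way such a merge could work. It is not the model's actual implementation: the `Segment` type, the `merge_adjacent` function, and the choice to treat a gap exactly equal to the threshold as mergeable are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical transcript segment; times are in seconds."""
    speaker: str
    start: float
    end: float
    text: str


def merge_adjacent(
    segments: List[Segment], gap_threshold: float = 1.5, sep: str = ""
) -> List[Segment]:
    """Merge consecutive segments from the same speaker when the silence
    between them is at most gap_threshold seconds.

    sep is the string placed between merged texts (empty by default,
    which suits Chinese; use " " for space-delimited languages).
    """
    merged: List[Segment] = []
    for seg in sorted(segments, key=lambda s: s.start):
        last = merged[-1] if merged else None
        if (
            last is not None
            and last.speaker == seg.speaker
            and seg.start - last.end <= gap_threshold
        ):
            # Extend the previous segment instead of starting a new one.
            last.end = max(last.end, seg.end)
            last.text = last.text + sep + seg.text
        else:
            # Copy so the caller's segments are never mutated.
            merged.append(Segment(seg.speaker, seg.start, seg.end, seg.text))
    return merged


if __name__ == "__main__":
    demo = [
        Segment("A", 0.0, 2.0, "hello"),
        Segment("A", 3.0, 4.0, "world"),   # 1.0 s gap, same speaker
        Segment("B", 4.5, 6.0, "hi"),
        Segment("A", 10.0, 11.0, "bye"),
    ]
    for s in merge_adjacent(demo, gap_threshold=1.5, sep=" "):
        print(s)
```

In this sketch a larger threshold yields fewer, longer cues per speaker, while a smaller one keeps short pauses as separate cues.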
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{"title": "Output", "type": "object"}