vm6eji6m4/podcast-transcribe-zh:244f19ea

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.

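| Field | Type | Default value | Description |
| --- | --- | --- | --- |
| audio | string | | Audio file (mp3/wav/m4a; under 60 minutes recommended). |
| language | None | zh | Language code. zh/ja/ko use large-v3-turbo; en uses distil-large-v3. |
| hotwords | string | | Hotwords and proper nouns (optional). They are injected into Whisper's initial_prompt to improve accuracy for personal names and Taiwanese Hokkien. |
| enable_diarization | boolean | True | Whether to run multi-speaker diarization (turning it off saves about 30% of the processing time). |
| gap_threshold | number | 1.5 (min: 0.1, max: 5) | Maximum gap, in seconds, across which adjacent segments from the same speaker are merged. |
| output_format | None | srt | Primary output format (detailed JSON results are also returned). |
| hf_token | string | | HuggingFace token (required for diarization). Create one at https://huggingface.co/settings/tokens (Read access is enough). Before first use, accept the terms of two models: https://hf.co/pyannote/speaker-diarization-3.1 and https://hf.co/pyannote/segmentation-3.0 |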
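
As a rough illustration of how these fields fit together, the sketch below assembles an input payload using the field names, defaults, and `gap_threshold` range from the schema. The helper name `build_input` is hypothetical, and the client-side checks (including requiring `hf_token` whenever diarization is on) are assumptions drawn from the descriptions, not behavior confirmed by the model itself.

```python
from typing import Any, Dict, Optional


def build_input(
    audio: str,
    language: str = "zh",
    hotwords: Optional[str] = None,
    enable_diarization: bool = True,
    gap_threshold: float = 1.5,
    output_format: str = "srt",
    hf_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble an input payload from the documented fields (hypothetical helper)."""
    # The schema lists min 0.1 and max 5 for gap_threshold.
    if not 0.1 <= gap_threshold <= 5:
        raise ValueError("gap_threshold must be between 0.1 and 5")
    # Assumption: the token is only needed when diarization is enabled.
    if enable_diarization and not hf_token:
        raise ValueError("hf_token is needed when enable_diarization is True")

    payload: Dict[str, Any] = {
        "audio": audio,
        "language": language,
        "enable_diarization": enable_diarization,
        "gap_threshold": gap_threshold,
        "output_format": output_format,
    }
    # Optional string fields are sent only when a value was provided.
    if hotwords:
        payload["hotwords"] = hotwords
    if hf_token:
        payload["hf_token"] = hf_token
    return payload


if __name__ == "__main__":
    # Placeholder audio URL; diarization is turned off so no token is needed.
    print(build_input("https://example.com/episode.mp3", enable_diarization=False))
```

A payload built this way could then be passed as the `input` argument of the Replicate Python client's `replicate.run(...)`, together with the full model version identifier, which is shown only in shortened form on this page.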

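The `gap_threshold` field describes merging adjacent segments from the same speaker when the pause between them is short enough. The model's actual implementation is not shown on this page; the following is only a minimal sketch of that idea, assuming segments are dictionaries with hypothetical `speaker`, `start`, `end`, and `text` keys, sorted by start time.

```python
from typing import Any, Dict, List


def merge_adjacent(
    segments: List[Dict[str, Any]], gap_threshold: float = 1.5
) -> List[Dict[str, Any]]:
    """Merge consecutive same-speaker segments separated by at most gap_threshold seconds."""
    merged: List[Dict[str, Any]] = []
    for seg in segments:
        prev = merged[-1] if merged else None
        same_speaker = prev is not None and prev["speaker"] == seg["speaker"]
        if same_speaker and seg["start"] - prev["end"] <= gap_threshold:
            # Extend the previous segment instead of starting a new one.
            prev["end"] = max(prev["end"], seg["end"])
            # Joining with a space is an assumption; Chinese text may prefer no separator.
            prev["text"] = prev["text"] + " " + seg["text"]
        else:
            # Copy so the caller's input list is not mutated.
            merged.append(dict(seg))
    return merged


if __name__ == "__main__":
    demo = [
        {"speaker": "A", "start": 0.0, "end": 2.0, "text": "hello"},
        {"speaker": "A", "start": 3.0, "end": 4.0, "text": "again"},
        {"speaker": "B", "start": 4.5, "end": 6.0, "text": "hi"},
    ]
    for row in merge_adjacent(demo):
        print(row)
```

With the default of 1.5 seconds, a larger value produces fewer, longer subtitle blocks per speaker, while a smaller value keeps pauses as separate entries.
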
Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{'title': 'Output', 'type': 'object'}