vm6eji6m4/podcast-transcribe-zh

Chinese/Taiwanese podcast transcription with speaker diarization (Whisper large-v3-turbo + pyannote 3.1)


⚠️ DEPRECATED — please use whisper-chinese-pro

This model has been superseded by whisper-chinese-pro.

The new models include:

- ✅ Built-in vocabulary from 教育部辭典 (172k Mandarin + 14k Taiwanese + 14k Hakka words)
- ✅ FrequencyWords corpus for en/ja/ko (5k each)
- ✅ Word-level timestamps + confidence
- ✅ num_speakers control (1-10 or auto)
- ✅ 3 input methods (file / URL / Base64)
- ✅ Secret-typed hf_token (masked in UI / examples)
- ✅ Realtime factor in output
- ✅ VTT output format
- ✅ Auto language detection

Please migrate to: https://replicate.com/vm6eji6m4/whisper-chinese-pro
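
For anyone scripting the migration, here is a minimal Python sketch of a call to the new model through the `replicate` client. Only the model name, `num_speakers` (1-10 or auto), and `hf_token` come from this page. The input keys `audio` and `output_format`, the use of `None` to mean automatic speaker detection, and the data-URI encoding of local files are assumptions for illustration, so check the model's input schema before relying on them.

```python
import base64
from pathlib import Path

# Model name taken from the migration link on this page.
MODEL = "vm6eji6m4/whisper-chinese-pro"


def build_input(source, num_speakers=None, hf_token=None, output_format="vtt"):
    """Build an input payload for the new model.

    The keys "audio" and "output_format" are hypothetical; "num_speakers"
    and "hf_token" are named on this page. None is assumed to mean "auto".
    """
    if num_speakers is not None and not 1 <= num_speakers <= 10:
        raise ValueError("num_speakers must be between 1 and 10, or None for auto")

    if source.startswith(("http://", "https://")):
        # URL input: pass the address through unchanged.
        audio = source
    else:
        # Local file: encode as a Base64 data URI (assumed to be accepted).
        raw = Path(source).read_bytes()
        encoded = base64.b64encode(raw).decode("ascii")
        audio = "data:application/octet-stream;base64," + encoded

    payload = {"audio": audio, "output_format": output_format}
    if num_speakers is not None:
        payload["num_speakers"] = num_speakers
    if hf_token:
        payload["hf_token"] = hf_token
    return payload


def transcribe(source, **options):
    """Run the model; needs `pip install replicate` and REPLICATE_API_TOKEN."""
    import replicate  # imported lazily so build_input works without the client

    # Depending on the client version, a ":<version>" suffix may be required.
    return replicate.run(MODEL, input=build_input(source, **options))
```

A call such as `transcribe("https://example.com/episode.mp3", num_speakers=2)` would then request a two-speaker transcript. The URL is a placeholder.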


(Legacy model. No further updates will be made.)
