CrisperWhisper
CrisperWhisper is an advanced variant of OpenAI's Whisper, designed for fast, precise, and verbatim speech recognition with accurate (crisp) word-level timestamps. Unlike the original Whisper, which tends to omit disfluencies and follows more of an intended transcription style, CrisperWhisper aims to transcribe every spoken word exactly as it is, including fillers, pauses, stutters, and false starts.
Key Features
- Accurate Word-Level Timestamps: Provides precise timestamps, even around disfluencies and pauses, by utilizing an adjusted tokenizer and a custom attention loss during training.
- Verbatim Transcription: Transcribes every spoken word exactly as it is, including fillers, and differentiates between fillers such as "um" and "uh".
- Filler Detection: Detects and accurately transcribes fillers.
- Hallucination Mitigation: Minimizes transcription hallucinations to enhance accuracy.
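To make the features above more concrete, here is a minimal sketch of how word-level timestamps could be consumed downstream to locate fillers and pauses. It does not call the model. It assumes the transcription is already available as a list of word chunks shaped like `{"text": ..., "timestamp": (start, end)}`, which is the layout the Hugging Face ASR pipeline uses for word timestamps. The helper `summarize_words`, the filler spellings in `FILLERS`, and the `min_pause` threshold are all hypothetical and chosen for illustration; the exact filler tokens emitted by CrisperWhisper may differ.

```python
# Hypothetical filler spellings; the exact tokens the model emits are an assumption.
FILLERS = {"um", "uh"}


def summarize_words(chunks, fillers=FILLERS, min_pause=0.5):
    """Collect fillers and long pauses from word-level chunks.

    Each chunk is assumed to look like
    {"text": "word", "timestamp": (start, end)} with times in seconds.
    Returns (filler_words, pauses), where filler_words is a list of
    (word, start, end) and pauses is a list of (pause_start, pause_end).
    """
    filler_words = []
    pauses = []
    prev_end = None
    for chunk in chunks:
        start, end = chunk["timestamp"]
        # Normalize casing and strip surrounding punctuation or brackets.
        word = chunk["text"].strip().lower().strip(".,!?[]")
        if word in fillers:
            filler_words.append((word, start, end))
        # A pause is the silent gap between the previous word and this one.
        if prev_end is not None and start - prev_end >= min_pause:
            pauses.append((prev_end, start))
        prev_end = end
    return filler_words, pauses


# Made-up example data, not real model output.
example = [
    {"text": "I", "timestamp": (0.0, 0.2)},
    {"text": "um", "timestamp": (0.3, 0.6)},
    {"text": "think", "timestamp": (1.5, 1.9)},
    {"text": "uh", "timestamp": (2.0, 2.2)},
    {"text": "so", "timestamp": (2.25, 2.5)},
]
fillers_found, pauses_found = summarize_words(example)
print(fillers_found)
print(pauses_found)
```

Because a verbatim transcript keeps fillers as separate words with their own timestamps, this kind of post-processing only needs a simple pass over the chunks. Adjust `FILLERS` and `min_pause` to match the tokens and pause lengths relevant to your data.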