spladder87/kblab-whisper-diarization

KB-Whisper Large on Replicate

This repository provides access to KB-Whisper Large on Replicate. All credits for the model development, training, and research go to KBLab and the National Library of Sweden. I have only made this model available on Replicate for easier access.

Overview

KB-Whisper Large is a state-of-the-art automatic speech recognition (ASR) model optimized for Swedish. It was trained on over 50,000 hours of Swedish audio, delivering substantial improvements over similar models.

For complete details and updates, please refer to the original model page, KBLab/kb-whisper-large, on Hugging Face.

Features

- Automatic Speech Recognition: High-accuracy transcription of Swedish speech.
- Optimized for Swedish: Tailored to capture the nuances of the Swedish language.
- Efficient Inference: Utilizes safetensors and FP16 precision for enhanced performance.
- Multiple Checkpoints:
  - Stage 2 (Default): Finetuned with strict quality filters.
  - Stage 1 (Pretraining): Continued pretraining checkpoint (accessible via the revision pretrained-checkpoint).

Model Details

- Parameters: 1.61B
- Tensor Type: FP16
- License: Apache-2.0
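
The checkpoint selection and FP16 loading described above can be sketched with the Hugging Face transformers library. This is a minimal illustration, not the code that runs behind this Replicate deployment. It assumes that torch and transformers are installed and that the weights are pulled from the KBLab/kb-whisper-large repository. The helper names `revision_kwargs` and `load_model` are hypothetical and exist only for this example.

```python
MODEL_ID = "KBLab/kb-whisper-large"  # Hugging Face repository named in this README


def revision_kwargs(stage: int = 2) -> dict:
    """Map a training stage to from_pretrained keyword arguments (hypothetical helper)."""
    if stage == 2:
        # Stage 2 is the default checkpoint, so no revision needs to be passed.
        return {}
    if stage == 1:
        # Stage 1 is published under the revision named in the Features list.
        return {"revision": "pretrained-checkpoint"}
    raise ValueError("stage must be 1 or 2")


def load_model(stage: int = 2):
    """Load the model in FP16 from safetensors weights, plus its processor."""
    # Imported lazily so revision_kwargs can be used without torch installed.
    import torch
    from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

    extra = revision_kwargs(stage)
    model = AutoModelForSpeechSeq2Seq.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # matches the FP16 tensor type listed above
        use_safetensors=True,
        **extra,
    )
    processor = AutoProcessor.from_pretrained(MODEL_ID, **extra)
    return model, processor
```

Calling `load_model()` would fetch the default Stage 2 checkpoint, while `load_model(stage=1)` would request the continued-pretraining revision. Both calls download several gigabytes of weights, so they are best run on a machine with a GPU and sufficient disk space.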

License

This model is distributed under the Apache-2.0 License.

Acknowledgements

- KBLab: For the development, training, and research behind KB-Whisper Large.
- National Library of Sweden: For providing the extensive Swedish audio datasets.
- Original Model: For further details and updates, please visit the KBLab/kb-whisper-large page on Hugging Face.

This README serves as a guide to using KB-Whisper Large on Replicate. While the model is available here for convenience, please remember that all credit for its creation belongs to KBLab and their collaborators.
