daanelson/whisper-train-preprocessor

This model runs on Nvidia T4 GPU hardware. We don't yet have enough runs of this model to provide performance information.

This runs preprocessing code to generate a dataset you can use to fine-tune Whisper. Specifically, it takes as input either:

two tarballs: one of audio files and one of text files. The transcription for a given audio file should have the same base name as that audio file; i.e., audio1.mp3 corresponds to audio1.txt.

OR

a file with one JSON object per line, each pairing the URL of an audio file with the URL of its transcription:

...
{"audio": <URL of audio file>, "sentence": <URL of transcription>}
{"audio": <URL of audio file>, "sentence": <URL of transcription>}
...
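
As a rough illustration of the two input shapes described above, here is a minimal Python sketch that prepares either one. The helper names (matched_pairs, write_tarballs, write_manifest) are hypothetical and not part of this model. The gzip compression, the flat archive layout, the .txt extension filter, and the example URLs in the usage note are all assumptions; check the model's input schema for what it actually accepts.

```python
import json
import tarfile
from pathlib import Path


def matched_pairs(audio_dir, text_dir):
    """Return (audio_path, text_path) pairs that share a base name.

    Audio files without a matching .txt transcript are skipped.
    """
    texts = {p.stem: p for p in Path(text_dir).glob("*.txt")}
    pairs = []
    for audio in sorted(Path(audio_dir).iterdir()):
        if audio.is_file() and audio.stem in texts:
            pairs.append((audio, texts[audio.stem]))
    return pairs


def write_tarballs(pairs, audio_tar, text_tar):
    """Option 1: pack audio files and transcripts into two tarballs.

    Assumption: gzip-compressed archives with files at the top level.
    """
    with tarfile.open(audio_tar, "w:gz") as a_tar, \
            tarfile.open(text_tar, "w:gz") as t_tar:
        for audio, text in pairs:
            a_tar.add(audio, arcname=audio.name)
            t_tar.add(text, arcname=text.name)


def write_manifest(url_pairs, manifest_path):
    """Option 2: write one JSON object per line.

    Each object has an "audio" key and a "sentence" key, both holding
    URLs, matching the line format shown above.
    """
    with open(manifest_path, "w", encoding="utf-8") as f:
        for audio_url, sentence_url in url_pairs:
            record = {"audio": audio_url, "sentence": sentence_url}
            f.write(json.dumps(record) + "\n")
```

For example, write_tarballs(matched_pairs("audio", "text"), "audio.tar.gz", "text.tar.gz") would cover the first option, and write_manifest([("https://example.com/audio1.mp3", "https://example.com/audio1.txt")], "manifest.jsonl") the second. Pairing by Path.stem mirrors the base-name rule, so audio1.mp3 is matched with audio1.txt.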