This runs preprocessing code to generate a dataset you can use to fine-tune Whisper. Specifically, it takes as input either:
- Two tarballs: one of audio files and one of text files. The transcription for a given audio file must share its base name (e.g., audio1.mp3 corresponds to audio1.txt); see the first sketch after this list.
OR
- A JSONL file (named <some_file.txt>) containing lines of the form below; see the second sketch after this list:
  ...
  {"audio": <URL of audio file>, "sentence": <URL of transcription>}
  {"audio": <URL of audio file>, "sentence": <URL of transcription>}
  ...
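
To make the base-name convention concrete, here is a minimal sketch of how the two tarballs might be paired up. The helper name, the file paths, and the matching logic are illustrative assumptions, not the actual preprocessing code:

```python
import os
import tarfile


def pair_tarball_members(audio_tar_path: str, text_tar_path: str) -> dict:
    """Map each shared base name to its (audio member, transcript member) pair.

    Hypothetical helper: illustrates the audio1.mp3 <-> audio1.txt convention.
    """
    with tarfile.open(audio_tar_path) as audio_tar, tarfile.open(text_tar_path) as text_tar:
        audio = {os.path.splitext(os.path.basename(m.name))[0]: m.name
                 for m in audio_tar.getmembers() if m.isfile()}
        text = {os.path.splitext(os.path.basename(m.name))[0]: m.name
                for m in text_tar.getmembers() if m.isfile()}
    # Keep only base names present in both tarballs; unmatched files are dropped.
    return {base: (audio[base], text[base]) for base in audio.keys() & text.keys()}
```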
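
For the JSONL route, this is a small sketch of writing records in the expected shape. Only the "audio" and "sentence" keys come from the format above; the example URLs are placeholders, and the output name reuses the <some_file.txt> placeholder:

```python
import json

# Placeholder records: the real values would be URLs to your own files.
records = [
    {"audio": "https://example.com/audio1.mp3",
     "sentence": "https://example.com/audio1.txt"},
    {"audio": "https://example.com/audio2.mp3",
     "sentence": "https://example.com/audio2.txt"},
]

# One JSON object per line, which is what a JSONL input expects.
with open("some_file.txt", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```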