xavriley / sax_transcription

Transcribe saxophone solos directly from audio

  • Public
  • 185 runs
  • T4
  • GitHub

Input

file

Audio to transcribe

integer

Numerator for time signature, default: 4

Default: 0

string

Optional URL to specify different model weights

Default: "./model.pth"

file

Optional path to syncpoints file

file

Optional path to MIDI file; skips audio processing and runs score layout only

number

Start time for audio

Default: 0

number

Finish time for audio

Default: 0

boolean

Skip separation step

Default: false

string

Optional label for output filename

string

Optional YouTube URL to fetch audio from - replaces audio_input

string

Device to run inference on

Default: "cuda"

Output

[ "https://storage.googleapis.com/replicate-files/18VVXroRWJIOD9IyHfeYNSrfj63iYC7lEEGnbU9AUw9BVr8kA/tmp_audio__joel_frahm_caravan.mid", "https://storage.googleapis.com/replicate-files/flYM9epm2BgGXkxrwqG8el8RIZRg9PwVmnOiEoqjVVEBVr8kA/tmp_audio__joel_frahm_caravan.xml", "https://storage.googleapis.com/replicate-files/QU6yfnfOxjk0pEyAAd4FQffYPD9sW2ALe9goQevogERLoalnE/tmp_audio__joel_frahm_caravan.json" ]

This output was created using a different version of the model, xavriley/sax_transcription:a279cee2.

Run time and cost

This model runs on Nvidia T4 GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This model accompanies the paper “Reconstructing the Charlie Parker Omnibook using an audio-to-score automatic transcription pipeline” (currently under review).

The model takes either

  • an audio file

or

  • a YouTube URL

You can optionally specify a start time and finish time for use with YouTube videos.
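As a rough illustration of the two input modes, the sketch below assembles an input payload from either a local audio file or a YouTube URL, plus an optional time window. Only `audio_input` is named on this page; the keys `youtube_url`, `start_time` and `end_time`, and the helper `build_input` itself, are hypothetical and should be checked against the model's actual input schema. The sketch also assumes that 0, the listed default, means "no finish time given", and it chooses to require exactly one audio source.

```python
from typing import Any, Dict, Optional


def build_input(
    audio_path: Optional[str] = None,
    youtube_url: Optional[str] = None,
    start_time: float = 0.0,
    end_time: float = 0.0,
) -> Dict[str, Any]:
    """Assemble an input payload from exactly one audio source.

    Key names other than "audio_input" are assumptions, not the
    model's confirmed schema.
    """
    # This sketch insists on exactly one source: a file or a YouTube URL.
    if (audio_path is None) == (youtube_url is None):
        raise ValueError("provide exactly one of audio_path or youtube_url")
    if start_time < 0 or end_time < 0:
        raise ValueError("times must be non-negative")
    # Assumption: end_time == 0 (the listed default) means "not set".
    if end_time and end_time <= start_time:
        raise ValueError("end_time must be greater than start_time")

    payload: Dict[str, Any] = {"start_time": start_time, "end_time": end_time}
    if youtube_url is not None:
        payload["youtube_url"] = youtube_url  # hypothetical key name
    else:
        # A real client call would likely need an open file handle here
        # rather than a path string.
        payload["audio_input"] = audio_path
    return payload


if __name__ == "__main__":
    # Placeholder URL, for illustration only.
    print(build_input(youtube_url="https://www.youtube.com/watch?v=example",
                      start_time=30, end_time=90))
```

A dictionary like this could then be handed to whichever client is used to call the model, once the key names have been confirmed.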

The model extracts the saxophone audio, transcribes it to MIDI, and then converts the MIDI to sheet music. It returns a MusicXML file, which you can import into any sheet music program, and a JSON file containing syncpoints for use with Soundslice.
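The example output on this page is a list of three URLs ending in `.mid`, `.xml` and `.json`. A minimal sketch for sorting such a list by file type follows; the helper name `sort_outputs` and the labels it assigns are illustrative, and the mapping assumes the extensions always match those in the example output.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable
from urllib.parse import urlparse

# Assumed mapping from file extension to output type, based on the
# example output shown on this page.
KINDS = {".mid": "midi", ".xml": "musicxml", ".json": "syncpoints"}


def sort_outputs(urls: Iterable[str]) -> Dict[str, str]:
    """Map each recognised output type to its URL."""
    found: Dict[str, str] = {}
    for url in urls:
        # Look only at the path component so query strings are ignored.
        suffix = PurePosixPath(urlparse(url).path).suffix.lower()
        kind = KINDS.get(suffix)
        if kind is not None:
            found[kind] = url
    return found


if __name__ == "__main__":
    # Placeholder URLs, for illustration only.
    outputs = [
        "https://example.com/files/solo.mid",
        "https://example.com/files/solo.xml",
        "https://example.com/files/solo.json",
    ]
    for kind, url in sort_outputs(outputs).items():
        print(kind, url)
```

URLs with an unrecognised extension are skipped rather than raising an error, so the helper keeps working if further output files are added.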

More details to follow!