xavriley/sax_transcription | Run with an API on Replicate

xavriley / sax_transcription

Transcribe saxophone solos directly from audio

Public
202 runs
T4

Input

audio_input

file

Saxophone audio to transcribe

beats_per_bar

integer

numerator for time signature, default: 4

Default: 0

model_path

string

Optional URL to specify different model weights

Default: "./model.pth"

syncpoints_path

file

Optional path to syncpoints file

midi_path

file

Optional path to MIDI file - skips audio and runs score layout only

start_time

number

Start time for audio

Default: 0

finish_time

number

Finish time for audio

Default: 0

skip_separation

boolean

Skip separation step

Default: false

file_label

string

Optional label for output filename

yt_url

string

https://www.youtube.com/watch?v=GKGpyi-R-SM

Optional YouTube URL to fetch audio from - replaces audio_input

device

string

Device to run inference on

Default: "cuda"
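
The input fields above interact: yt_url replaces audio_input, and midi_path skips the audio stage altogether. As a rough sketch of how a caller might assemble a payload before sending it, the hypothetical helper below (build_input is not part of the model or of Replicate's client) checks that at least one source is given. It assumes that a finish_time of 0 means "not set"; the page does not confirm this.

```python
from typing import Any, Dict, Optional


def build_input(
    audio_input: Optional[Any] = None,
    yt_url: Optional[str] = None,
    midi_path: Optional[Any] = None,
    start_time: float = 0,
    finish_time: float = 0,
    **extra: Any,
) -> Dict[str, Any]:
    """Assemble an input payload using the field names listed above."""
    sources = {"audio_input": audio_input, "yt_url": yt_url, "midi_path": midi_path}
    chosen = {name: value for name, value in sources.items() if value is not None}
    if not chosen:
        raise ValueError("provide audio_input, yt_url, or midi_path")
    # Assumption: finish_time == 0 means "not set", so only non-zero values are checked.
    if finish_time and finish_time <= start_time:
        raise ValueError("finish_time must be greater than start_time")
    payload = dict(chosen)
    payload.update(start_time=start_time, finish_time=finish_time)
    payload.update(extra)  # e.g. file_label, beats_per_bar, skip_separation
    return payload


example = build_input(
    yt_url="https://www.youtube.com/watch?v=GKGpyi-R-SM",
    start_time=120,
    finish_time=250,
    file_label="joel_frahm_caravan",
)
print(example)
```

The resulting dictionary can be passed as the input argument of a client call.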

Run this model in Node.js with one line of code:

npx create-replicate --model=xavriley/sax_transcription

or set up a project from scratch

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run xavriley/sax_transcription using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "xavriley/sax_transcription:a0d3bd7b758bfa6ccab80fe915f3c67354f4df9a38d08203a93c70fa245fb73d",
  {
    input: {
      device: "cuda",
      yt_url: "https://www.youtube.com/watch?v=GKGpyi-R-SM",
      file_label: "joel_frahm_caravan",
      model_path: "./model.pth",
      start_time: 120,
      finish_time: 250,
      beats_per_bar: 4,
      skip_separation: false
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run xavriley/sax_transcription using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "xavriley/sax_transcription:a0d3bd7b758bfa6ccab80fe915f3c67354f4df9a38d08203a93c70fa245fb73d",
    input={
        "device": "cuda",
        "yt_url": "https://www.youtube.com/watch?v=GKGpyi-R-SM",
        "file_label": "joel_frahm_caravan",
        "model_path": "./model.pth",
        "start_time": 120,
        "finish_time": 250,
        "beats_per_bar": 4,
        "skip_separation": False
    }
)

print(output)

To learn more, take a look at the guide on getting started with Python.
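
The example above fetches audio from YouTube. To transcribe a local recording instead, the audio_input field can be used; Replicate's Python client accepts an open file handle as an input value. The sketch below is a minimal illustration: the helper names, the file path and the label are hypothetical, and the network call only runs when transcribe_local is invoked.

```python
from typing import Any, Dict

MODEL = (
    "xavriley/sax_transcription:"
    "a0d3bd7b758bfa6ccab80fe915f3c67354f4df9a38d08203a93c70fa245fb73d"
)


def local_file_input(handle: Any, file_label: str) -> Dict[str, Any]:
    """Build an input dict that sends an open file instead of a YouTube URL."""
    return {"audio_input": handle, "file_label": file_label}


def transcribe_local(path: str, file_label: str) -> Any:
    """Run the model on a local file; needs REPLICATE_API_TOKEN and network access."""
    import replicate  # imported lazily so the helper above works without the client

    with open(path, "rb") as handle:
        return replicate.run(MODEL, input=local_file_input(handle, file_label))
```

A call such as transcribe_local("solo.wav", "my_solo") would then return the model's output in the same way as the yt_url example.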

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run xavriley/sax_transcription using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "xavriley/sax_transcription:a0d3bd7b758bfa6ccab80fe915f3c67354f4df9a38d08203a93c70fa245fb73d",
    "input": {
      "device": "cuda",
      "yt_url": "https://www.youtube.com/watch?v=GKGpyi-R-SM",
      "file_label": "joel_frahm_caravan",
      "model_path": "./model.pth",
      "start_time": 120,
      "finish_time": 250,
      "beats_per_bar": 4,
      "skip_separation": false
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
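
A run of this model can take several minutes, so a request may return while the prediction is still in progress. One way to handle that is to poll the prediction's urls.get address until the status is final. The sketch below uses only the Python standard library; the function names and the polling interval are illustrative choices, and the terminal status names other than "succeeded" are taken from Replicate's API documentation rather than from this page.

```python
import json
import time
import urllib.request

# Statuses after which a prediction no longer changes.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def is_finished(prediction: dict) -> bool:
    """Return True once the prediction has reached a final status."""
    return prediction.get("status") in TERMINAL_STATUSES


def fetch_prediction(get_url: str, token: str) -> dict:
    """GET the prediction object from its urls.get address."""
    request = urllib.request.Request(
        get_url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_prediction(get_url: str, token: str, interval: float = 5.0) -> dict:
    """Poll until the prediction finishes, sleeping between requests."""
    while True:
        prediction = fetch_prediction(get_url, token)
        if is_finished(prediction):
            return prediction
        time.sleep(interval)
```

Here get_url is the urls.get value from the POST response, and token is the value of REPLICATE_API_TOKEN.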

Output

[ "https://storage.googleapis.com/replicate-files/18VVXroRWJIOD9IyHfeYNSrfj63iYC7lEEGnbU9AUw9BVr8kA/tmp_audio__joel_frahm_caravan.mid", "https://storage.googleapis.com/replicate-files/flYM9epm2BgGXkxrwqG8el8RIZRg9PwVmnOiEoqjVVEBVr8kA/tmp_audio__joel_frahm_caravan.xml", "https://storage.googleapis.com/replicate-files/QU6yfnfOxjk0pEyAAd4FQffYPD9sW2ALe9goQevogERLoalnE/tmp_audio__joel_frahm_caravan.json" ]
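
The output is a list of three URLs: a MIDI file, an XML score and a JSON file. A small sketch of how a caller might index them by file extension follows; it assumes the output arrives as plain URL strings, as shown above, and by_extension is a hypothetical helper name.

```python
import os
from urllib.parse import urlparse


def by_extension(urls):
    """Map each file extension (without the dot) to its URL."""
    result = {}
    for url in urls:
        name = os.path.basename(urlparse(url).path)
        extension = os.path.splitext(name)[1].lstrip(".")
        result[extension] = url
    return result


output = [
    "https://storage.googleapis.com/replicate-files/18VVXroRWJIOD9IyHfeYNSrfj63iYC7lEEGnbU9AUw9BVr8kA/tmp_audio__joel_frahm_caravan.mid",
    "https://storage.googleapis.com/replicate-files/flYM9epm2BgGXkxrwqG8el8RIZRg9PwVmnOiEoqjVVEBVr8kA/tmp_audio__joel_frahm_caravan.xml",
    "https://storage.googleapis.com/replicate-files/QU6yfnfOxjk0pEyAAd4FQffYPD9sW2ALe9goQevogERLoalnE/tmp_audio__joel_frahm_caravan.json",
]

files = by_extension(output)
print(sorted(files))  # ['json', 'mid', 'xml']
```

Each URL in files can then be downloaded with any HTTP client, for example urllib.request.urlretrieve.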

{
  "completed_at": "2024-03-09T00:09:04.685712Z",
  "created_at": "2024-03-09T00:03:20.120096Z",
  "data_removed": false,
  "error": null,
  "id": "nfp3mtzbbmwc6ogd4cknezw2e4",
  "input": {
    "device": "cuda",
    "yt_url": "https://www.youtube.com/watch?v=GKGpyi-R-SM",
    "file_label": "joel_frahm_caravan",
    "model_path": "./model.pth",
    "start_time": 120,
    "finish_time": 250,
    "beats_per_bar": 4,
    "skip_separation": false
  },
  "logs": "[youtube] Extracting URL: https://www.youtube.com/watch?v=GKGpyi-R-SM\n[youtube] GKGpyi-R-SM: Downloading webpage\n[youtube] GKGpyi-R-SM: Downloading ios player API JSON\n[youtube] GKGpyi-R-SM: Downloading android player API JSON\nWARNING: [youtube] YouTube said: ERROR - Precondition check failed.\nWARNING: [youtube] HTTP Error 400: Bad Request. Retrying (1/3)...\n[youtube] GKGpyi-R-SM: Downloading android player API JSON\n[youtube] GKGpyi-R-SM: Downloading m3u8 information\n[info] GKGpyi-R-SM: Downloading 1 format(s): 251\n[info] GKGpyi-R-SM: Downloading 1 time ranges: 120.0-250.0\n[download] Destination: tmp_audio\nInput #0, matroska,webm, from 'https://rr5---sn-qxo7rn7k.googlevideo.com/videoplayback?expire=1709964400&ei=EKjrZeu7DMSglu8PruSNyAM&ip=34.170.22.180&id=o-ADSBbuzV-F5UJ3jH11RnkjrbOqzwDfAwB4AJnaOA7Mdg&itag=251&source=youtube&requiressl=yes&xpc=EgVo2aDSNQ%3D%3D&mh=8A&mm=31%2C26&mn=sn-qxo7rn7k%2Csn-a5msenl7&ms=au%2Conr&mv=m&mvi=5&pl=17&initcwndbps=590000&spc=UWF9f6ZozetGKTFpB9BAqcHEGViLWwIY1y9hMYpttQSP-z8&vprv=1&svpuc=1&mime=audio%2Fwebm&gir=yes&clen=6329414&dur=472.721&lmt=1496516991838821&mt=1709942475&fvip=1&keepalive=yes&fexp=24007246&c=ANDROID&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cxpc%2Cspc%2Cvprv%2Csvpuc%2Cmime%2Cgir%2Cclen%2Cdur%2Clmt&sig=AJfQdSswRgIhANPFgsl5nBUzhC-PW8ZO5GZnzsK9l7L40VjFp9N7PCYyAiEAoHgVLGAvAT2smIYWAKcRrSDXvo4CvqXjFX3FdHMqfFU%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=APTiJQcwRgIhALhNObAJeK5TJNP0GeAX6yUr9yxg8kKn_nQycldxfN9RAiEApl3529WJmfeiMtCLDG0dP2hSUc9YT1brGV73cfY45J0%3D':\nMetadata:\nencoder         : google\nDuration: 00:07:52.72, start: -0.007000, bitrate: 107 kb/s\nStream #0:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)\nStream mapping:\nStream #0:0 -> #0:0 (opus (native) -> opus (libopus))\nPress [q] to stop, [?] for help\n[libopus @ 0x5c15cb9f7340] No bit rate set. 
Defaulting to 96000 bps.\nOutput #0, webm, to 'file:tmp_audio.part':\nMetadata:\nencoder         : Lavf58.76.100\nStream #0:0(eng): Audio: opus, 48000 Hz, stereo, flt, 96 kb/s (default)\nMetadata:\nencoder         : Lavc58.134.100 libopus\nsize=       1kB time=00:00:00.00 bitrate=N/A speed=   0x\nsize=     256kB time=00:00:26.51 bitrate=  79.1kbits/s speed=  53x\nsize=     512kB time=00:00:56.59 bitrate=  74.1kbits/s speed=56.6x\nsize=     768kB time=00:01:19.83 bitrate=  78.8kbits/s speed=53.2x\nsize=    1024kB time=00:01:42.19 bitrate=  82.1kbits/s speed=51.1x\nsize=    1024kB time=00:02:04.81 bitrate=  67.2kbits/s speed=49.9x\nsize=    1341kB time=00:02:09.99 bitrate=  84.5kbits/s speed=49.5x\nvideo:0kB audio:1295kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 3.536398%\n[download] 100% of    1.31MiB in 00:00:03 at 426.02KiB/s\n[ExtractAudio] Destination: tmp_audio.wav\nDeleting original file tmp_audio (pass -k to keep)\nCommand output: Separated tracks will be stored in /src/0d7ed4d7\nSeparating track /src/tmp_audio.wav\nNone\nLoaded audio in 7.839999999850988 seconds\nCheckpoint path: /src/filosax_25k.pth\nUsing cuda for inference.\nGPU number: 1\nSegment 0 / 4\nSegment 8 / 4\nSegment 16 / 4\nSegment 24 / 4\nSegment 32 / 4\nWrite out to tmp_audio__joel_frahm_caravan.mid\nTranscribed audio in 11.559999999590218 seconds\nTempo estimated from MIDI as 213.86374937841717 bpm\nTempo calculated as 240 bpm\n/src/monoparse -v 5 -a /src/qparse/44-red-charlieparker-omnibook.wta -barbeat 4 -beats /src/tmp_audio__joel_frahm_caravan.txt -start 0 -end 129.93 -config /src/qparse/params.ini -mono -ts 4/4 -max -clef G2 -tempo 240 -m /src/tmp_audio__joel_frahm_caravan.mid -o /src/tmp_audio__joel_frahm_caravan.mei\nCommand output: option help verbosity=5\nOption help : /src/qparse/44-red-charlieparker-omnibook.wta\noptions: setOptionArgs: 18\noptions: setOptionArgs barbeat : 1\nOption beats : /src/tmp_audio__joel_frahm_caravan.txt\noptions: setOptionArgs: 
13\noptions: setOptionArgs start : 0\noptions: setOptionArgs: 14\noptions: setOptionArgs end : 129.93\nOption config : /src/qparse/params.ini\noptions: setOptionArgs: 29\noptions: setOptionArgs: 19\noptions: setOptionArgs ts : 4/4\noptions: setOptionArgs: 32\noptions: setOptionArgs: 33\noptions: setOptionArgs clef : G2\noptions: setOptionArgs: 17\noptions: setOptionArgs tempo : 240\noption input: MIDI file import\noption help : /src/tmp_audio__joel_frahm_caravan.mid\noption output : /src/tmp_audio__joel_frahm_caravan.meioption output: MEI file export\noptions: no schema file type\n[ info] verbosity level = 5\n[ info] input file: /src/tmp_audio__joel_frahm_caravan.mid\n[ info] schema file: /src/qparse/44-red-charlieparker-omnibook.wta (??? weight model option)\n[ info] beat annotation file: /src/tmp_audio__joel_frahm_caravan.txt\n[ info] output file: /src/tmp_audio__joel_frahm_caravan.mei\n[ info] config file: /src/qparse/params.ini\nloading configuration parameters from ini file /src/qparse/params.ini\nreading config from /src/qparse/params.ini\n[debug] parsing::OPT_RUN_DUR = true\n[debug] parsing::OPT_RUN_UNIT = true\n[debug] parsing::OPT_RUN_STRICT = true\n[debug] Weight::CST_ALPHA = 0.5\n[debug] Weight::CST_SIGMA2 = 0.5\n[debug] Weight::COEF_OFFSET_DIST_LEFT = 1\n[debug] Weight::COEF_OFFSET_DIST_RIGHT = 0.1\n[ info] parser multibar for drum with Key_SImono\n[ info] compute best tree sequence for /src/qparse/44-red-charlieparker-omnibook.wta and input in /src/tmp_audio__joel_frahm_caravan.mid\n[warning] -mono option implicit with model monoLR (ignored).\n[warning] null voicing in new ScoringEnv\n[ info] Debug mode ON\n[ info] Staccato mode ON\n[ info] Enumeration ordering: best to worst\n[ info] Import input from /src/tmp_audio__joel_frahm_caravan.mid\n[ info] read input segment from MIDI file /src/tmp_audio__joel_frahm_caravan.mid, mode MONO\n[ info] Input Env: read input segment from MIDI file /src/tmp_audio__joel_frahm_caravan.mid track 1, mode MONO\n[ info] 
MIDIfile: 1 tracks, hasJoinedTracks=1\n[ info] MIDIfile: ticks per Quarter Note=960\n[ info] MIDIfile: total time = 129.83sec = 499201ticks = 520.001qn\n[ info] reset input segment start to 0, end to 129.93\n[ info] segment : rbegin=0, rend=129.93, 1206 events  (1206 in bounds, 0 to 1206)\n[debug] segment ORIGINAL: rbegin=0, rend=129.93, 1206 events  (1206 in bounds, 0 to 1206)\n[debug] time signature = 4/4\n[ info] read beat tracking annotations from /src/tmp_audio__joel_frahm_caravan.txt\n[ info] shift beattracking content by -0\n[ info] beats : rbegin=0, rend=130.79, 131 ticks, pickup=0, 1 bpb, 1 tpb\n[ info] 131 beat tracking annotation loaded\n[ info] initialize tempo with beattracking info (initial bar dur=0.91)\n[debug] initial bar duration = 1s\n[debug] segment : rbegin=0, rend=129.93, 1206 events  (1206 in bounds, 0 to 1206)\n[debug] beattrack : rbegin=0, rend=130.79, 131 ticks, pickup=0, 1 bpb, 1 tpb\n[ info] import schema from /src/qparse/44-red-charlieparker-omnibook.wta, weight domain: Tropical\n[ info] found weight type Tropical in /src/qparse/44-red-charlieparker-omnibook.wta\n[ info] SWTA import: start importing schema from /src/qparse/44-red-charlieparker-omnibook.wta\n[ info] SWTA import: 30 transitions succesfully parsed from /src/qparse/44-red-charlieparker-omnibook.wta\n[ info] SWTA import: force weight type Tropical\n[ info] SWTAFileIn (after casting and cleaning):\n6 states\n30 transitions\n46 total symbols\n[ info] Time Signature from command line option: 4/4\n[ info] Scoring Env: change time sig to 4/4\n[warning] newRewriteRule: RestDot replaced by RestDotL, RestDotR\n[ info] 1-best computation\n[ info] parsing mono input segment [0-129.93].rdur=129.93\n[ info] input segment is set as open\n[ info] Parse and construct the symbolic score model {}\n[ info] Construct a symbolic score MonoLR model from the 1-best parse tree\n[debug] top run of weight 2559.36=<2424.13, 135.23>\n[debug] TableImporter: read part solo run=B2 tr=B2(0 , -2 ) : 0=<0, 
0> : 2559.36=<2424.13, 135.23> filter= B2 (0/2) sunit:0 scont:0 [ ]_0 1206pts weight=2559.36=<2424.13, 135.23>\n[debug] TableImporter[0]: read measure 0  weight=9.54=<9.54, 0>\n[debug] TableImporter[0]: read measure 1  weight=18.334=<17.834, 0.5>\n[debug] TableImporter[5]: read measure 2  weight=10.0622=<9.945, 0.117168>\n[debug] TableImporter[7]: read measure 3  weight=25.1896=<24.68, 0.509627>\n[debug] TableImporter[17]: read measure 4  weight=5.80288=<5.41, 0.392885>\n[debug] TableImporter[18]: read measure 5  weight=9.54=<9.54, 0>\n[debug] TableImporter[18]: read measure 6  weight=21.7102=<20.71, 1.00015>\n[debug] TableImporter[29]: read measure 7  weight=22.36=<21.36, 1.00003>\n[debug] TableImporter[37]: read measure 8  weight=9.54=<9.54, 0>\n[debug] TableImporter[37]: read measure 9  weight=17.4664=<16.94, 0.526371>\n[debug] TableImporter[45]: read measure 10  weight=21.23=<19.23, 1.99996>\n[debug] TableImporter[53]: read measure 11  weight=9.945=<9.945, 4.57143e-06>\n[debug] TableImporter[55]: read measure 12  weight=24.0931=<23.34, 0.753108>\n[debug] TableImporter[61]: read measure 13  weight=19.7025=<18.53, 1.17249>\n[debug] TableImporter[69]: read measure 14  weight=23.3766=<21.51, 1.86657>\n[debug] TableImporter[77]: read measure 15  weight=9.945=<9.945, 4.0404e-06>\n[debug] TableImporter[79]: read measure 16  weight=16.4712=<15.46, 1.01119>\n[debug] TableImporter[83]: read measure 17  weight=17.25=<16.25, 1.00004>\n[debug] TableImporter[91]: read measure 18  weight=24.1233=<23.79, 0.333321>\n[debug] TableImporter[99]: read measure 19  weight=9.94501=<9.945, 5.17799e-06>\n[debug] TableImporter[101]: read measure 20  weight=16.996=<16.2, 0.795977>\n[debug] TableImporter[107]: read measure 21  weight=17.01=<15.51, 1.49998>\n[debug] TableImporter[113]: read measure 22  weight=22.718=<21.51, 1.208>\n[debug] TableImporter[121]: read measure 23  weight=16.01=<15.51, 0.500004>\n[debug] TableImporter[127]: read measure 24  weight=16.5778=<16.2, 
0.377788>\n[debug] TableImporter[133]: read measure 25  weight=10.4708=<9.945, 0.525782>\n[debug] TableImporter[135]: read measure 26  weight=20.3534=<19.85, 0.503407>\n[debug] TableImporter[141]: read measure 27  weight=15.51=<15.51, 3.99445e-05>\n[debug] TableImporter[147]: read measure 28  weight=20.3096=<19.85, 0.459578>\n[debug] TableImporter[153]: read measure 29  weight=19.2301=<19.23, 6.04489e-05>\n[debug] TableImporter[161]: read measure 30  weight=9.94501=<9.945, 7.92082e-06>\n[debug] TableImporter[164]: read measure 31  weight=22.448=<21.45, 0.997964>\n[debug] TableImporter[177]: read measure 32  weight=23.1902=<22.19, 1.00017>\n[debug] TableImporter[193]: read measure 33  weight=16.9903=<16.99, 0.000293848>\n[debug] TableImporter[203]: read measure 34  weight=18.1197=<16.94, 1.17971>\n[debug] TableImporter[211]: read measure 35  weight=44.2971=<41.848, 2.44913>\n[debug] TableImporter[231]: read measure 36  weight=28.1654=<25.51, 2.65543>\n[debug] TableImporter[249]: read measure 37  weight=22.1908=<22.19, 0.000775758>\n[debug] TableImporter[265]: read measure 38  weight=29.0112=<28.83, 0.181225>\n[debug] TableImporter[285]: read measure 39  weight=21.7998=<20.26, 1.53984>\n[debug] TableImporter[296]: read measure 40  weight=17.1838=<16.9, 0.28378>\n[debug] TableImporter[299]: read measure 41  weight=21.5935=<20.31, 1.28351>\n[debug] TableImporter[311]: read measure 42  weight=22.8572=<22.19, 0.667152>\n[debug] TableImporter[327]: read measure 43  weight=27.6812=<24.77, 2.91117>\n[debug] TableImporter[343]: read measure 44  weight=16.9907=<16.99, 0.00073266>\n[debug] TableImporter[354]: read measure 45  weight=20.9702=<19.97, 1.00024>\n[debug] TableImporter[363]: read measure 46  weight=15.5102=<15.51, 0.000205122>\n[debug] TableImporter[370]: read measure 47  weight=10.5473=<9.945, 0.602347>\n[debug] TableImporter[372]: read measure 48  weight=13.7269=<13.374, 0.35291>\n[debug] TableImporter[373]: read measure 49  weight=22.8725=<21.45, 
1.42251>\n[debug] TableImporter[387]: read measure 50  weight=21.224=<19.57, 1.65402>\n[debug] TableImporter[397]: read measure 51  weight=10.1043=<9.945, 0.159301>\n[debug] TableImporter[399]: read measure 52  weight=21.6638=<20.71, 0.953817>\n[debug] TableImporter[411]: read measure 53  weight=17.5182=<16.25, 1.26818>\n[debug] TableImporter[419]: read measure 54  weight=19.4961=<19.374, 0.12211>\n[debug] TableImporter[424]: read measure 55  weight=27.4835=<27.05, 0.433542>\n[debug] TableImporter[439]: read measure 56  weight=22.5247=<22.19, 0.334684>\n[debug] TableImporter[455]: read measure 57  weight=22.5249=<22.19, 0.334884>\n[debug] TableImporter[471]: read measure 58  weight=30.1704=<28.09, 2.08037>\n[debug] TableImporter[489]: read measure 59  weight=17.6199=<16.2, 1.41987>\n[debug] TableImporter[495]: read measure 60  weight=16.6952=<16.25, 0.445182>\n[debug] TableImporter[503]: read measure 61  weight=20.8705=<19.97, 0.900467>\n[debug] TableImporter[513]: read measure 62  weight=17.4187=<16.25, 1.16868>\n[debug] TableImporter[521]: read measure 63  weight=17.6205=<17.59, 0.0305107>\n[debug] TableImporter[525]: read measure 64  weight=22.6007=<21.45, 1.15075>\n[debug] TableImporter[539]: read measure 65  weight=26.6997=<25.57, 1.12967>\n[debug] TableImporter[551]: read measure 66  weight=17.9744=<16.99, 0.98441>\n[debug] TableImporter[561]: read measure 67  weight=5.43105=<5.41, 0.021045>\n[debug] TableImporter[562]: read measure 68  weight=23.5074=<22.84, 0.667376>\n[debug] TableImporter[573]: read measure 69  weight=23.8027=<22.07, 1.73274>\n[debug] TableImporter[586]: read measure 70  weight=9.54=<9.54, 0>\n[debug] TableImporter[586]: read measure 71  weight=10.4005=<9.945, 0.455478>\n[debug] TableImporter[587]: read measure 72  weight=29.2866=<25.51, 3.77658>\n[debug] TableImporter[605]: read measure 73  weight=23.5237=<22.19, 1.33367>\n[debug] TableImporter[621]: read measure 74  weight=22.8585=<22.19, 0.6685>\n[debug] TableImporter[637]: read measure 
75  weight=15.4847=<15.46, 0.0247364>\n[debug] TableImporter[642]: read measure 76  weight=28.0346=<26.31, 1.72462>\n[debug] TableImporter[655]: read measure 77  weight=27.0817=<24.77, 2.31168>\n[debug] TableImporter[671]: read measure 78  weight=24.4518=<22.19, 2.26181>\n[debug] TableImporter[687]: read measure 79  weight=12.4567=<11.79, 0.666736>\n[debug] TableImporter[691]: read measure 80  weight=5.41=<5.41, 4.36516e-06>\n[debug] TableImporter[692]: read measure 81  weight=9.54=<9.54, 0>\n[debug] TableImporter[692]: read measure 82  weight=22.3679=<20.96, 1.40789>\n[debug] TableImporter[699]: read measure 83  weight=21.7895=<19.97, 1.81947>\n[debug] TableImporter[710]: read measure 84  weight=22.5268=<19.97, 2.55677>\n[debug] TableImporter[720]: read measure 85  weight=22.4901=<20.71, 1.78014>\n[debug] TableImporter[731]: read measure 86  weight=21.2845=<19.23, 2.05446>\n[debug] TableImporter[739]: read measure 87  weight=20.9191=<19.97, 0.949086>\n[debug] TableImporter[749]: read measure 88  weight=18.6346=<16.25, 2.38461>\n[debug] TableImporter[757]: read measure 89  weight=20.304=<19.97, 0.334009>\n[debug] TableImporter[767]: read measure 90  weight=21.3943=<20.71, 0.684286>\n[debug] TableImporter[779]: read measure 91  weight=17.0672=<16.2, 0.867236>\n[debug] TableImporter[786]: read measure 92  weight=22.5262=<22.19, 0.336172>\n[debug] TableImporter[801]: read measure 93  weight=24.2283=<22.19, 2.0383>\n[debug] TableImporter[817]: read measure 94  weight=17.3932=<16.2, 1.19317>\n[debug] TableImporter[824]: read measure 95  weight=17.3941=<16.87, 0.524135>\n[debug] TableImporter[829]: read measure 96  weight=24.6292=<22.19, 2.43916>\n[debug] TableImporter[845]: read measure 97  weight=23.798=<22.19, 1.60804>\n[debug] TableImporter[861]: read measure 98  weight=28.8893=<25.51, 3.37928>\n[debug] TableImporter[879]: read measure 99  weight=22.4125=<20.59, 1.82255>\n[debug] TableImporter[887]: read measure 100  weight=29.238=<28.83, 0.40804>\n[debug] 
TableImporter[907]: read measure 101  weight=39.0704=<37.046, 2.02441>\n[debug] TableImporter[925]: read measure 102  weight=31.7702=<28.83, 2.94025>\n[debug] TableImporter[945]: read measure 103  weight=24.8557=<22.19, 2.66574>\n[debug] TableImporter[961]: read measure 104  weight=22.4106=<20.59, 1.82059>\n[debug] TableImporter[970]: read measure 105  weight=22.0432=<20.71, 1.33323>\n[debug] TableImporter[981]: read measure 106  weight=22.6357=<20.96, 1.67565>\n[debug] TableImporter[989]: read measure 107  weight=27.1042=<22.19, 4.91418>\n[debug] TableImporter[1005]: read measure 108  weight=17.6082=<16.25, 1.35819>\n[debug] TableImporter[1014]: read measure 109  weight=16.4401=<15.51, 0.93009>\n[debug] TableImporter[1019]: read measure 110  weight=5.42323=<5.41, 0.0132305>\n[debug] TableImporter[1020]: read measure 111  weight=19.9055=<18.574, 1.33152>\n[debug] TableImporter[1027]: read measure 112  weight=22.4521=<21.45, 1.00213>\n[debug] TableImporter[1041]: read measure 113  weight=22.5272=<22.19, 0.337182>\n[debug] TableImporter[1057]: read measure 114  weight=22.6934=<22.19, 0.5034>\n[debug] TableImporter[1073]: read measure 115  weight=23.0253=<22.87, 0.155261>\n[debug] TableImporter[1081]: read measure 116  weight=24.3563=<22.19, 2.1663>\n[debug] TableImporter[1097]: read measure 117  weight=23.1919=<22.19, 1.00193>\n[debug] TableImporter[1113]: read measure 118  weight=24.3574=<22.19, 2.16737>\n[debug] TableImporter[1129]: read measure 119  weight=44.0163=<40.24, 3.77627>\n[debug] TableImporter[1145]: read measure 120  weight=5.50208=<5.41, 0.0920794>\n[debug] TableImporter[1146]: read measure 121  weight=17.0797=<16.25, 0.829682>\n[debug] TableImporter[1153]: read measure 122  weight=6.42534=<5.41, 1.01534>\n[debug] TableImporter[1154]: read measure 123  weight=15.291=<11.79, 3.50095>\n[debug] TableImporter[1158]: read measure 124  weight=30.0141=<29.676, 0.338061>\n[debug] TableImporter[1173]: read measure 125  weight=22.8606=<22.19, 0.670627>\n[debug] 
TableImporter[1189]: read measure 126  weight=21.6561=<20.71, 0.946125>\n[debug] TableImporter[1201]: read measure 127  weight=17.4294=<17.094, 0.335428>\n[ info] time to parse and build score: 2144.76ms\n[ info] Keyfind: estimation = 8 Major, sig=-4\n[ info] Key Estimation part solo in score /src/tmp_audio__joel_frahm_carava: 4b\n[ info] Pitch Spelling part solo in score /src/tmp_audio__joel_frahm_carava\n[debug] pitch-spelling with 4b\n[debug] spell0 -4\n[ info] export to MEI file /src/tmp_audio__joel_frahm_caravan.mei\n[ info] write score to /src/tmp_audio__joel_frahm_caravan.mei\n[ info] delete Parse Table\nqparse ran successfully",
  "metrics": {
    "predict_time": 146.099969,
    "total_time": 344.565616
  },
  "output": [
    "https://storage.googleapis.com/replicate-files/18VVXroRWJIOD9IyHfeYNSrfj63iYC7lEEGnbU9AUw9BVr8kA/tmp_audio__joel_frahm_caravan.mid",
    "https://storage.googleapis.com/replicate-files/flYM9epm2BgGXkxrwqG8el8RIZRg9PwVmnOiEoqjVVEBVr8kA/tmp_audio__joel_frahm_caravan.xml",
    "https://storage.googleapis.com/replicate-files/QU6yfnfOxjk0pEyAAd4FQffYPD9sW2ALe9goQevogERLoalnE/tmp_audio__joel_frahm_caravan.json"
  ],
  "started_at": "2024-03-09T00:06:38.585743Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/nfp3mtzbbmwc6ogd4cknezw2e4",
    "cancel": "https://api.replicate.com/v1/predictions/nfp3mtzbbmwc6ogd4cknezw2e4/cancel"
  },
  "version": "a279cee215f744e0925d942a24dbf873f75b242dd0d3a8bab3f54b888d3cdfe6"
}
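
In this example the metrics can be reproduced from the timestamps: predict_time is completed_at minus started_at, and total_time is completed_at minus created_at. The difference between the two is the time spent before the run started. The sketch below works this out for the values above; it is a check on this one prediction, not a statement of how Replicate defines the metrics in general.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp that ends in "Z"."""
    # Older Python versions reject a trailing "Z", so swap it for an explicit offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


prediction = {
    "created_at": "2024-03-09T00:03:20.120096Z",
    "started_at": "2024-03-09T00:06:38.585743Z",
    "completed_at": "2024-03-09T00:09:04.685712Z",
}

created = parse_ts(prediction["created_at"])
started = parse_ts(prediction["started_at"])
completed = parse_ts(prediction["completed_at"])

queue_seconds = (started - created).total_seconds()
predict_seconds = (completed - started).total_seconds()
total_seconds = (completed - created).total_seconds()

print(f"waiting before start: {queue_seconds:.6f} s")
print(f"predict_time:         {predict_seconds:.6f} s")
print(f"total_time:           {total_seconds:.6f} s")
```

The computed predict_time of about 146.1 seconds corresponds to the 2 minutes 26 seconds reported for this run.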

Generated in 2 minutes 26 seconds


[debug] TableImporter[121]: read measure 23  weight=16.01=<15.51, 0.500004>
[debug] TableImporter[127]: read measure 24  weight=16.5778=<16.2, 0.377788>
[debug] TableImporter[133]: read measure 25  weight=10.4708=<9.945, 0.525782>
[debug] TableImporter[135]: read measure 26  weight=20.3534=<19.85, 0.503407>
[debug] TableImporter[141]: read measure 27  weight=15.51=<15.51, 3.99445e-05>
[debug] TableImporter[147]: read measure 28  weight=20.3096=<19.85, 0.459578>
[debug] TableImporter[153]: read measure 29  weight=19.2301=<19.23, 6.04489e-05>
[debug] TableImporter[161]: read measure 30  weight=9.94501=<9.945, 7.92082e-06>
[debug] TableImporter[164]: read measure 31  weight=22.448=<21.45, 0.997964>
[debug] TableImporter[177]: read measure 32  weight=23.1902=<22.19, 1.00017>
[debug] TableImporter[193]: read measure 33  weight=16.9903=<16.99, 0.000293848>
[debug] TableImporter[203]: read measure 34  weight=18.1197=<16.94, 1.17971>
[debug] TableImporter[211]: read measure 35  weight=44.2971=<41.848, 2.44913>
[debug] TableImporter[231]: read measure 36  weight=28.1654=<25.51, 2.65543>
[debug] TableImporter[249]: read measure 37  weight=22.1908=<22.19, 0.000775758>
[debug] TableImporter[265]: read measure 38  weight=29.0112=<28.83, 0.181225>
[debug] TableImporter[285]: read measure 39  weight=21.7998=<20.26, 1.53984>
[debug] TableImporter[296]: read measure 40  weight=17.1838=<16.9, 0.28378>
[debug] TableImporter[299]: read measure 41  weight=21.5935=<20.31, 1.28351>
[debug] TableImporter[311]: read measure 42  weight=22.8572=<22.19, 0.667152>
[debug] TableImporter[327]: read measure 43  weight=27.6812=<24.77, 2.91117>
[debug] TableImporter[343]: read measure 44  weight=16.9907=<16.99, 0.00073266>
[debug] TableImporter[354]: read measure 45  weight=20.9702=<19.97, 1.00024>
[debug] TableImporter[363]: read measure 46  weight=15.5102=<15.51, 0.000205122>
[debug] TableImporter[370]: read measure 47  weight=10.5473=<9.945, 0.602347>
[debug] TableImporter[372]: read measure 48  weight=13.7269=<13.374, 0.35291>
[debug] TableImporter[373]: read measure 49  weight=22.8725=<21.45, 1.42251>
[debug] TableImporter[387]: read measure 50  weight=21.224=<19.57, 1.65402>
[debug] TableImporter[397]: read measure 51  weight=10.1043=<9.945, 0.159301>
[debug] TableImporter[399]: read measure 52  weight=21.6638=<20.71, 0.953817>
[debug] TableImporter[411]: read measure 53  weight=17.5182=<16.25, 1.26818>
[debug] TableImporter[419]: read measure 54  weight=19.4961=<19.374, 0.12211>
[debug] TableImporter[424]: read measure 55  weight=27.4835=<27.05, 0.433542>
[debug] TableImporter[439]: read measure 56  weight=22.5247=<22.19, 0.334684>
[debug] TableImporter[455]: read measure 57  weight=22.5249=<22.19, 0.334884>
[debug] TableImporter[471]: read measure 58  weight=30.1704=<28.09, 2.08037>
[debug] TableImporter[489]: read measure 59  weight=17.6199=<16.2, 1.41987>
[debug] TableImporter[495]: read measure 60  weight=16.6952=<16.25, 0.445182>
[debug] TableImporter[503]: read measure 61  weight=20.8705=<19.97, 0.900467>
[debug] TableImporter[513]: read measure 62  weight=17.4187=<16.25, 1.16868>
[debug] TableImporter[521]: read measure 63  weight=17.6205=<17.59, 0.0305107>
[debug] TableImporter[525]: read measure 64  weight=22.6007=<21.45, 1.15075>
[debug] TableImporter[539]: read measure 65  weight=26.6997=<25.57, 1.12967>
[debug] TableImporter[551]: read measure 66  weight=17.9744=<16.99, 0.98441>
[debug] TableImporter[561]: read measure 67  weight=5.43105=<5.41, 0.021045>
[debug] TableImporter[562]: read measure 68  weight=23.5074=<22.84, 0.667376>
[debug] TableImporter[573]: read measure 69  weight=23.8027=<22.07, 1.73274>
[debug] TableImporter[586]: read measure 70  weight=9.54=<9.54, 0>
[debug] TableImporter[586]: read measure 71  weight=10.4005=<9.945, 0.455478>
[debug] TableImporter[587]: read measure 72  weight=29.2866=<25.51, 3.77658>
[debug] TableImporter[605]: read measure 73  weight=23.5237=<22.19, 1.33367>
[debug] TableImporter[621]: read measure 74  weight=22.8585=<22.19, 0.6685>
[debug] TableImporter[637]: read measure 75  weight=15.4847=<15.46, 0.0247364>
[debug] TableImporter[642]: read measure 76  weight=28.0346=<26.31, 1.72462>
[debug] TableImporter[655]: read measure 77  weight=27.0817=<24.77, 2.31168>
[debug] TableImporter[671]: read measure 78  weight=24.4518=<22.19, 2.26181>
[debug] TableImporter[687]: read measure 79  weight=12.4567=<11.79, 0.666736>
[debug] TableImporter[691]: read measure 80  weight=5.41=<5.41, 4.36516e-06>
[debug] TableImporter[692]: read measure 81  weight=9.54=<9.54, 0>
[debug] TableImporter[692]: read measure 82  weight=22.3679=<20.96, 1.40789>
[debug] TableImporter[699]: read measure 83  weight=21.7895=<19.97, 1.81947>
[debug] TableImporter[710]: read measure 84  weight=22.5268=<19.97, 2.55677>
[debug] TableImporter[720]: read measure 85  weight=22.4901=<20.71, 1.78014>
[debug] TableImporter[731]: read measure 86  weight=21.2845=<19.23, 2.05446>
[debug] TableImporter[739]: read measure 87  weight=20.9191=<19.97, 0.949086>
[debug] TableImporter[749]: read measure 88  weight=18.6346=<16.25, 2.38461>
[debug] TableImporter[757]: read measure 89  weight=20.304=<19.97, 0.334009>
[debug] TableImporter[767]: read measure 90  weight=21.3943=<20.71, 0.684286>
[debug] TableImporter[779]: read measure 91  weight=17.0672=<16.2, 0.867236>
[debug] TableImporter[786]: read measure 92  weight=22.5262=<22.19, 0.336172>
[debug] TableImporter[801]: read measure 93  weight=24.2283=<22.19, 2.0383>
[debug] TableImporter[817]: read measure 94  weight=17.3932=<16.2, 1.19317>
[debug] TableImporter[824]: read measure 95  weight=17.3941=<16.87, 0.524135>
[debug] TableImporter[829]: read measure 96  weight=24.6292=<22.19, 2.43916>
[debug] TableImporter[845]: read measure 97  weight=23.798=<22.19, 1.60804>
[debug] TableImporter[861]: read measure 98  weight=28.8893=<25.51, 3.37928>
[debug] TableImporter[879]: read measure 99  weight=22.4125=<20.59, 1.82255>
[debug] TableImporter[887]: read measure 100  weight=29.238=<28.83, 0.40804>
[debug] TableImporter[907]: read measure 101  weight=39.0704=<37.046, 2.02441>
[debug] TableImporter[925]: read measure 102  weight=31.7702=<28.83, 2.94025>
[debug] TableImporter[945]: read measure 103  weight=24.8557=<22.19, 2.66574>
[debug] TableImporter[961]: read measure 104  weight=22.4106=<20.59, 1.82059>
[debug] TableImporter[970]: read measure 105  weight=22.0432=<20.71, 1.33323>
[debug] TableImporter[981]: read measure 106  weight=22.6357=<20.96, 1.67565>
[debug] TableImporter[989]: read measure 107  weight=27.1042=<22.19, 4.91418>
[debug] TableImporter[1005]: read measure 108  weight=17.6082=<16.25, 1.35819>
[debug] TableImporter[1014]: read measure 109  weight=16.4401=<15.51, 0.93009>
[debug] TableImporter[1019]: read measure 110  weight=5.42323=<5.41, 0.0132305>
[debug] TableImporter[1020]: read measure 111  weight=19.9055=<18.574, 1.33152>
[debug] TableImporter[1027]: read measure 112  weight=22.4521=<21.45, 1.00213>
[debug] TableImporter[1041]: read measure 113  weight=22.5272=<22.19, 0.337182>
[debug] TableImporter[1057]: read measure 114  weight=22.6934=<22.19, 0.5034>
[debug] TableImporter[1073]: read measure 115  weight=23.0253=<22.87, 0.155261>
[debug] TableImporter[1081]: read measure 116  weight=24.3563=<22.19, 2.1663>
[debug] TableImporter[1097]: read measure 117  weight=23.1919=<22.19, 1.00193>
[debug] TableImporter[1113]: read measure 118  weight=24.3574=<22.19, 2.16737>
[debug] TableImporter[1129]: read measure 119  weight=44.0163=<40.24, 3.77627>
[debug] TableImporter[1145]: read measure 120  weight=5.50208=<5.41, 0.0920794>
[debug] TableImporter[1146]: read measure 121  weight=17.0797=<16.25, 0.829682>
[debug] TableImporter[1153]: read measure 122  weight=6.42534=<5.41, 1.01534>
[debug] TableImporter[1154]: read measure 123  weight=15.291=<11.79, 3.50095>
[debug] TableImporter[1158]: read measure 124  weight=30.0141=<29.676, 0.338061>
[debug] TableImporter[1173]: read measure 125  weight=22.8606=<22.19, 0.670627>
[debug] TableImporter[1189]: read measure 126  weight=21.6561=<20.71, 0.946125>
[debug] TableImporter[1201]: read measure 127  weight=17.4294=<17.094, 0.335428>
[ info] time to parse and build score: 2144.76ms
[ info] Keyfind: estimation = 8 Major, sig=-4
[ info] Key Estimation part solo in score /src/tmp_audio__joel_frahm_carava: 4b
[ info] Pitch Spelling part solo in score /src/tmp_audio__joel_frahm_carava
[debug] pitch-spelling with 4b
[debug] spell0 -4
[ info] export to MEI file /src/tmp_audio__joel_frahm_caravan.mei
[ info] write score to /src/tmp_audio__joel_frahm_caravan.mei
[ info] delete Parse Table
qparse ran successfully
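
The per-measure `[debug]` lines in the log above follow a regular `weight=total=<a, b>` layout, so they can be pulled out for inspection. The following is a minimal sketch, not part of the model itself: `parseMeasureWeights` is a hypothetical helper, the regular expression is inferred only from the lines shown here, and the meaning of the two bracketed components is not documented on this page. In this log the total appears to equal the sum of the two components to the printed precision, which the loop at the end checks.

```javascript
// Pattern inferred from the log lines above (an assumption, not a documented format).
const MEASURE_LINE =
  /read measure (\d+)\s+weight=([\d.eE+-]+)=<([\d.eE+-]+), ([\d.eE+-]+)>/;

// Hypothetical helper: collect one record per "read measure" line.
function parseMeasureWeights(logText) {
  const rows = [];
  for (const line of logText.split("\n")) {
    const m = MEASURE_LINE.exec(line);
    if (m) {
      rows.push({
        measure: Number(m[1]),
        total: Number(m[2]),
        first: Number(m[3]),
        second: Number(m[4]),
      });
    }
  }
  return rows;
}

// Three lines copied from the log above; the last one is not a measure line.
const sample = [
  "[debug] TableImporter[0]: read measure 1  weight=18.334=<17.834, 0.5>",
  "[debug] TableImporter[53]: read measure 11  weight=9.945=<9.945, 4.57143e-06>",
  "[ info] time to parse and build score: 2144.76ms",
].join("\n");

for (const row of parseMeasureWeights(sample)) {
  // Compare the printed total with the sum of its two components.
  const gap = Math.abs(row.total - (row.first + row.second));
  console.log(row.measure, row.total, gap < 1e-3 ? "sum matches" : "sum differs");
}
```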

This output was created using a different version of the model, xavriley/sax_transcription:a279cee2.

Run time and cost

This model runs on Nvidia T4 GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This model accompanies the paper “Reconstructing the Charlie Parker Omnibook using an audio-to-score automatic transcription pipeline” (currently under review).

The model takes either an audio file or a YouTube URL.

You can optionally specify a start time and finish time for use with YouTube videos.
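
As a rough illustration of these input options, the sketch below builds the `input` object for a call through Replicate's Node.js client. The field names `audio_input`, `yt_url`, `start_time` and `finish_time` come from the input list on this page. The helper name `buildInput`, its argument names, and its validation rules (exactly one source; a non-zero finish time must come after the start time) are assumptions made for the example, not behaviour confirmed by the model. The 30 to 60 second window is likewise illustrative.

```javascript
// Hypothetical helper: assembles the `input` object for replicate.run().
// Validation rules here are assumptions, not documented model behaviour.
function buildInput({ audioUrl, ytUrl, startTime = 0, finishTime = 0 } = {}) {
  // Require exactly one audio source (the page notes yt_url replaces audio_input).
  if (Boolean(audioUrl) === Boolean(ytUrl)) {
    throw new Error("Provide exactly one of audioUrl or ytUrl");
  }
  if (startTime < 0 || finishTime < 0) {
    throw new Error("Times must be non-negative");
  }
  // 0 is the page default for finish_time, so it is passed through unchanged.
  if (finishTime !== 0 && finishTime <= startTime) {
    throw new Error("finishTime must be greater than startTime");
  }

  const input = { start_time: startTime, finish_time: finishTime };
  if (ytUrl) {
    input.yt_url = ytUrl;
  } else {
    input.audio_input = audioUrl;
  }
  return input;
}

// Example: transcribe an illustrative 30-second excerpt of a YouTube video.
const input = buildInput({
  ytUrl: "https://www.youtube.com/watch?v=GKGpyi-R-SM",
  startTime: 30,
  finishTime: 60,
});
console.log(input);
```

The resulting object could then be passed as the `input` field of a `replicate.run` call for this model.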

The model extracts the saxophone audio, transcribes it to MIDI, and then converts the MIDI to sheet music. It returns a MusicXML file, which you can import into any sheet music program, and a JSON file containing syncpoints for use with Soundslice.

More details to follow!