jimothyjohn / demixing

Separate instruments and/or vocals from any song.

Run time and cost

This model runs on Nvidia A100 (40GB) GPU hardware. Predictions typically complete within 82 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Usage

Choose any or all of the instruments in a track to isolate and remove from it. The model outputs the isolated instrument track and a merged track containing the rest of the audio.

Expected runtime (after startup): 1 minute.
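
The isolate-and-merge behavior described above can be illustrated with a small sketch. It assumes the separated stems are already available as equally shaped NumPy arrays and that several selected stems are summed into a single isolated track. The function name `isolate_and_merge` and the stem names are hypothetical and are not taken from this model's code.

```python
import numpy as np

def isolate_and_merge(stems, selected):
    """Return (isolated, rest): the sum of the selected stems and the sum of all others.

    `stems` maps a stem name to an audio array of shape (channels, samples).
    This helper is illustrative only; it is not the model's actual implementation.
    """
    selected = set(selected)
    if not selected:
        raise ValueError("select at least one stem")
    unknown = selected - set(stems)
    if unknown:
        raise KeyError(f"unknown stems: {sorted(unknown)}")

    shape = next(iter(stems.values())).shape
    isolated = np.zeros(shape)
    rest = np.zeros(shape)
    for name, audio in stems.items():
        if name in selected:
            isolated += audio   # goes into the isolated track
        else:
            rest += audio       # everything else is merged together
    return isolated, rest

# Toy example with random "audio" standing in for real separated stems.
rng = np.random.default_rng(0)
stems = {name: rng.standard_normal((2, 8))
         for name in ("drums", "bass", "vocals", "other")}
isolated, rest = isolate_and_merge(stems, ["vocals"])
```

Because every stem lands in exactly one of the two outputs, adding `isolated` and `rest` gives back the sum of all stems.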

Algorithm

Demucs is a state-of-the-art music source separation model, currently capable of separating drums, bass, piano, guitar, and vocals from the rest of the accompaniment. Demucs is based on a U-Net convolutional architecture inspired by Wave-U-Net. The v4 release features Hybrid Transformer Demucs, a hybrid spectrogram/waveform separation model using Transformers. It is based on Hybrid Demucs (also provided in this repo), with the innermost layers replaced by a cross-domain Transformer Encoder. This Transformer uses self-attention within each domain, and cross-attention across domains. The model achieves an SDR of 9.00 dB on the MUSDB HQ test set.
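
For readers unfamiliar with SDR (signal-to-distortion ratio), the sketch below computes the basic form of the metric: ten times the base-10 logarithm of the reference signal's energy divided by the energy of the estimation error. This is a simplified illustration. The exact evaluation protocol behind the 9.00 dB figure is not described on this page and may aggregate scores differently. The function name `sdr_db` and the `eps` parameter are assumptions made for the example.

```python
import numpy as np

def sdr_db(reference, estimate, eps=1e-10):
    """Signal-to-distortion ratio in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    `eps` is a small constant (an assumption of this sketch) that avoids
    division by zero for silent references or perfect estimates.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))

# A 440 Hz tone and an estimate that is 10% too quiet.
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
reference = np.sin(2.0 * np.pi * 440.0 * t)
estimate = 0.9 * reference
score = sdr_db(reference, estimate)  # about 20 dB for a 10% amplitude error
```

Higher values mean the estimate is closer to the reference. Each additional 10 dB corresponds to a tenfold reduction in error energy.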

Sample track

Song - Cobie Sample Artist - JBlanked Source - Free Music Archive License - CC BY-NC-ND