retrocirce/zero_shot_audio_source_separation

Zero-shot sound separation by arbitrary query samples

Performance

This model runs predictions on Nvidia T4 GPU hardware.

80% of predictions complete within 5 minutes. The predict time for this model varies significantly based on the inputs.

Readme

A demo of the official GitHub repository for the paper "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022.
Short introduction video
Full presentation video
Authors' website

This model lets you separate any source from a soundtrack.
For example, if you have a jazz song with a clarinet part in it, you can extract the clarinet by showing the model a clarinet sound sample.

The inputs are the audio mixture to separate and a sample of the target source, which serves as the query.
The output is the source track extracted from the mixture.
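
The input/output contract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input names `mixture` and `query` are hypothetical (the model's actual input schema is not given here), and the call itself is delegated to a caller-supplied `run` function rather than to any specific client library.

```python
from typing import Any, Callable, Dict

# Model reference as shown on this page.
MODEL = "retrocirce/zero_shot_audio_source_separation"


def build_inputs(mixture: str, query: str) -> Dict[str, str]:
    """Pair the mixture to separate with a sample of the target source."""
    for label, value in (("mixture", mixture), ("query", query)):
        if not value:
            raise ValueError(f"{label} must be a non-empty path or URL")
    # Hypothetical input names; check the model's input schema for the real ones.
    return {"mixture": mixture, "query": query}


def separate(run: Callable[[str, Dict[str, str]], Any],
             mixture: str, query: str) -> Any:
    """Send one separation request through a caller-supplied runner.

    `run` receives the model reference and the input dictionary and
    returns whatever the backend produces (the extracted source track).
    """
    return run(MODEL, build_inputs(mixture, query))
```

With the Replicate Python client, the runner could be a thin wrapper such as `lambda model, inputs: replicate.run(model, input=inputs)`. The model reference may need a pinned version suffix, and the input names would need to match the real schema.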

Citing

@inproceedings{zsasp-ke2022,
  author = {Ke Chen* and Xingjian Du* and Bilei Zhu and Zejun Ma and Taylor Berg-Kirkpatrick and Shlomo Dubnov},
  title = {Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data},
  booktitle = {{AAAI} 2022}
}

@inproceedings{htsat-ke2022,
  author = {Ke Chen and Xingjian Du and Bilei Zhu and Zejun Ma and Taylor Berg-Kirkpatrick and Shlomo Dubnov},
  title = {HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection},
  booktitle = {{ICASSP} 2022}
}
