Mustango: Toward Controllable Text-to-Music Generation
Meet Mustango, a new addition to the landscape of Multimodal Large Language Models, designed for controlled music generation. Mustango combines a Latent Diffusion Model (LDM), Flan-T5, and musical features to generate music from text prompts.
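To give a rough intuition for how a text encoder, musical features, and an iterative latent refinement loop can fit together, here is a toy sketch in plain Python. It is not Mustango's implementation or API: every function name below is hypothetical, the "encoders" are deterministic random stand-ins rather than Flan-T5, tempo and key are used only as example musical features, and the "denoiser" is a simple interpolation rather than a learned network. A real LDM would also decode the final latent into audio, which is omitted here.

```python
import random


def encode_text(prompt: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for a text encoder (not Flan-T5).

    Maps a prompt to a fixed pseudo-random vector, deterministic per prompt.
    """
    rng = random.Random(prompt)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def encode_music_features(bpm: float, key: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for a musical-feature encoder.

    Tempo and key are assumed example features for this sketch.
    """
    rng = random.Random(f"{bpm}|{key}")
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def denoise_step(latent: list[float], condition: list[float],
                 strength: float = 0.2) -> list[float]:
    """Toy 'denoiser': move each latent value a fraction toward the condition."""
    return [z + strength * (c - z) for z, c in zip(latent, condition)]


def generate_latent(prompt: str, bpm: float, key: str, steps: int = 25,
                    dim: int = 8, seed: int = 0):
    """Start from Gaussian noise and refine it under a joint condition."""
    text_vec = encode_text(prompt, dim)
    music_vec = encode_music_features(bpm, key, dim)
    # Joint condition: a plain average of the two vectors (an assumption).
    condition = [(t + m) / 2.0 for t, m in zip(text_vec, music_vec)]

    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # initial noise
    for _ in range(steps):
        latent = denoise_step(latent, condition)
    return latent, condition
```

Calling `generate_latent("a calm piano piece", bpm=90, key="A minor")` returns the refined latent together with the condition it was pulled toward. Changing the prompt or the musical features changes the condition, which is the sense in which the output is "controlled" in this toy setting.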
Citation
Please consider citing the following article if you find our work useful:
@misc{melechovsky2023mustango,
  title={Mustango: Toward Controllable Text-to-Music Generation},
  author={Jan Melechovsky and Zixun Guo and Deepanway Ghosal and Navonil Majumder and Dorien Herremans and Soujanya Poria},
  year={2023},
  eprint={2311.08355},
  archivePrefix={arXiv},
}