pphu/musicgen-small | Readme and Docs

Forked from https://replicate.com/facebookresearch/musicgen, which supported only the musicgen-large and Melody models at the time of forking.

Model Architecture and Development

MusicGen is a single-stage auto-regressive Transformer model trained over a 32 kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz. Unlike existing methods such as MusicLM, MusicGen doesn’t require a self-supervised semantic representation, and it generates all 4 codebooks in one pass. By introducing a small delay between the codebooks, the authors show that the codebooks can be predicted in parallel, so generation needs only about 50 auto-regressive steps per second of audio. MusicGen was trained on 20K hours of licensed music: an internal dataset of 10K high-quality music tracks, plus the ShutterStock and Pond5 music data.
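The codebook delay described above can be illustrated with a small sketch. This is not the Audiocraft implementation: the function names, the `PAD` value, and the choice of a delay of exactly one step per codebook are assumptions made for illustration. Only the frame rate (50 Hz) and the codebook count (4) come from the text.

```python
from typing import Dict, List

PAD = -1            # hypothetical placeholder, not an actual MusicGen token id
FRAME_RATE_HZ = 50  # codebook frames per second (from the text above)
NUM_CODEBOOKS = 4   # codebooks per frame (from the text above)


def apply_delay(codes: List[List[int]], pad: int = PAD) -> List[List[int]]:
    """Shift codebook k right by k steps (assumed: one step of delay per codebook).

    `codes` holds one row per codebook, each with T tokens. Every output row
    has T + K - 1 entries, so one decoding step yields one token per codebook.
    """
    num_codebooks = len(codes)
    return [
        [pad] * k + list(row) + [pad] * (num_codebooks - 1 - k)
        for k, row in enumerate(codes)
    ]


def undo_delay(delayed: List[List[int]]) -> List[List[int]]:
    """Invert apply_delay, recovering the aligned K x T grid of tokens."""
    num_codebooks = len(delayed)
    num_frames = len(delayed[0]) - (num_codebooks - 1)
    return [row[k:k + num_frames] for k, row in enumerate(delayed)]


def decoding_steps(seconds: int) -> Dict[str, int]:
    """Compare step counts for the delayed layout and a fully flattened one."""
    frames = FRAME_RATE_HZ * seconds
    return {
        "delayed": frames + NUM_CODEBOOKS - 1,  # all codebooks predicted per step
        "flattened": frames * NUM_CODEBOOKS,    # one codebook token per step
    }


if __name__ == "__main__":
    # Toy tokens: codebook k, frame t is encoded as 10 * k + t.
    codes = [[10 * k + t for t in range(5)] for k in range(NUM_CODEBOOKS)]
    for row in apply_delay(codes):
        print(row)
    print(decoding_steps(1))
```

Under these assumptions, one second of audio takes 53 decoding steps in the delayed layout (50 frames plus 3 steps of trailing delay) against 200 if the four codebooks were flattened into a single token stream, which is where the figure of roughly 50 steps per second comes from.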

Licenses

All code in this repository is licensed under the Apache License 2.0. The code in the Audiocraft repository is released under the MIT license, as found in its LICENSE file. The weights in the Audiocraft repository are released under the CC-BY-NC 4.0 license, as found in its LICENSE_weights file.

Model created over 1 year ago