Generate music
These models generate and modify music from text prompts and raw audio. They combine large language models and diffusion models trained on text-music pairs to understand musical concepts.
Key capabilities:
- Music generation: Create original music compositions and continuations based on text prompts. Generate realistic music matching a description.
- Audio super-resolution: Increase sample rates and add high frequency detail to improve the fidelity of generated or existing audio.
- Controllable generation: Specify parameters like chords, instruments, tempo, and style to control the generated music.
Our Pick: Riffusion
For most users, we recommend Riffusion as the best general-purpose music generation model. It generates high-quality music from text prompts quickly, usually in around 10 seconds.
Riffusion uses a latent diffusion model to generate a mel spectrogram (an audio representation) which is then converted into realistic audio. This allows it to create music matching a description extremely quickly.
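To make the "mel" part of a mel spectrogram concrete, here is a minimal sketch of the mel scale, which spaces frequency bands the way pitch is perceived: narrow bands at low frequencies, wide bands at high ones. It uses the common HTK-style formula. The frequency range and band count below are illustrative assumptions, not Riffusion's actual settings.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min: float, f_max: float, n_bands: int) -> List[float]:
    """Return n_bands + 1 band edges in Hz, evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / n_bands
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 1)]


if __name__ == "__main__":
    # Illustrative values only: 8 bands covering 0 Hz to 8 kHz.
    for edge in mel_band_edges(0.0, 8000.0, 8):
        print(round(edge, 1))
```

Each row of a mel spectrogram corresponds to one such band, so the image a diffusion model produces spends more of its vertical resolution on the low and mid frequencies where most musical detail sits.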
To get the most out of generated audio, we also recommend running it through an audio super-resolution model like nateraw/audio-super-resolution. This will increase the sample rate and improve the overall fidelity in about 45 seconds.
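The two-step workflow above could be chained roughly as follows. This is a sketch under assumptions, not a verified integration: it uses the `replicate` Python client's `replicate.run`, and the input names `prompt_a` and `input_file`, the `audio` output key, and the `<version>` placeholders are all assumptions to check against each model's page before running.

```python
from typing import Any, Callable, Dict

# Placeholders: copy the real version identifiers from each model's page.
RIFFUSION = "riffusion/riffusion:<version>"
SUPER_RES = "nateraw/audio-super-resolution:<version>"

Runner = Callable[[str, Dict[str, Any]], Any]


def replicate_runner(model: str, inputs: Dict[str, Any]) -> Any:
    """Run a model through the Replicate client (needs REPLICATE_API_TOKEN)."""
    import replicate  # imported lazily so the sketch loads without the package

    return replicate.run(model, input=inputs)


def generate_and_upscale(prompt: str, run: Runner = replicate_runner) -> Any:
    """Generate a clip from text, then pass it to the super-resolution model."""
    # Step 1: text prompt to audio. "prompt_a" is an assumed input name.
    generated = run(RIFFUSION, {"prompt_a": prompt})
    # Assumed output shape: a dict with an "audio" entry; otherwise use as-is.
    audio = generated["audio"] if isinstance(generated, dict) else generated
    # Step 2: raise the sample rate. "input_file" is an assumed input name.
    return run(SUPER_RES, {"input_file": audio})
```

Passing the runner in as an argument keeps the chaining logic separate from the network call, so the flow can be checked locally with a stand-in function before any paid predictions are made.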
Also Great: MusicGen
If you want more control over your generations, the MusicGen family of models is a great option. In particular, we recommend:
- musicgen-remixer for remixing an existing song into a new style
- musicgen-chord for specifying exact chords and tempo
- musicgen-stereo-chord for stereo output with chords and tempo control
These models give you much more fine-grained control, at the cost of longer generation times (3-5 minutes). They’re great for musicians and composers who want to dial in specific parameters.
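As a rough illustration of what dialing in chords and tempo involves, the sketch below turns a chord progression and a tempo into a request payload and works out how long the clip needs to be. The field names (`prompt`, `text_chords`, `bpm`, `time_sig`, `duration`), the one-chord-per-bar layout, and the chord notation are assumptions; check the musicgen-chord model page for the exact inputs it accepts.

```python
import math
from typing import Any, Dict, List


def clip_seconds(bars: int, bpm: float, beats_per_bar: int = 4) -> float:
    """Length in seconds of `bars` bars played at `bpm` beats per minute."""
    return bars * beats_per_bar * 60.0 / bpm


def chord_request(
    prompt: str, chords: List[str], bpm: float, beats_per_bar: int = 4
) -> Dict[str, Any]:
    """Build a payload for a chord-conditioned model (field names assumed)."""
    if not chords:
        raise ValueError("at least one chord is required")
    return {
        "prompt": prompt,
        # Assumption: one chord per bar, separated by spaces.
        "text_chords": " ".join(chords),
        "bpm": bpm,
        "time_sig": f"{beats_per_bar}/4",
        # Round up so the final bar is not cut short.
        "duration": math.ceil(clip_seconds(len(chords), bpm, beats_per_bar)),
    }


if __name__ == "__main__":
    print(chord_request("warm jazz piano trio", ["C", "G", "A:min", "F"], bpm=120))
```

Deriving the duration from the bar count and tempo, rather than guessing a number of seconds, keeps the requested length aligned with the end of the progression.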
Other Alternatives
A few other models enable interesting niche capabilities:
- EMOPIA generates music conditioned on a desired emotion
- Mustango provides extra control tags for audio quality, duration, etc.
- Cantable Diffuguesion generates and harmonizes Bach chorales
Recommended models
![](https://tjzk.replicate.delivery/models_models_featured_image/a921a8b3-3e9e-48ef-995c-29143ea11bec/musicgen.jpeg)
meta/musicgen
Generate music from a prompt or melody
![](https://tjzk.replicate.delivery/models_models_featured_image/4154e53a-5c5d-4ac5-9da8-62a1fec212bf/riffusion.gif)
riffusion/riffusion
Stable diffusion for real-time music generation
![](https://tjzk.replicate.delivery/models_models_featured_image/2badd629-5ca3-4976-b7cc-7853e1153b9b/Screen_Shot_2022-07-05_at_18.4.png)
allenhung1025/looptest
Four-bar drum loop generation
![](https://tjzk.replicate.delivery/models_models_cover_image/6bc4e480-6695-451b-940d-48a3b83a1356/replicate-prediction-jvi5xvlbg4v4.png)
nateraw/audio-super-resolution
AudioSR: Versatile Audio Super-resolution at Scale
![](https://tjzk.replicate.delivery/models_models_cover_image/17f584de-98ae-489c-aea8-fdf366858ad6/640px-Spectrogram-19thC.png)
haoheliu/audio-ldm
Text-to-audio generation with latent diffusion models
![](https://replicate.delivery/mgxm/fd0dd91d-939c-447d-b252-83d920918c22/score.png)
andreasjansson/music-inpainting-bert
Music inpainting of melody and chords
![](https://tjzk.replicate.delivery/models_models_cover_image/d874501f-d4ed-4853-b341-be6490f12d4b/274296048-f9a69013-1b07-43be-9e8b.png)
sakemin/musicgen-remixer
Remix music into another style with MusicGen Chord
![](https://tjzk.replicate.delivery/models_models_featured_image/d3535f89-20c2-4b5f-b70b-f898e7429340/cantable-diffuguesion.jpeg)
andreasjansson/cantable-diffuguesion
Bach chorale generation and harmonization
![](https://tjzk.replicate.delivery/models_models_featured_image/a618f88b-df50-441c-974c-fab12e75bf69/Transparent_HarmonaiLogo-021.png)
harmonai/dance-diffusion
Tools to train a generative model on arbitrary audio samples
![](https://tjzk.replicate.delivery/models_models_cover_image/3fbae999-84cd-4347-bcc2-6456e715d22c/musicgen-choral.webp)
fofr/musicgen-choral
MusicGen fine-tuned on chamber choir music
![](https://tjzk.replicate.delivery/models_models_cover_image/6ebeaf90-c403-4ee0-bebb-2fd6f7f9ac26/mustango.jpg)
declare-lab/mustango
Controllable Text-to-Music Generation
annahung31/emopia
Emotion-conditioned music generation using a transformer-based model
![](https://tjzk.replicate.delivery/models_models_cover_image/70ec790a-c702-4026-9596-4333ed27e777/musicgen-stereo-chord.png)
sakemin/musicgen-stereo-chord
Generate music in stereo, restricted to chord sequences and tempo
![](https://tjzk.replicate.delivery/models_models_cover_image/ceeee1b9-502d-4bd2-bd68-e824dfcd1a89/274288174-3bf1f386-d2fe-4a8d-8927.png)
sakemin/musicgen-chord
Generate music restricted to chord sequences and tempo
![](https://tjzk.replicate.delivery/models_models_featured_image/98ee001d-80f8-4562-956f-d8b07dd0a671/songstarter.jpeg)
nateraw/musicgen-songstarter-v0.2
A large, stereo MusicGen that acts as a useful tool for music producers
![](https://tjzk.replicate.delivery/models_models_cover_image/b958c60f-aa4a-44ac-b251-34317cd96245/magnet.jpeg)
lucataco/magnet
MAGNeT: Masked Audio Generation using a Single Non-Autoregressive Transformer
![](https://tjzk.replicate.delivery/models_models_cover_image/0e1a363d-b95a-467a-8d85-ad4a0716d3f5/epic.webp)
fofr/musicgen-epic
MusicGen fine-tuned on an epic orchestral style