
Generate music

What you can do

Generate full songs with vocals and lyrics from a text prompt.

Create instrumentals across any genre — electronic, orchestral, jazz, lo-fi, rock, and more.

Use a reference track to guide the style of your generated music.

Generate sound effects and ambient audio alongside music.

Models we recommend

For full songs with vocals

MiniMax Music 2.5 is the newest MiniMax music model, with improved vocal quality: natural-sounding singing with realistic timbre, breathing, and pitch transitions. It offers 14+ section tags (Intro, Verse, Chorus, Hook, Drop, Bridge, Solo, and more) for precise structure control, plus style-aware mixing that adjusts mix characteristics based on genre. It supports lyrics up to 3,500 characters.

MiniMax Music 1.5 produces tracks up to 4 minutes long with vocals in English and Chinese. Write lyrics with structure tags like [verse], [chorus], and [bridge], and optionally upload a reference track (5–30 seconds) to guide the style.
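If you assemble lyrics programmatically, a small helper can keep the structure tags consistent. The sketch below is only an illustration: `format_lyrics`, `ALLOWED_TAGS`, and the `max_chars` guard are hypothetical names, the blank line between sections is an assumed layout, and the model may accept more tags than the three named here.

```python
# Hypothetical helper: only the three tags named above are allowed here.
ALLOWED_TAGS = {"verse", "chorus", "bridge"}


def format_lyrics(sections, max_chars=None):
    """Join (tag, text) pairs into one lyrics string with [tag] headers.

    max_chars is an optional, assumed length guard; pass the limit
    documented for the model you are targeting, if it has one.
    """
    parts = []
    for tag, text in sections:
        tag = tag.strip().lower()
        if tag not in ALLOWED_TAGS:
            raise ValueError(f"unsupported structure tag: {tag!r}")
        parts.append(f"[{tag}]\n{text.strip()}")

    lyrics = "\n\n".join(parts)
    if max_chars is not None and len(lyrics) > max_chars:
        raise ValueError(f"lyrics are {len(lyrics)} characters, limit is {max_chars}")
    return lyrics


lyrics = format_lyrics([
    ("verse", "City lights are fading slow"),
    ("chorus", "We run, we run into the night"),
])
print(lyrics)
```

Validating tags before you submit a job catches typos like `[chrous]` early, instead of after a generation has already run.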

ElevenLabs Music generates studio-grade songs up to 5 minutes from a text description. Toggle between vocal and instrumental output with force_instrumental. It handles detailed composition plans — specify genre, mood, tempo, instrumentation, and song structure in your prompt.
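As a rough sketch of what a composition plan could look like in practice, the helper below folds genre, mood, tempo, instrumentation, and structure into one prompt string and sets `force_instrumental`. Only `force_instrumental` comes from the description above; the `prompt` key, the function name, and the prompt wording are assumptions, so check the model's input schema before relying on them.

```python
def build_music_input(genre, mood, tempo_bpm, instruments, structure,
                      instrumental=False):
    """Build a hypothetical input payload for a text-to-music model."""
    prompt = (
        f"{mood} {genre} track at {tempo_bpm} BPM, "
        f"featuring {', '.join(instruments)}. "
        f"Structure: {' > '.join(structure)}."
    )
    return {
        "prompt": prompt,                    # assumed field name
        "force_instrumental": instrumental,  # toggle named in the text above
    }


payload = build_music_input(
    genre="synthwave",
    mood="nostalgic",
    tempo_bpm=100,
    instruments=["analog synths", "gated drums"],
    structure=["intro", "verse", "chorus"],
    instrumental=True,
)
print(payload["prompt"])
```

Keeping the plan as structured fields makes it easy to vary one element at a time, for example the tempo, and compare the results.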

For instrumentals and background music

Stable Audio 2.5 generates high-quality instrumentals and sound effects up to about 3 minutes from text prompts. It also supports audio inpainting and continuation — feed it a clip and it'll extend or fill gaps seamlessly. Open-source weights mean you can self-host it.

ElevenLabs Music with force_instrumental set to true produces clean instrumentals across any genre. Describe what you want in plain language.

Google Lyria 2 produces 30-second clips in 48 kHz stereo, the highest audio fidelity in this collection. It supports negative prompts to exclude unwanted elements. It is best for short loops, jingles, and audio samples where pristine sound quality matters.
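To get a feel for what a 30-second, 48 kHz stereo clip means in storage terms, here is a back-of-the-envelope calculation. It assumes uncompressed 16-bit PCM, which is an assumption for illustration only; the actual output format and bit depth may differ.

```python
def pcm_size_bytes(duration_s, sample_rate_hz, channels, bits_per_sample=16):
    """Size of raw PCM audio in bytes, excluding any container header."""
    frames = duration_s * sample_rate_hz  # one frame = one sample per channel
    return frames * channels * bits_per_sample // 8


clip_bytes = pcm_size_bytes(duration_s=30, sample_rate_hz=48_000, channels=2)
print(f"{clip_bytes:,} bytes (~{clip_bytes / 1_000_000:.2f} MB)")
# prints: 5,760,000 bytes (~5.76 MB)
```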

For style-guided generation

MiniMax Music 1.5 lets you upload a reference audio clip (5–30 seconds) and control how much it influences the output with a style strength parameter (0.0 to 1.0). Great for creating variations on an existing track.
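Before uploading a reference clip, you may want to check that it falls in the 5 to 30 second window and that the style strength is within 0.0 to 1.0. The sketch below uses Python's standard `wave` module, so it only handles WAV files; the function names and the `style_strength` key are hypothetical, not the model's confirmed input schema.

```python
import wave

MIN_REF_SECONDS = 5.0   # limits quoted in the text above
MAX_REF_SECONDS = 30.0


def wav_duration_seconds(source):
    """Duration in seconds of a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def build_style_input(reference_seconds, style_strength):
    """Validate reference settings and return a hypothetical payload fragment."""
    if not MIN_REF_SECONDS <= reference_seconds <= MAX_REF_SECONDS:
        raise ValueError(
            f"reference clip is {reference_seconds:.1f}s, "
            f"expected {MIN_REF_SECONDS:.0f} to {MAX_REF_SECONDS:.0f}s"
        )
    if not 0.0 <= style_strength <= 1.0:
        raise ValueError("style strength must be between 0.0 and 1.0")
    return {"style_strength": float(style_strength)}  # assumed field name
```

A typical use would be `build_style_input(wav_duration_seconds("reference.wav"), 0.7)`, where `reference.wav` is a placeholder path. Lower strengths should let the prompt and lyrics dominate, while higher values lean more on the reference.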

MiniMax Music 01 is the predecessor to Music 1.5. It is faster and simpler, generating up to 60 seconds from a reference track and lyrics.

For open-source and self-hosting

Stable Audio 2.5 is built on Stability AI's open-source stable-audio-tools, and its weights are publicly available.

ACE-Step generates full songs with vocals in about 20 seconds on an A100. It uses a diffusion-based architecture that's 15× faster than autoregressive approaches, and it supports lyrics with structure tags and natural-language style descriptions.

Meta MusicGen generates music from text prompts or melody conditioning. The melody variant lets you hum or play a tune and generate a full arrangement around it.

Try it out

Test different music models in the playground. Compare outputs side by side to find the right sound for your project.

Open the playground →

Questions? Join us on Discord.