minimax/music-2.6

Generate full-length songs or instrumentals from a text prompt, with optional auto-generated lyrics


Music 2.6

Music 2.6 is MiniMax’s latest music generation model. Give it lyrics and a style description and it generates a full-length song with vocals and instrumentation, or give it only a prompt to generate an instrumental track.

What’s new in 2.6

Music 2.6 builds on Music 2.5 with major improvements:

  • BPM and key control — specify a key and BPM in your prompt (e.g. “E minor, 90 BPM”) and the output matches 99%+ of the time
  • Faster streaming — end-to-end chunk latency dropped from over 60 seconds to under 25 seconds
  • Longer songs — up to 6 minutes per generation (most songs land between 2 and 4 minutes)
  • Instrumental mode — generate music without vocals. Set is_instrumental to true and provide a prompt describing the style. No lyrics needed.
  • Automatic lyrics — set lyrics_optimizer to true and the model generates lyrics from your prompt, so you can create a complete song from just a style description.

Plus everything from 2.5:

  • Natural-sounding vocals with realistic timbre, breathing, and pitch transitions
  • Expanded sound library including orchestral and traditional instruments
  • 14+ section tags for precise structure control
  • Style-aware mixing that adapts to genre

Inputs

lyrics — The lyrics for your song, up to 3,500 characters. Required for vocal tracks (unless lyrics_optimizer is enabled). Use structure tags to control the arrangement:

[Intro], [Verse], [Pre Chorus], [Chorus], [Post Chorus], [Hook], [Drop], [Bridge], [Solo], [Build Up], [Inst], [Interlude], [Break], [Transition], [Outro]

Use \n to separate lines and \n\n to add pauses between sections.
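As a rough illustration of the formatting rules above, the sketch below joins tagged sections into one lyrics string, using a single newline between lines and a blank line between sections. The helper name build_lyrics is hypothetical and not part of the model's API; the 3,500-character check mirrors the limit stated for the lyrics input.

```python
MAX_LYRICS_CHARS = 3500  # limit stated in the lyrics input description


def build_lyrics(sections):
    """Join (tag, lines) pairs into a single lyrics string.

    Hypothetical helper: each section starts with its tag on its own line,
    lines are separated by one newline, and sections by a blank line.
    """
    blocks = []
    for tag, lines in sections:
        blocks.append("\n".join([f"[{tag}]", *lines]))
    lyrics = "\n\n".join(blocks)
    if len(lyrics) > MAX_LYRICS_CHARS:
        raise ValueError(
            f"lyrics is {len(lyrics)} characters; the limit is {MAX_LYRICS_CHARS}"
        )
    return lyrics


lyrics = build_lyrics([
    ("Verse", ["Walking through the rain..."]),
    ("Chorus", ["But I still remember you"]),
])
```

The resulting string can be passed as the lyrics input.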

prompt — A description of the music style, mood, and scenario. For example: E minor, 90 BPM, acoustic guitar ballad, male vocal, emotional. Up to 2,000 characters. Required for instrumental mode.

is_instrumental — Set to true to generate instrumental music without vocals. When enabled, prompt is required and lyrics is not needed. Default: false.

lyrics_optimizer — Set to true to auto-generate lyrics from the prompt when lyrics are empty. Default: false.

sample_rate — Audio sample rate. Options: 16000, 24000, 32000, 44100 (default).

bitrate — Audio bitrate. Options: 32000, 64000, 128000, 256000 (default).

audio_format — Output format: mp3 (default), wav, or pcm.
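The constraints listed above can be checked before a request is sent. The following is a minimal sketch of such a check, assuming the inputs are collected in a plain dictionary; validate_input is a hypothetical helper, and the model may apply additional rules that this page does not document.

```python
# Allowed values copied from the input descriptions above.
ALLOWED_SAMPLE_RATES = {16000, 24000, 32000, 44100}
ALLOWED_BITRATES = {32000, 64000, 128000, 256000}
ALLOWED_FORMATS = {"mp3", "wav", "pcm"}


def validate_input(payload):
    """Return a list of problems found in a payload (empty if none)."""
    problems = []
    prompt = payload.get("prompt", "")
    lyrics = payload.get("lyrics", "")
    instrumental = payload.get("is_instrumental", False)
    optimizer = payload.get("lyrics_optimizer", False)

    if len(prompt) > 2000:
        problems.append("prompt exceeds 2,000 characters")
    if len(lyrics) > 3500:
        problems.append("lyrics exceeds 3,500 characters")
    if instrumental and not prompt:
        problems.append("prompt is required when is_instrumental is true")
    if not instrumental and not lyrics and not optimizer:
        problems.append("lyrics is required for vocal tracks unless lyrics_optimizer is true")
    if payload.get("sample_rate", 44100) not in ALLOWED_SAMPLE_RATES:
        problems.append("sample_rate must be 16000, 24000, 32000, or 44100")
    if payload.get("bitrate", 256000) not in ALLOWED_BITRATES:
        problems.append("bitrate must be 32000, 64000, 128000, or 256000")
    if payload.get("audio_format", "mp3") not in ALLOWED_FORMATS:
        problems.append("audio_format must be mp3, wav, or pcm")
    return problems
```

An empty list means the payload is consistent with the documented limits and defaults.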

Quick start examples

Song with lyrics and BPM/key control

{
  "prompt": "E minor, 90 BPM, acoustic guitar ballad, male vocal, emotional",
  "lyrics": "[Verse]\nWalking through the rain...\n[Chorus]\nBut I still remember you"
}

Instrumental track

{
  "prompt": "Cinematic orchestral, epic and dramatic, full symphony",
  "is_instrumental": true
}

Auto-generated lyrics

{
  "prompt": "Upbeat pop, summer vibes, feel-good, catchy melody",
  "lyrics_optimizer": true
}
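Any of the payloads above can be sent from code. The sketch below assumes the Replicate Python client (the third-party replicate package) and an API token in the REPLICATE_API_TOKEN environment variable; neither is described on this page, and the shape of the returned value depends on the client version, so treat this as a starting point, not a reference implementation.

```python
import os

MODEL = "minimax/music-2.6"

# Same inputs as the "Instrumental track" example.
instrumental_request = {
    "prompt": "Cinematic orchestral, epic and dramatic, full symphony",
    "is_instrumental": True,
}


def generate(payload):
    """Send one payload to the model via the Replicate Python client.

    Assumption: the `replicate` package is installed (pip install replicate)
    and REPLICATE_API_TOKEN is set. The import is local so this module can
    be loaded without the package.
    """
    import replicate

    if not os.environ.get("REPLICATE_API_TOKEN"):
        raise RuntimeError("set REPLICATE_API_TOKEN before calling generate()")
    return replicate.run(MODEL, input=payload)
```

Calling generate(instrumental_request) starts a generation and returns the model's output once it finishes.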

Prompt guide

The prompt field steers the overall sound of your track. Think of it as giving creative direction to a producer.

Prompt structure

A good prompt follows this general pattern:

[Key], [BPM], [Genre], [Mood/Emotion], [Vocal description], [Key instruments], [Production style]

You don’t need every element every time — pick the ones that matter for your track. Music 2.6 is especially good at matching exact BPM and key when you specify them.

What to include

| Element | What it does | Examples |
| --- | --- | --- |
| Key | Sets the musical key | E minor, C major, Bb minor, A major |
| BPM | Controls the exact tempo | 75 BPM, 90 BPM, 120 BPM, 140 BPM |
| Genre | Sets the foundational sound | Pop, Indie folk, Jazz, Blues, EDM, Hip-hop, Rock, Classical, Country, R&B |
| Mood / Emotion | Shapes the emotional tone | Melancholic, uplifting, aggressive, dreamy, hopeful, introspective, confident |
| Vocal style | Guides the singer’s delivery | Male vocals, female vocals, breathy, powerful, soulful, clear, operatic, raspy |
| Instruments | Requests specific sounds | Acoustic guitar, piano, synth bass, 808 drums, orchestral strings, brass section |
| Production / Mixing | Shapes the sonic character | Lo-fi, warm reverb, wide soundstage, intimate studio feel, distorted, crisp |
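One way to keep prompts consistent is to assemble them from these elements in the order of the pattern above. The sketch below does that with a hypothetical build_prompt helper; the ordering follows the pattern, but the model does not require it, and the 2,000-character check mirrors the limit stated for the prompt input.

```python
def build_prompt(key=None, bpm=None, genre=None, mood=None,
                 vocals=None, instruments=None, production=None):
    """Join the supplied elements into a comma-separated prompt.

    Hypothetical helper: omitted elements are skipped, and `instruments`
    is an optional list of instrument names.
    """
    parts = [
        key,
        f"{bpm} BPM" if bpm is not None else None,
        genre,
        mood,
        vocals,
        ", ".join(instruments) if instruments else None,
        production,
    ]
    prompt = ", ".join(part for part in parts if part)
    if len(prompt) > 2000:  # limit stated in the prompt input description
        raise ValueError("prompt exceeds 2,000 characters")
    return prompt


ballad = build_prompt(
    key="E minor",
    bpm=90,
    genre="acoustic guitar ballad",
    mood="emotional",
    vocals="male vocal",
)
```

Here ballad contains the same elements as the acoustic ballad prompt used elsewhere on this page, with mood placed before the vocal description to match the pattern.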

Example prompts

Acoustic ballad with BPM/key: E minor, 90 BPM, acoustic guitar ballad, male vocal, emotional

Bright pop: C major, 120 BPM, bright pop, female vocal

Jazz piano: Bb minor, 75 BPM, jazz piano, smooth vocal

Electronic / Dance: Pop-Dance/Progressive House, uplifting, anthemic, 125 BPM, four-on-the-floor kick, synth bass, atmospheric pads

Cinematic instrumental: Cinematic orchestral, epic, sweeping strings, building tension, heroic brass, 90 BPM, wide soundstage

Lo-fi: Lo-fi hip-hop, chill, study vibes, vinyl texture, warm midrange

Structure tags

Structure tags let you design the emotional arc and arrangement of your song. Vocals and instruments evolve across sections: vocal emotion and technique shift from verse to chorus, and instrumentation adjusts per section.

| Tag | Purpose | When to use it |
| --- | --- | --- |
| [Intro] | Song opening | Setting mood before vocals kick in |
| [Verse] | Story / narrative sections | Main lyrical content |
| [Pre Chorus] | Build-up before chorus | Escalating tension |
| [Chorus] | Hook / memorable section | The repeating main idea |
| [Post Chorus] | After-hook section | Extended hook variation |
| [Hook] | Catchy standalone phrase | A memorable moment |
| [Drop] | Energy release (EDM) | After a build-up, the beat drops |
| [Bridge] | Contrast section | Breaking repetition, new perspective |
| [Solo] | Instrumental spotlight | Showcasing a specific instrument |
| [Inst] | Instrumental section | Music without vocals |
| [Build Up] | Intensity increase | Leading to a drop or climax |
| [Interlude] | Instrumental break | Breathing space between sections |
| [Break] | Rhythmic pause | Dynamic contrast |
| [Transition] | Section connector | Smooth flow between parts |
| [Outro] | Song ending | Graceful exit, fade-out |
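A misspelled tag such as [Pre-Chorus] is easy to miss in long lyrics. The sketch below scans lyrics for bracketed tags on their own line and reports any that are not in the table above. The function name unknown_tags and the exact-match comparison are assumptions; this page does not say how the model treats unrecognized or differently cased tags.

```python
import re

# Tags listed in the structure tag table above.
KNOWN_TAGS = {
    "Intro", "Verse", "Pre Chorus", "Chorus", "Post Chorus", "Hook",
    "Drop", "Bridge", "Solo", "Inst", "Build Up", "Interlude",
    "Break", "Transition", "Outro",
}

# A tag is a bracketed label that occupies a whole line.
TAG_PATTERN = re.compile(r"^\[([^\]]+)\][ \t]*$", re.MULTILINE)


def unknown_tags(lyrics):
    """Return bracketed tags in the lyrics that are not in KNOWN_TAGS."""
    return [tag for tag in TAG_PATTERN.findall(lyrics) if tag not in KNOWN_TAGS]


draft = "[Verse]\nWalking through the rain...\n\n[Pre-Chorus]\nBut I still remember you"
suspect = unknown_tags(draft)
```

For the draft above, suspect contains the hyphenated Pre-Chorus tag, which differs from the documented [Pre Chorus] spelling.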

Good to know

  • Max song length: Up to 6 minutes per generation. Most songs land between 2 and 4 minutes.
  • BPM/key accuracy: Specify exact BPM and key in the prompt and the output matches 99%+ of the time.
  • Language support: English and Mandarin Chinese have the strongest support. Other languages work but with less consistent pronunciation.
  • Each generation is unique by default. Without a fixed seed, the same prompt and lyrics produce different arrangements each time.
  • Reproducibility: Pass a seed parameter (0–1,000,000) to get deterministic output for the same inputs.
  • For instrumental tracks, set is_instrumental to true — or use [Inst] tags and parenthetical instrument directions in the lyrics.
  • Higher quality settings (44100 sample rate, 256000 bitrate) give the best audio quality. Use wav format for production work.

Also check out minimax/music-cover to reimagine existing songs in a different style while preserving the original melody.

Privacy policy

Data from this model is sent from Replicate to MiniMax.

Check their privacy policy for details:

https://www.minimax.io/platform/protocol/privacy-policy

Terms of service

https://www.minimax.io/platform/protocol/terms-of-service
