charlesmccarthy / musicgen

MusicGen running on an A40 with a 60-second maximum duration (Updated 11 months, 3 weeks ago)

  • Public
  • 1.2K runs
  • L40S
Input

string

Model to use for generation

Default: "stereo-melody-large"

string

A description of the music you want to generate.

file

An audio file that will influence the generated music. If `continuation` is `True`, the generated music will be a continuation of the audio file. Otherwise, the generated music will mimic the audio file's melody.

integer

Duration of the generated audio in seconds.

Default: 8

boolean

If `True`, generated music will continue from `input_audio`. Otherwise, generated music will mimic `input_audio`'s melody.

Default: false

integer
(minimum: 0)

Start time of the audio file to use for continuation.

Default: 0

integer
(minimum: 0)

End time of the audio file to use for continuation. If -1 or None, will default to the end of the audio clip.
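The start and end times above can be read as a window over the input audio. The following is a minimal sketch of that windowing, assuming the audio is already decoded into a flat sequence of samples; `select_continuation_window` and its argument names are hypothetical and are not taken from the model's code.

```python
from typing import Optional, Sequence


def select_continuation_window(
    samples: Sequence[float],
    sample_rate: int,
    start: float = 0,
    end: Optional[float] = None,
) -> Sequence[float]:
    """Return the samples between `start` and `end` seconds (hypothetical helper)."""
    total_seconds = len(samples) / sample_rate
    # An end time of -1 or None means "use the end of the clip".
    if end is None or end == -1:
        end = total_seconds
    if start < 0:
        raise ValueError("start must be at least 0")
    if end <= start:
        raise ValueError("end must be later than start")
    # Clamp to the clip length so an overly long end time is harmless.
    end = min(end, total_seconds)
    return samples[int(start * sample_rate):int(end * sample_rate)]
```

For a 5-second clip, a start of 1 and an end of -1 would keep the last 4 seconds.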

boolean

If `True`, the EnCodec tokens will be decoded with MultiBand Diffusion. Only works with non-stereo models.

Default: false

string

Strategy for normalizing audio.

Default: "loudness"

integer

Reduces sampling to the k most likely tokens.

Default: 250

number

Reduces sampling to tokens with cumulative probability of p. When set to `0` (default), top_k sampling is used.

Default: 0
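To illustrate how the two sampling cutoffs interact, here is a small sketch of the rule described above: a nonzero p selects nucleus (top-p) filtering, and a p of 0 falls back to top-k. It works on a plain list of probabilities; `filter_candidates` is a hypothetical name, and the model's actual sampler may differ in detail.

```python
from typing import List, Sequence


def filter_candidates(
    probs: Sequence[float], top_k: int = 250, top_p: float = 0.0
) -> List[int]:
    """Return the token indices that remain eligible for sampling."""
    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_p > 0:
        # Nucleus filtering: keep the smallest prefix whose mass reaches top_p.
        kept, cumulative = [], 0.0
        for index in ranked:
            kept.append(index)
            cumulative += probs[index]
            if cumulative >= top_p:
                break
        return kept
    if top_k > 0:
        # Top-k filtering: keep only the k most likely tokens.
        return ranked[:top_k]
    # Neither cutoff is active, so every token stays eligible.
    return ranked
```

With probabilities `[0.5, 0.3, 0.15, 0.05]`, a k of 2 keeps the first two tokens, while a p of 0.9 keeps the first three regardless of k.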

number

Controls the 'conservativeness' of the sampling process. Higher temperature means more diversity.

Default: 1
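As a rough illustration of why higher temperature means more diversity, the sketch below divides the logits by the temperature before a softmax. This is the standard formulation rather than code from this model, and `softmax_with_temperature` is a hypothetical name.

```python
import math
from typing import List, Sequence


def softmax_with_temperature(
    logits: Sequence[float], temperature: float = 1.0
) -> List[float]:
    """Convert logits to probabilities, flattened or sharpened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    # Subtract the maximum for numerical stability; the result is unchanged.
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]
```

A temperature below 1 concentrates probability on the most likely token, and a temperature above 1 spreads it across more tokens.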

integer

Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to inputs.

Default: 3
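This setting reads like a classifier-free guidance scale. A minimal sketch of the usual formulation follows, assuming the model produces one set of logits with the inputs and one without; `apply_guidance` and its argument names are hypothetical.

```python
from typing import List, Sequence


def apply_guidance(
    cond_logits: Sequence[float],
    uncond_logits: Sequence[float],
    guidance_scale: float = 3.0,
) -> List[float]:
    """Push logits away from the unconditioned prediction, toward the conditioned one."""
    # A scale of 1 returns the conditioned logits unchanged;
    # larger scales exaggerate the difference the inputs make.
    return [
        uncond + guidance_scale * (cond - uncond)
        for cond, uncond in zip(cond_logits, uncond_logits)
    ]
```

Larger scales amplify whatever the prompt and audio changed in the prediction, which is why outputs follow the inputs more closely and vary less.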

string

Output format for generated audio.

Default: "wav"

integer

Seed for random number generator. If None or -1, a random seed will be used.
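The inputs above can be collected into a single dictionary before a prediction is requested. The sketch below does that with the defaults listed on this page. Only `continuation` and `input_audio` are named in the descriptions; every other key name, and the check that stereo models start with "stereo", is an assumption made for illustration.

```python
from typing import Any, Dict

MAX_DURATION_SECONDS = 60  # maximum stated in the model description

# Defaults as listed on this page. Key names other than `continuation`
# are assumed, not confirmed by the page.
DEFAULTS: Dict[str, Any] = {
    "model_version": "stereo-melody-large",
    "duration": 8,
    "continuation": False,
    "continuation_start": 0,
    "multi_band_diffusion": False,
    "normalization_strategy": "loudness",
    "top_k": 250,
    "top_p": 0,
    "temperature": 1,
    "classifier_free_guidance": 3,
    "output_format": "wav",
}


def build_inputs(prompt: str, **overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the defaults and run basic sanity checks."""
    inputs = {**DEFAULTS, "prompt": prompt, **overrides}
    if not 0 < inputs["duration"] <= MAX_DURATION_SECONDS:
        raise ValueError("duration must be between 1 and 60 seconds")
    if inputs["continuation_start"] < 0:
        raise ValueError("continuation_start must be at least 0")
    # MultiBand Diffusion only works with non-stereo models.
    if inputs["multi_band_diffusion"] and inputs["model_version"].startswith("stereo"):
        raise ValueError("multi_band_diffusion requires a non-stereo model")
    return inputs
```

For example, `build_inputs("lo-fi piano", duration=45)` keeps every default except the duration.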

Output

Run time and cost

This model costs approximately $0.095 to run on Replicate, or 10 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
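The "10 runs per $1" figure follows from the approximate per-run cost by rounding down. A short sketch of that arithmetic is below; the helper names are hypothetical, and real costs vary with the inputs.

```python
import math

COST_PER_RUN_USD = 0.095  # approximate figure quoted on this page


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole runs that fit into one dollar at the given per-run cost."""
    return math.floor(1 / cost_per_run)


def estimated_cost(runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Rough total cost in dollars for a number of runs."""
    return runs * cost_per_run
```

At this rate, 100 runs would cost roughly $9.50.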

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 98 seconds. The predict time for this model varies significantly based on the inputs.

Readme

I made this to make inference cheaper (A40 vs. H100) and to increase the maximum duration from 30 seconds to 60 seconds. No other changes have been made relative to the official cog.