charlesmccarthy / musicgen

MusicGen running on an A40 with a 60-second maximum duration

  • Public
  • 834 runs
  • L40S

Input

string

Model to use for generation

Default: "stereo-melody-large"

string

A description of the music you want to generate.

file

An audio file that will influence the generated music. If `continuation` is `True`, the generated music will be a continuation of the audio file. Otherwise, the generated music will mimic the audio file's melody.

integer

Duration of the generated audio in seconds.

Default: 8

boolean

If `True`, generated music will continue from `input_audio`. Otherwise, generated music will mimic `input_audio`'s melody.

Default: false

integer
(minimum: 0)

Start time of the audio file to use for continuation.

Default: 0

integer
(minimum: 0)

End time of the audio file to use for continuation. If -1 or None, will default to the end of the audio clip.
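
The start and end times above select a window of the input clip to continue from. A minimal sketch of how such a window could be resolved is shown here; the function name `resolve_window`, its arguments, and the validation rules are hypothetical and are not taken from the model's code.

```python
def resolve_window(clip_seconds, start=0, end=None):
    """Return the (start, end) window in seconds used for continuation.

    Assumption: an end of None or -1 means "end of the clip", as the
    parameter description states. The range checks are illustrative only.
    """
    if end is None or end == -1:
        end = clip_seconds
    if start < 0 or end > clip_seconds:
        raise ValueError("window must lie inside the clip")
    if start >= end:
        raise ValueError("start must be earlier than end")
    return start, end


# A 20-second clip with the defaults uses the whole clip.
window = resolve_window(20)
```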

boolean

If `True`, the EnCodec tokens will be decoded with MultiBand Diffusion. Only works with non-stereo models.

Default: false

string

Strategy for normalizing audio.

Default: "loudness"

integer

Reduces sampling to the k most likely tokens.

Default: 250

number

Reduces sampling to tokens with cumulative probability of p. When set to `0` (default), top_k sampling is used.

Default: 0
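
To illustrate how the two truncation settings interact, here is a minimal pure-Python sketch of top-k and top-p filtering over a probability list. It assumes the behavior described above (top-p is used when it is greater than 0, otherwise top-k); the function name `filter_probs` is hypothetical, and this is not the model's actual implementation.

```python
def filter_probs(probs, top_k=250, top_p=0.0):
    """Return renormalized probabilities after top-p or top-k truncation."""
    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_p > 0.0:
        # Keep the smallest prefix whose cumulative probability reaches top_p.
        keep, cumulative = [], 0.0
        for i in order:
            keep.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
    elif top_k > 0:
        # Keep only the k most likely tokens.
        keep = order[:top_k]
    else:
        keep = order
    kept = set(keep)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


probs = [0.5, 0.3, 0.15, 0.05]
by_top_k = filter_probs(probs, top_k=2)             # keeps the two likeliest tokens
by_top_p = filter_probs(probs, top_k=2, top_p=0.9)  # top_p > 0 takes precedence
```

Sampling then draws from the filtered distribution, so tokens outside the kept set can never be chosen.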

number

Controls the 'conservativeness' of the sampling process. Higher temperature means more diversity.

Default: 1
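
As a rough illustration of why higher temperature means more diversity, the sketch below divides the logits by the temperature before applying softmax. The function name `softmax_with_temperature` is hypothetical; this is the standard formulation, assumed here rather than taken from the model's code.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities, flattening them as temperature rises."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, temperature=0.5)  # sharper distribution
warm = softmax_with_temperature(logits, temperature=2.0)  # flatter distribution
```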

integer

Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to inputs.

Default: 3
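
This setting is commonly implemented as classifier-free guidance, which extrapolates from unconditioned logits toward conditioned ones. The sketch below shows that standard formula under the assumption that it applies here; the function name `apply_guidance` and the example values are hypothetical.

```python
def apply_guidance(cond_logits, uncond_logits, scale=3.0):
    """Blend logits: scale 0 ignores the inputs, 1 uses them as-is, and
    larger values push further in the direction the inputs suggest."""
    return [u + scale * (c - u) for c, u in zip(cond_logits, uncond_logits)]


cond = [2.0, 0.0]    # logits computed with the prompt (illustrative values)
uncond = [1.0, 0.0]  # logits computed without the prompt
guided = apply_guidance(cond, uncond, scale=3.0)
```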

string

Output format for generated audio.

Default: "wav"

integer

Seed for random number generator. If None or -1, a random seed will be used.

Output


Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

I made this to make inference cheaper (A40 instead of H100) and to increase the maximum duration from 30 seconds to 60 seconds. No other changes have been made from the official cog.