sakemin/musicgen-fine-tuner:055c82cf

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
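The default-value rule can be illustrated with a small local sketch. The helper `build_input` and the constants around it are hypothetical names introduced here; the defaults and enum options are copied from the fields listed below. The service applies defaults on its own, so a helper like this is only useful for checking a payload before sending it.

```python
# Defaults copied from the input schema on this page. Fields without a
# documented default (prompt, input_audio, continuation_end, seed) are omitted.
DEFAULTS = {
    "duration": 8,
    "continuation": False,
    "continuation_start": 0,
    "multi_band_diffusion": False,
    "normalization_strategy": "loudness",
    "top_k": 250,
    "top_p": 0,
    "temperature": 1,
    "classifier_free_guidance": 3,
    "output_format": "wav",
}

# Allowed values for the two enum fields.
ENUMS = {
    "normalization_strategy": {"loudness", "clip", "peak", "rms"},
    "output_format": {"wav", "mp3"},
}

KNOWN_FIELDS = set(DEFAULTS) | {"prompt", "input_audio", "continuation_end", "seed"}


def build_input(**fields):
    """Hypothetical helper: merge caller-supplied fields over the documented defaults."""
    unknown = set(fields) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    merged = {**DEFAULTS, **fields}  # caller values win over defaults
    for name, allowed in ENUMS.items():
        if merged[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    return merged


payload = build_input(prompt="lo-fi piano with soft drums", duration=12)
print(payload["duration"], payload["output_format"])  # prints: 12 wav
```

The resulting dictionary has the shape of an input object for this model and could be handed to whatever API client you use.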

| Field | Type | Default value | Description |
|---|---|---|---|
| prompt | string | | A description of the music you want to generate. |
| input_audio | string | | An audio file that will influence the generated music. If `continuation` is `True`, the generated music will be a continuation of the audio file. Otherwise, the generated music will mimic the audio file's melody. |
| duration | integer | 8 | Duration of the generated audio in seconds. |
| continuation | boolean | False | If `True`, the generated music will continue `input_audio`. Otherwise, the generated music will mimic the melody of `input_audio`. |
| continuation_start | integer | 0 | Start time of the audio file to use for continuation. |
| continuation_end | integer | | End time of the audio file to use for continuation. If -1 or None, defaults to the end of the audio clip. |
| multi_band_diffusion | boolean | False | If `True`, the EnCodec tokens will be decoded with MultiBand Diffusion. |
| normalization_strategy | string (enum) | loudness | Strategy for normalizing audio. Options: loudness, clip, peak, rms. |
| top_k | integer | 250 | Reduces sampling to the k most likely tokens. |
| top_p | number | 0 | Reduces sampling to tokens with cumulative probability of p. When set to `0` (default), top_k sampling is used. |
| temperature | number | 1 | Controls the 'conservativeness' of the sampling process. Higher temperature means more diversity. |
| classifier_free_guidance | integer | 3 | Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to inputs. |
| output_format | string (enum) | wav | Output format for generated audio. Options: wav, mp3. |
| seed | integer | | Seed for random number generator. If None or -1, a random seed will be used. |
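To make the interaction between `top_k`, `top_p`, and `temperature` concrete, here is a generic sketch of how such sampling controls are commonly applied to a token distribution. It is not taken from this model's code: the function `filter_distribution` and the toy logits are assumptions for illustration, and it only mirrors the rule stated above that `top_p` of `0` falls back to top_k sampling.

```python
import math


def filter_distribution(logits, top_k=250, top_p=0.0, temperature=1.0):
    """Return {token_index: probability} after temperature scaling and truncation.

    Illustrative only: top_p == 0 selects top_k truncation, any positive
    top_p selects nucleus (cumulative-probability) truncation instead.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature scaling: values above 1 flatten the distribution,
    # values below 1 sharpen it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token indices ordered from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    if top_p > 0:
        # Keep the smallest prefix whose cumulative probability reaches top_p.
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
    else:
        # Keep only the k most likely tokens.
        kept = ranked[:top_k]

    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0]
print(sorted(filter_distribution(toy_logits, top_k=2)))    # prints: [0, 1]
print(sorted(filter_distribution(toy_logits, top_p=0.5)))  # prints: [0]
```

With the toy logits, raising `temperature` from 1 to 2 lowers the probability of the most likely token, which is the "more diversity" effect described for that field.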

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{"format": "uri", "title": "Output", "type": "string"}
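Since the output is a single URI string, a client typically needs to turn it into a local file. The sketch below shows one way to derive a file name from such a URI. The helper `local_filename`, the fallback name, and the example URI are all hypothetical, and the check for an `http` or `https` scheme is an assumption: the schema only says the string is a URI.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_filename(output_uri, fallback="output.wav"):
    """Hypothetical helper: derive a local file name from the returned URI."""
    parsed = urlparse(output_uri)
    # Assumption: the service returns an http(s) link.
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URI, got {output_uri!r}")
    # Last path segment, or the fallback when the path has none.
    name = PurePosixPath(parsed.path).name
    return name or fallback


# Placeholder URI for illustration only; a real response will differ.
example_uri = "https://example.com/files/abc123/out.wav"
print(local_filename(example_uri))  # prints: out.wav
```

The file itself could then be fetched with the standard library, for example `urllib.request.urlretrieve(output_uri, local_filename(output_uri))`.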