jimothyjohn/stable-audio-3-medium:8a54f4dd
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.
| Field | Type | Default value | Description |
|---|---|---|---|
| prompt | string | | Text description of the audio to generate. |
| negative_prompt | string | | Qualities to avoid in the output. Only affects -base models. |
| duration | number | 30 (Min: 1, Max: 380) | Length of the generated audio in seconds. |
| steps | integer | 8 (Min: 4, Max: 50) | Number of diffusion sampling steps. 8 is the post-trained default; lower is faster, higher rarely helps. |
| cfg_scale | number | 1 (Min: 0.5, Max: 15) | Classifier-free guidance scale. Only affects -base models. |
| seed | integer | -1 | Random seed. -1 selects a new random seed for each run. |
| init_audio | string | | Optional source audio for audio-to-audio editing. |
| init_noise_level | number | 0.9 (Max: 1) | How much the init audio influences the output. 1.0 = pure generation, lower keeps more of the original. |
| inpaint_audio | string | | Optional source audio for inpainting or continuation. Set inpaint_start_seconds and inpaint_end_seconds to mark the region to regenerate; set start to the file length to extend it. |
| inpaint_start_seconds | number | 0 | Start of the inpaint region in seconds (used with inpaint_audio). |
| inpaint_end_seconds | number | 0 | End of the inpaint region in seconds (used with inpaint_audio). |
| output_format | None | wav | Output file format. |
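
As a rough illustration of how the fields above fit together, the sketch below assembles an input dictionary and checks the numeric limits listed in the table before anything is sent. The names `LIMITS` and `build_input` are hypothetical helpers, not part of any API; fields that are not overridden are simply left out so that the documented defaults apply.

```python
from typing import Any, Dict

# Numeric limits copied from the input schema table.
# None means the table states no bound on that side.
LIMITS = {
    "duration": (1, 380),
    "steps": (4, 50),
    "cfg_scale": (0.5, 15),
    "init_noise_level": (None, 1),
}


def build_input(prompt: str, **overrides: Any) -> Dict[str, Any]:
    """Return an input dict with the prompt plus any overridden fields.

    Hypothetical helper: omitted fields are not added, so the model's
    documented defaults are used for them.
    """
    payload: Dict[str, Any] = {"prompt": prompt}
    for name, value in overrides.items():
        low, high = LIMITS.get(name, (None, None))
        if low is not None and value < low:
            raise ValueError(f"{name}={value} is below the minimum {low}")
        if high is not None and value > high:
            raise ValueError(f"{name}={value} is above the maximum {high}")
        payload[name] = value
    return payload


payload = build_input("rain on a tin roof", duration=10, seed=42)
```

Assuming the official `replicate` Python client is installed and an API token is configured, a dictionary like this could be passed as `replicate.run("jimothyjohn/stable-audio-3-medium:8a54f4dd", input=payload)`. The version identifier is written here exactly as it appears on this page and may be abbreviated.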
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{"format": "uri", "title": "Output", "type": "string"}
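
Because the output is a single URI string, client code usually needs to turn it into a local file name before saving the audio. The sketch below shows one way to do that. `output_filename` is a hypothetical helper, and the restriction to `http`/`https` URIs is an assumption: the schema only says the value has the `uri` format.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_filename(uri: str, fallback: str = "output.wav") -> str:
    """Derive a local file name from the output URI.

    Assumes an http(s) URI; falls back to a default name when the
    URI path has no final component.
    """
    parsed = urlparse(uri)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported URI scheme: {parsed.scheme!r}")
    name = PurePosixPath(parsed.path).name
    return name or fallback
```

The returned name could then be handed to a standard downloader such as `urllib.request.urlretrieve(uri, name)` to save the file.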