jimothyjohn/stable-audio-3-medium

Run jimothyjohn/stable-audio-3-medium with an API

Use one of our client libraries to get started quickly.
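
A minimal sketch of calling the model from Python, assuming the official `replicate` client package is installed and the `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input` and `generate` are illustrative and not part of any library. Depending on how the model is published, the model reference may also need a `:version` suffix.

```python
def build_input(prompt, **overrides):
    """Assemble the input payload, dropping fields left as None so the
    model's documented defaults apply to them."""
    payload = {"prompt": prompt, **overrides}
    return {key: value for key, value in payload.items() if value is not None}


def generate(prompt, **overrides):
    """Run the model and return whatever the client hands back.

    Assumes the `replicate` package is installed and REPLICATE_API_TOKEN is set.
    """
    import replicate  # imported lazily so build_input works without the package

    return replicate.run(
        "jimothyjohn/stable-audio-3-medium",
        input=build_input(prompt, **overrides),
    )
```

With a token configured, a call such as `generate("warm analog synth arpeggio", duration=10, seed=42)` should return a reference to the generated audio. The exact return type depends on the client version.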

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.
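
To illustrate how omitted fields fall back to their defaults, the sketch below overlays user-supplied values on the default values from the field listing that follows and checks the documented minimum and maximum bounds before anything is sent. It is a client-side approximation only. The names `DEFAULTS`, `RANGES`, and `resolve_input` are hypothetical, and the server may validate differently.

```python
# Defaults copied from the input schema; fields without a default are omitted.
DEFAULTS = {
    "duration": 30,
    "steps": 8,
    "cfg_scale": 1,
    "seed": -1,
    "init_noise_level": 0.9,
    "inpaint_start_seconds": 0,
    "inpaint_end_seconds": 0,
    "output_format": "wav",
}

# (min, max) per field; None means the schema lists no bound on that side.
RANGES = {
    "duration": (1, 380),
    "steps": (4, 50),
    "cfg_scale": (0.5, 15),
    "init_noise_level": (None, 1),
}


def resolve_input(user_input):
    """Return the effective input: defaults overlaid with user values,
    after checking the documented bounds."""
    effective = {**DEFAULTS, **user_input}
    for field, (low, high) in RANGES.items():
        value = effective[field]
        if low is not None and value < low:
            raise ValueError(f"{field}={value} is below the minimum {low}")
        if high is not None and value > high:
            raise ValueError(f"{field}={value} is above the maximum {high}")
    return effective
```

For example, `resolve_input({"prompt": "rain on a tin roof", "steps": 12})` keeps `steps` at 12 and fills in `duration` as 30, while a `duration` of 500 raises a `ValueError`.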

| Field | Type | Default value | Range | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text description of the audio to generate. |
| negative_prompt | string | | | Qualities to avoid in the output. Only affects -base models. |
| duration | number | 30 | Min: 1, Max: 380 | Length of the generated audio in seconds. |
| steps | integer | 8 | Min: 4, Max: 50 | Number of diffusion sampling steps. 8 is the post-trained default; lower is faster, higher rarely helps. |
| cfg_scale | number | 1 | Min: 0.5, Max: 15 | Classifier-free guidance scale. Only affects -base models. |
| seed | integer | -1 | | Random seed. -1 selects a new random seed for each run. |
| init_audio | string | | | Optional source audio for audio-to-audio editing. |
| init_noise_level | number | 0.9 | Max: 1 | How much noise is applied to the init audio. 1.0 = pure generation; lower values keep more of the original. |
| inpaint_audio | string | | | Optional source audio for inpainting or continuation. Set inpaint_start_seconds and inpaint_end_seconds to mark the region to regenerate; set inpaint_start_seconds to the file length to extend the audio. |
| inpaint_start_seconds | number | 0 | | Start of the inpaint region in seconds (used with inpaint_audio). |
| inpaint_end_seconds | number | 0 | | End of the inpaint region in seconds (used with inpaint_audio). |
| output_format | None | wav | | Output file format. |
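
The inpainting fields can be combined in two ways: regenerating a region inside the source, or extending it by starting the region at the file length. The sketch below builds both payloads and uses the standard-library `wave` module to measure a local WAV file. The helper names are hypothetical, the source audio is assumed to be passed as a URI string, and the assumption that `inpaint_end_seconds` marks where the extended audio should stop is not confirmed by the schema.

```python
import wave


def wav_length_seconds(path):
    """Length of a local WAV file in seconds (frame count / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def inpaint_region(prompt, audio_uri, start, end):
    """Inputs that regenerate only the span from `start` to `end` seconds."""
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    return {
        "prompt": prompt,
        "inpaint_audio": audio_uri,
        "inpaint_start_seconds": start,
        "inpaint_end_seconds": end,
    }


def extend_clip(prompt, audio_uri, source_length, extra_seconds):
    """Inputs that continue the source: the region starts at the file length.

    Assumption: inpaint_end_seconds marks where the extended audio should stop.
    """
    return inpaint_region(
        prompt, audio_uri, source_length, source_length + extra_seconds
    )
```

A continuation could then be requested with `extend_clip("add a drum fill", uri, wav_length_seconds("clip.wav"), 8)`, where `uri` points at an uploaded copy of the same clip.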

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{
  "type": "string",
  "title": "Output",
  "format": "uri"
}
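
Because the output is a single URI string, saving the result amounts to downloading that URI. The sketch below uses only the Python standard library. The function names and the `output.wav` fallback are illustrative, and the output is assumed to arrive as a plain string, as the schema states.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_from_uri(uri, fallback="output.wav"):
    """Take the last path segment of the URI as the local file name."""
    name = os.path.basename(urlparse(uri).path)
    return name or fallback


def save_output(uri, directory="."):
    """Download the audio at `uri` into `directory` and return the local path."""
    target = os.path.join(directory, filename_from_uri(uri))
    with urllib.request.urlopen(uri) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Output URLs on hosted services are often temporary, so it is usually safer to download the file soon after the run completes.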