resemble-ai/chatterbox-turbo:8ff2cbe2

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

| Field | Type | Default value | Description |
|---|---|---|---|
| text | string | | Text to synthesize into speech (maximum 500 characters). Supported paralinguistic tags you can include in your text: [clear throat], [sigh], [sush], [cough], [groan], [sniff], [gasp], [chuckle], [laugh]. Example: "Oh, that's hilarious! [chuckle] Let me tell you more." |
| voice | None | Andy | Pre-made voice to use for synthesis. Ignored if reference_audio is provided. |
| reference_audio | string | | Reference audio file for voice cloning (optional). Must be longer than 5 seconds. If provided, overrides the voice selection. |
| temperature | number | 0.8 (min: 0.05, max: 2) | Controls randomness in generation. Higher values produce more varied speech. |
| top_p | number | 0.95 (min: 0.5, max: 1) | Nucleus sampling threshold. Lower values make output more focused. |
| top_k | integer | 1000 (min: 1, max: 2000) | Top-k sampling. Limits vocabulary to top k tokens at each step. |
| repetition_penalty | number | 1.2 (min: 1, max: 2) | Penalizes token repetition. Higher values reduce repetition. |
| seed | integer | | Random seed for reproducible results. Leave blank for random generation. |
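The limits in the input schema can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model's API: `build_input` and `unknown_tags` are hypothetical helper names, and the ranges and tag list are copied from the fields above. Fields that are not passed are left out so that the documented defaults apply.

```python
import re

# (min, max) ranges copied from the input schema.
NUMERIC_RANGES = {
    "temperature": (0.05, 2),
    "top_p": (0.5, 1),
    "top_k": (1, 2000),
    "repetition_penalty": (1, 2),
}
MAX_TEXT_CHARS = 500
# Tag spellings are taken verbatim from the description of the text field.
KNOWN_TAGS = {
    "[clear throat]", "[sigh]", "[sush]", "[cough]", "[groan]",
    "[sniff]", "[gasp]", "[chuckle]", "[laugh]",
}


def unknown_tags(text):
    """Return bracketed tags in text that are not in the documented list."""
    return [t for t in re.findall(r"\[[^\[\]]+\]", text) if t not in KNOWN_TAGS]


def build_input(text, voice=None, reference_audio=None, seed=None, **sampling):
    """Build an input dict, raising ValueError for out-of-range values."""
    if not text or len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text must be 1 to {MAX_TEXT_CHARS} characters")
    payload = {"text": text}
    for name, value in sampling.items():
        if name not in NUMERIC_RANGES:
            raise ValueError(f"unknown field: {name}")
        low, high = NUMERIC_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        payload[name] = value
    if reference_audio is not None:
        # reference_audio overrides voice, so voice is not sent alongside it.
        payload["reference_audio"] = reference_audio
    elif voice is not None:
        payload["voice"] = voice
    if seed is not None:
        payload["seed"] = seed
    return payload
```

For example, `build_input("Oh, that's hilarious! [chuckle] Let me tell you more.", voice="Andy", temperature=0.8)` returns a dict with the `text`, `temperature`, and `voice` keys, which is the shape an API client would send as the model input. The sketch does not check the 5-second minimum for `reference_audio`, since that requires inspecting the audio file itself.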

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{"format": "uri", "title": "Output", "type": "string"}
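Since the schema describes a single string in URI format, a caller can sanity-check the response before trying to fetch the audio. This is a loose sketch under that assumption: `looks_like_output` is a hypothetical helper, and it only checks that the value is a string with a scheme and some location, not full URI validity.

```python
from urllib.parse import urlparse

# The output schema as documented: one string in URI format.
OUTPUT_SCHEMA = {"format": "uri", "title": "Output", "type": "string"}


def looks_like_output(value):
    """Loosely check that value is a string that parses as an absolute URI."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    # An absolute URI needs a scheme plus a host or a path.
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)
```

A value that passes this check can then be downloaded with any HTTP client. The schema does not state the audio file format, so that is left for the caller to determine from the response.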