You're looking at a specific version of this model.
resemble-ai/chatterbox-turbo:8ff2cbe2
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
| Field | Type | Default value | Description |
|---|---|---|---|
| text | string | | Text to synthesize into speech (maximum 500 characters). Supported paralinguistic tags you can include in your text: [clear throat], [sigh], [sush], [cough], [groan], [sniff], [gasp], [chuckle], [laugh]. Example: "Oh, that's hilarious! [chuckle] Let me tell you more." |
| voice | None | Andy | Pre-made voice to use for synthesis. Ignored if reference_audio is provided. |
| reference_audio | string | | Reference audio file for voice cloning (optional). Must be longer than 5 seconds. If provided, overrides the voice selection. |
| temperature | number | 0.8 (min: 0.05, max: 2) | Controls randomness in generation. Higher values produce more varied speech. |
| top_p | number | 0.95 (min: 0.5, max: 1) | Nucleus sampling threshold. Lower values make output more focused. |
| top_k | integer | 1000 (min: 1, max: 2000) | Top-k sampling. Limits vocabulary to top k tokens at each step. |
| repetition_penalty | number | 1.2 (min: 1, max: 2) | Penalizes token repetition. Higher values reduce repetition. |
| seed | integer | | Random seed for reproducible results. Leave blank for random generation. |
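The limits in the table above can be checked locally before a request is sent. The following is a minimal sketch, not part of the model's API: `build_input` and `unsupported_tags` are hypothetical helper names, the 500-character limit is assumed to be counted with Python's `len`, and the tag spellings are copied from the table exactly as listed.

```python
import re

# Ranges copied from the input schema table; the helpers are hypothetical.
LIMITS = {
    "temperature": (0.05, 2.0),
    "top_p": (0.5, 1.0),
    "top_k": (1, 2000),
    "repetition_penalty": (1.0, 2.0),
}
MAX_TEXT_CHARS = 500  # assumption: counted with len()

# Tag spellings exactly as they appear in the table.
SUPPORTED_TAGS = {
    "[clear throat]", "[sigh]", "[sush]", "[cough]", "[groan]",
    "[sniff]", "[gasp]", "[chuckle]", "[laugh]",
}


def unsupported_tags(text):
    """Return bracketed tags in `text` that are not in the supported list."""
    found = re.findall(r"\[[^\[\]]+\]", text)
    return [tag for tag in found if tag not in SUPPORTED_TAGS]


def build_input(text, voice=None, reference_audio=None, seed=None, **sampling):
    """Build an input dict, raising ValueError for out-of-range values."""
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    payload = {"text": text}
    for name, value in sampling.items():
        if name not in LIMITS:
            raise ValueError(f"unknown sampling field: {name}")
        low, high = LIMITS[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        if name == "top_k" and not isinstance(value, int):
            raise ValueError("top_k must be an integer")
        payload[name] = value
    if reference_audio is not None:
        # Per the table, reference_audio overrides the voice selection,
        # so voice is left out when both are given.
        payload["reference_audio"] = reference_audio
    elif voice is not None:
        payload["voice"] = voice
    if seed is not None:
        payload["seed"] = int(seed)
    return payload
```

Fields that are omitted are left out of the dict, so the defaults from the table apply on the server side. The resulting dict is the kind of value an API client would send as the model input.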
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'format': 'uri', 'title': 'Output', 'type': 'string'}
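
Since the schema describes the output as a single URI string, client code may want to check that shape before using it. The sketch below assumes nothing about the audio format, which the schema does not state; `output_filename` and its `fallback` name are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_filename(output, fallback="speech_output"):
    """Check that `output` looks like a URI and derive a local file name."""
    if not isinstance(output, str):
        raise TypeError("expected the output to be a string")
    parsed = urlparse(output)
    if not parsed.scheme:
        raise ValueError("expected the output to be a URI with a scheme")
    # Use the last path segment as the file name, if there is one.
    name = PurePosixPath(parsed.path).name
    return name or fallback
```

The returned name can then be used as the target path when downloading the file at that URI.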