Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.
Field | Type | Default value | Description |
---|---|---|---|
text | string | | Text to generate speech from |
audio | string | | Path to audio file for voice cloning (optional) |
language | string (enum) | en-us (Options: en-us, en-gb, ja, cmn, yue, fr-fr, de) | Language code for speech generation |
model_type | string (enum) | transformer (Options: transformer, hybrid) | Model type to use |
emotion | string | | Optionally pass a comma-separated list of 8 floats for your desired emotion vector in the order [Happiness, Sadness, Disgust, Fear, Surprise, Anger, Other, Neutral]. For example: '0.5,0.2,0.0,0.0,0.3,0.1,0.0,0.0'. If empty or invalid, defaults to the built-in neutralish emotion. |
speaking_rate | number | 15 (Min: 5, Max: 30) | Speaking rate in phonemes per second. Default is 15.0. 10-12 is slow and clear, 15-17 is natural conversational, 20+ is fast. Values above 30 may produce artifacts. |
seed | integer | | Seed for reproducibility (optional) |
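
The input fields above can be assembled and checked on the client side before a request is sent. The following is a minimal sketch, not part of the model's API: the helper names `format_emotion` and `build_input` are hypothetical, and the validation rules are simply the enum options and the min/max bounds listed in the table. How the resulting dictionary is submitted depends on the API client you use and is not shown here.

```python
from typing import Optional, Sequence

# Allowed values copied from the input schema table.
LANGUAGES = {"en-us", "en-gb", "ja", "cmn", "yue", "fr-fr", "de"}
MODEL_TYPES = {"transformer", "hybrid"}
EMOTION_ORDER = ["Happiness", "Sadness", "Disgust", "Fear",
                 "Surprise", "Anger", "Other", "Neutral"]


def format_emotion(values: Sequence[float]) -> str:
    """Join 8 floats into the comma-separated string the emotion field expects."""
    if len(values) != len(EMOTION_ORDER):
        raise ValueError(f"emotion needs {len(EMOTION_ORDER)} values, got {len(values)}")
    return ",".join(str(float(v)) for v in values)


def build_input(text: str,
                audio: Optional[str] = None,
                language: str = "en-us",
                model_type: str = "transformer",
                emotion: Optional[Sequence[float]] = None,
                speaking_rate: float = 15.0,
                seed: Optional[int] = None) -> dict:
    """Hypothetical helper: build the input dict, omitting unset optional fields."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unsupported model_type: {model_type}")
    if not 5 <= speaking_rate <= 30:
        raise ValueError("speaking_rate must be between 5 and 30")

    payload = {
        "text": text,
        "language": language,
        "model_type": model_type,
        "speaking_rate": speaking_rate,
    }
    if audio is not None:
        payload["audio"] = audio
    if emotion is not None:
        payload["emotion"] = format_emotion(emotion)
    if seed is not None:
        payload["seed"] = seed
    return payload


if __name__ == "__main__":
    # Mostly happy, slightly surprised, at a natural conversational rate.
    print(build_input("Hello there.",
                      emotion=[0.5, 0.2, 0.0, 0.0, 0.3, 0.1, 0.0, 0.0],
                      speaking_rate=16,
                      seed=42))
```

Rejecting out-of-range values locally is a design choice for this sketch; the schema itself says an empty or invalid `emotion` falls back to the built-in default rather than raising an error.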
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{"format": "uri", "title": "Output", "type": "string"}
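
Since the output is a single string in URI format, a caller may want to sanity-check the value before trying to download it. This is a small sketch under that assumption; the helper name `looks_like_uri` and the example URL are hypothetical and only illustrate the shape described by the schema.

```python
from urllib.parse import urlparse


def looks_like_uri(value: object) -> bool:
    """Return True if value is a string with a URI scheme and a location part."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    # A usable URI needs a scheme (e.g. https) plus a host or a path.
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


if __name__ == "__main__":
    # Hypothetical output value, for illustration only.
    print(looks_like_uri("https://example.com/output.wav"))
```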