
jaaari/zonos:8fd66c08

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

| Field | Type | Default value | Description |
| --- | --- | --- | --- |
| text | string | | Text to generate speech from |
| audio | string | | Path to audio file for voice cloning (optional) |
| language | string (enum) | en-us | Language code for speech generation. Options: en-us, en-gb, ja, cmn, yue, fr-fr, de |
| model_type | string (enum) | transformer | Model type to use. Options: transformer, hybrid |
| emotion | string | | Optionally pass a comma-separated list of 8 floats for your desired emotion vector in the order [Happiness, Sadness, Disgust, Fear, Surprise, Anger, Other, Neutral]. For example: '0.5,0.2,0.0,0.0,0.3,0.1,0.0,0.0'. If empty or invalid, defaults to the built-in neutralish emotion. |
| speaking_rate | number | 15 | Min: 5, Max: 30. Speaking rate in phonemes per second. Default is 15.0. 10-12 is slow and clear, 15-17 is natural conversational, 20+ is fast. Values above 30 may produce artifacts. |
| seed | integer | | Seed for reproducibility (optional) |
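
The fields above can be assembled into an input payload and checked before any request is sent. The following is a minimal sketch, not the model's own client code: `format_emotion` and `build_input` are hypothetical helper names, the enum values and numeric limits are copied from the schema, and treating `text` as required is an assumption based on it having no default.

```python
from typing import Optional, Sequence

# Allowed values and limits, copied from the input schema.
LANGUAGES = {"en-us", "en-gb", "ja", "cmn", "yue", "fr-fr", "de"}
MODEL_TYPES = {"transformer", "hybrid"}
EMOTION_ORDER = ("Happiness", "Sadness", "Disgust", "Fear",
                 "Surprise", "Anger", "Other", "Neutral")
MIN_RATE, MAX_RATE = 5, 30


def format_emotion(values: Sequence[float]) -> str:
    """Join 8 floats into the comma-separated string used by `emotion`."""
    if len(values) != len(EMOTION_ORDER):
        raise ValueError(
            f"expected {len(EMOTION_ORDER)} values, got {len(values)}")
    return ",".join(str(float(v)) for v in values)


def build_input(text: str,
                audio: Optional[str] = None,
                language: str = "en-us",
                model_type: str = "transformer",
                emotion: Optional[Sequence[float]] = None,
                speaking_rate: float = 15.0,
                seed: Optional[int] = None) -> dict:
    """Hypothetical helper: validate fields and build the input dict."""
    if not text:
        # Assumption: `text` is required because the schema lists no default.
        raise ValueError("text must be a non-empty string")
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unsupported model_type: {model_type!r}")
    if not MIN_RATE <= speaking_rate <= MAX_RATE:
        raise ValueError(
            f"speaking_rate must be between {MIN_RATE} and {MAX_RATE}")

    payload = {
        "text": text,
        "language": language,
        "model_type": model_type,
        "speaking_rate": speaking_rate,
    }
    # Optional fields are left out entirely so the defaults apply.
    if audio is not None:
        payload["audio"] = audio
    if emotion is not None:
        payload["emotion"] = format_emotion(emotion)
    if seed is not None:
        payload["seed"] = seed
    return payload
```

For instance, `build_input("Hello there", speaking_rate=12, seed=7)` produces a dict with a slow, clear speaking rate and a fixed seed. That dict would then be passed as the input of whichever API client is used to run the model.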

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{"format": "uri", "title": "Output", "type": "string"}
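
Since the output is a single string in URI format, a caller may want to sanity-check it and pick a local file name before downloading anything. The sketch below uses only the Python standard library. `output_filename` is a hypothetical helper, and the `"output"` fallback name is an assumption, since the schema does not state the audio file format.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_filename(output: str, fallback: str = "output") -> str:
    """Hypothetical helper: derive a file name from the output URI."""
    if not isinstance(output, str):
        raise TypeError("output must be a string")
    parts = urlparse(output)
    if not parts.scheme:
        # A URI needs at least a scheme such as "https".
        raise ValueError(f"not a URI: {output!r}")
    # Use the last path segment, or the fallback if the path has none.
    return PurePosixPath(parts.path).name or fallback
```

The returned name can then be used as the destination when fetching the URI with any HTTP client.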