tmappdev/cosy_voice_cloner:5c6a1398
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.
Field | Type | Default value | Description |
---|---|---|---|
reference_audio | string | | Path to reference audio (3-10s) |
text | string | | Text to synthesize |
language | string (enum) | English | Language mode. Options: Chinese, English, Japanese, Korean, Cantonese, Mixed |
split_method | string (enum) | By Sentences (4 each) | Text splitting method. Options: None, By Sentences (4 each), By Length (~50 chars), By Chinese Full Stop (。), By English Full Stop (.), By Any Punctuation |
speed | number | 1 | Speech speed (1.0 is normal speed) |
top_k | integer | 20 | Top-K sampling |
top_p | number | 0.6 | Top-P sampling |
temperature | number | 0.6 | Sampling temperature |
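As a rough illustration of how the fields above combine into a request payload, here is a minimal Python sketch. The helper `build_input` is hypothetical and not part of any client library; the defaults and enum options are copied from the table, and it assumes that the two fields without a default (`reference_audio` and `text`) must always be supplied, which the schema does not state explicitly.

```python
from typing import Any

# Defaults and enum options copied from the input schema table.
DEFAULTS: dict[str, Any] = {
    "language": "English",
    "split_method": "By Sentences (4 each)",
    "speed": 1.0,
    "top_k": 20,
    "top_p": 0.6,
    "temperature": 0.6,
}
LANGUAGES = {"Chinese", "English", "Japanese", "Korean", "Cantonese", "Mixed"}
SPLIT_METHODS = {
    "None",
    "By Sentences (4 each)",
    "By Length (~50 chars)",
    "By Chinese Full Stop (。)",
    "By English Full Stop (.)",
    "By Any Punctuation",
}


def build_input(reference_audio: str, text: str, **overrides: Any) -> dict[str, Any]:
    """Hypothetical helper: merge caller overrides with the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    payload = {"reference_audio": reference_audio, "text": text, **DEFAULTS, **overrides}

    # Enum fields must use one of the listed options.
    if payload["language"] not in LANGUAGES:
        raise ValueError(f"unsupported language: {payload['language']!r}")
    if payload["split_method"] not in SPLIT_METHODS:
        raise ValueError(f"unsupported split_method: {payload['split_method']!r}")

    # top_k is typed as integer in the schema (bool is excluded on purpose).
    if not isinstance(payload["top_k"], int) or isinstance(payload["top_k"], bool):
        raise ValueError("top_k must be an integer")
    return payload


payload = build_input("reference.wav", "Hello there.", speed=1.2)
```

The resulting dictionary could then be passed as the `input` argument of whichever API client you use. The file name `reference.wav` is a placeholder; whether the API expects a local path, an upload, or a URL for `reference_audio` is not specified here.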
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'format': 'uri', 'title': 'Output', 'type': 'string'}
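The schema above describes the output as a single string in URI format. A minimal sketch of a client-side sanity check, using only the Python standard library, might look like the following. The function name `is_uri_string` is hypothetical, and the check is deliberately loose: it only confirms that the value is a string with a scheme and something after it, not that the URI is reachable.

```python
from urllib.parse import urlparse


def is_uri_string(output: object) -> bool:
    """Hypothetical check that a model output matches {'type': 'string', 'format': 'uri'}."""
    if not isinstance(output, str):
        return False
    parsed = urlparse(output)
    # A URI needs a scheme plus either a network location or a path.
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)
```

For instance, `is_uri_string("https://example.com/out.wav")` passes this check, while a list or a plain sentence does not. The example URL is a placeholder, not a real output of this model.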