tmappdev/cosy_voice_cloner:51a8d8dd
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.
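Field | Type | Default value | Description |
---|---|---|---|
ref_audio | string | | Reference audio file (3-10 seconds) |
prompt_text | string | | Text of the reference audio (optional) |
prompt_language | string (enum) | 粤语 | Language of the reference audio. Options: 中文 (Chinese), 英文 (English), 日文 (Japanese), 粤语 (Cantonese), 韩文 (Korean), 中英混合 (Chinese-English mixed), 日英混合 (Japanese-English mixed), 粤英混合 (Cantonese-English mixed), 韩英混合 (Korean-English mixed), 多语种混合 (multilingual mixed), 多语种混合(粤语) (multilingual mixed, Cantonese) |
text | string | | Text to synthesize |
text_language | string (enum) | 粤语 | Language of the text to synthesize. Options: 中文 (Chinese), 英文 (English), 日文 (Japanese), 粤语 (Cantonese), 韩文 (Korean), 中英混合 (Chinese-English mixed), 日英混合 (Japanese-English mixed), 粤英混合 (Cantonese-English mixed), 韩英混合 (Korean-English mixed), 多语种混合 (multilingual mixed), 多语种混合(粤语) (multilingual mixed, Cantonese) |
how_to_cut | string (enum) | 按标点符号切 | How to split the text. Options: 不切 (no split), 凑四句一切 (split every four sentences), 凑50字一切 (split every 50 characters), 按中文句号。切 (split on the Chinese full stop 。), 按英文句号.切 (split on the English period .), 按标点符号切 (split on punctuation marks) |
top_k | integer | 15 | GPT top_k parameter. Min: 1, Max: 100 |
top_p | number | 1 | GPT top_p parameter. Max: 1 |
temperature | number | 1 | GPT temperature parameter. Max: 1 |
ref_free | boolean | False | Enable reference-free mode |
speed | number | 1 | Speech speed adjustment. Min: 0.6, Max: 1.65 |
reference_files | array | | Optional additional reference files to blend |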
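
As an illustration of the schema above, here is a minimal Python sketch that fills in the documented defaults and checks the documented limits before a request is sent. The helper name `build_input`, the example URL, and the choice to treat `ref_audio` and `text` as required are assumptions made for this sketch; they are not part of the model's API.

```python
# Hypothetical helper: mirrors the defaults and limits listed in the input schema.
DEFAULTS = {
    "prompt_language": "粤语",
    "text_language": "粤语",
    "how_to_cut": "按标点符号切",
    "top_k": 15,
    "top_p": 1,
    "temperature": 1,
    "ref_free": False,
    "speed": 1,
}

LANGUAGES = {
    "中文", "英文", "日文", "粤语", "韩文", "中英混合", "日英混合",
    "粤英混合", "韩英混合", "多语种混合", "多语种混合(粤语)",
}
CUT_MODES = {"不切", "凑四句一切", "凑50字一切", "按中文句号。切", "按英文句号.切", "按标点符号切"}

# (min, max) bounds from the schema; None means no bound is documented.
RANGES = {
    "top_k": (1, 100),
    "top_p": (None, 1),
    "temperature": (None, 1),
    "speed": (0.6, 1.65),
}

# Fields without a documented default that a caller may still set.
OPTIONAL_FIELDS = {"prompt_text", "reference_files"}


def build_input(ref_audio, text, **overrides):
    """Return an input dict with defaults applied, or raise ValueError."""
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    # Assumption: ref_audio and text are always supplied by the caller.
    payload = {**DEFAULTS, **overrides, "ref_audio": ref_audio, "text": text}

    for field in ("prompt_language", "text_language"):
        if payload[field] not in LANGUAGES:
            raise ValueError(f"{field}: unsupported option {payload[field]!r}")
    if payload["how_to_cut"] not in CUT_MODES:
        raise ValueError(f"how_to_cut: unsupported option {payload['how_to_cut']!r}")
    if not isinstance(payload["top_k"], int):
        raise ValueError("top_k must be an integer")

    for field, (low, high) in RANGES.items():
        value = payload[field]
        if low is not None and value < low:
            raise ValueError(f"{field}={value} is below the minimum {low}")
        if high is not None and value > high:
            raise ValueError(f"{field}={value} is above the maximum {high}")
    return payload


if __name__ == "__main__":
    # The URL is a placeholder, not a real reference file.
    print(build_input("https://example.com/reference.wav", "Hello", text_language="英文"))
```

The enum values are passed exactly as listed, in Chinese; the English glosses in the table are for reading only.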
Output schema
The shape of the response you’ll get when you run this model with an API.
{"type": "string", "title": "Output", "format": "uri"}
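
Since the output is a single URI string, a client will usually want to check it and pick a local file name before downloading. The sketch below shows one way to do that with the Python standard library. The function name `output_filename` and the fallback name are hypothetical, and the schema does not say which audio format the URI points to.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_filename(output, fallback="output"):
    """Validate the output URI and return the last path segment as a file name."""
    if not isinstance(output, str):
        raise TypeError("the output schema specifies a string")

    parsed = urlparse(output)
    # A URI needs a scheme, plus either a host or a path to point at something.
    if not parsed.scheme or not (parsed.netloc or parsed.path):
        raise ValueError(f"not a usable URI: {output!r}")

    # Fall back to a generic name when the path has no final segment.
    return PurePosixPath(parsed.path).name or fallback


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    print(output_filename("https://example.com/files/result.wav"))
```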