tmappdev/cosy_voice_cloner:51a8d8dd

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

| Field | Type | Default value | Description |
| --- | --- | --- | --- |
| ref_audio | string | | Reference audio file (3-10 seconds) |
| prompt_text | string | | Text of the reference audio (optional) |
| prompt_language | None | 粤语 (Cantonese) | Language of reference audio |
| text | string | | Text to synthesize |
| text_language | None | 粤语 (Cantonese) | Language of the text to synthesize |
| how_to_cut | None | 按标点符号切 (split at punctuation marks) | How to split text |
| top_k | integer | 15 | GPT top_k parameter. Min: 1, max: 100. |
| top_p | number | 1 | GPT top_p parameter. Max: 1. |
| temperature | number | 1 | GPT temperature parameter. Max: 1. |
| ref_free | boolean | False | Enable reference-free mode |
| speed | number | 1 | Speech speed adjustment. Min: 0.6, max: 1.65. |
| reference_files | array | | Optional additional reference files to blend |
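
The defaults and limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model's own tooling: `build_input`, `DEFAULTS`, `BOUNDS`, and `ALLOWED_FIELDS` are hypothetical names, the example URL is a placeholder, and treating `ref_audio` and `text` as required is an assumption, since the schema does not mark any field as required.

```python
from typing import Any, Dict

# Field names, defaults, and limits copied from the input schema.
ALLOWED_FIELDS = {
    "ref_audio", "prompt_text", "prompt_language", "text", "text_language",
    "how_to_cut", "top_k", "top_p", "temperature", "ref_free", "speed",
    "reference_files",
}
DEFAULTS: Dict[str, Any] = {
    "prompt_language": "粤语",      # Cantonese
    "text_language": "粤语",        # Cantonese
    "how_to_cut": "按标点符号切",    # split at punctuation marks
    "top_k": 15,
    "top_p": 1,
    "temperature": 1,
    "ref_free": False,
    "speed": 1,
}
# (min, max); None means the schema lists no limit on that side.
BOUNDS = {
    "top_k": (1, 100),
    "top_p": (None, 1),
    "temperature": (None, 1),
    "speed": (0.6, 1.65),
}


def build_input(ref_audio: str, text: str, **overrides: Any) -> Dict[str, Any]:
    """Merge overrides onto the schema defaults and check the numeric limits."""
    unknown = set(overrides) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {**DEFAULTS, "ref_audio": ref_audio, "text": text, **overrides}
    for name, (low, high) in BOUNDS.items():
        value = payload[name]
        if low is not None and value < low:
            raise ValueError(f"{name}={value} is below the minimum {low}")
        if high is not None and value > high:
            raise ValueError(f"{name}={value} is above the maximum {high}")
    return payload


if __name__ == "__main__":
    # Placeholder URL; any reachable 3-10 second reference clip would do.
    print(build_input("https://example.com/ref.wav", "Hello", speed=1.2))
```

The returned dictionary has the shape of the `input` object described by the table. Catching an out-of-range `speed` or `top_k` locally avoids a round trip that the API would presumably reject anyway.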

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{'format': 'uri', 'title': 'Output', 'type': 'string'}
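
Since the schema describes the output as a single string in `uri` format, a caller can sanity-check the response before trying to download it. This is a loose sketch: `is_valid_output` is a hypothetical helper, and requiring a scheme plus a host or path is an assumption about what counts as a usable URI here, not a full validation of the `uri` format.

```python
from urllib.parse import urlparse


def is_valid_output(value: object) -> bool:
    """Return True if value is a string that parses as a URI with a scheme."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    # A usable URI needs a scheme and either a host or a path.
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


if __name__ == "__main__":
    # Placeholder URL standing in for a real model response.
    print(is_valid_output("https://example.com/output.wav"))
```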