tmappdev / cosy_voice_cloner

  • Public
  • 52 runs
  • A100 (80GB)

Input

file (required)

Reference audio file (3-10 seconds)

string

Text of the reference audio (optional)

Default: ""

string

Language of reference audio

Default: "粤语" (Cantonese)

string (required)

Text to synthesize

string

Language of the text to synthesize

Default: "粤语" (Cantonese)

string

How to split text

Default: "按标点符号切" (split at punctuation marks)

integer
(minimum: 1, maximum: 100)

GPT top_k parameter

Default: 15

number
(minimum: 0, maximum: 1)

GPT top_p parameter

Default: 1

number
(minimum: 0, maximum: 1)

GPT temperature parameter

Default: 1

boolean

Enable reference-free mode

Default: false

number
(minimum: 0.6, maximum: 1.65)

Speech speed adjustment

Default: 1

file[]

Optional additional reference files to blend
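
The types, ranges, and defaults listed above can be checked before a prediction is submitted. The following is a minimal sketch only: this page does not show the model's input field names, so every key below (`ref_audio`, `text`, `how_to_cut`, and so on) is a hypothetical placeholder, and the helper `build_input` is not part of any published API. Only the bounds and default values are taken from the listing.

```python
from typing import Any

# Hypothetical key names: the listing gives types, ranges, and defaults,
# but not the actual field names used by the model.
DEFAULTS: dict[str, Any] = {
    "prompt_text": "",             # text of the reference audio (optional)
    "prompt_language": "粤语",      # language of the reference audio
    "text_language": "粤语",        # language of the text to synthesize
    "how_to_cut": "按标点符号切",     # how to split the text
    "top_k": 15,
    "top_p": 1.0,
    "temperature": 1.0,
    "ref_free": False,             # reference-free mode
    "speed": 1.0,
}

# Inclusive bounds copied from the listing above.
RANGES: dict[str, tuple[float, float]] = {
    "top_k": (1, 100),
    "top_p": (0.0, 1.0),
    "temperature": (0.0, 1.0),
    "speed": (0.6, 1.65),
}

REQUIRED = ("ref_audio", "text")   # the two required inputs
OPTIONAL_EXTRA = ("extra_refs",)   # optional additional reference files


def build_input(**overrides: Any) -> dict[str, Any]:
    """Merge overrides into the defaults and enforce the documented bounds."""
    allowed = set(DEFAULTS) | set(REQUIRED) | set(OPTIONAL_EXTRA)
    unknown = set(overrides) - allowed
    if unknown:
        raise KeyError(f"unknown inputs: {sorted(unknown)}")

    payload = {**DEFAULTS, **overrides}

    for key in REQUIRED:
        if not payload.get(key):
            raise ValueError(f"missing required input: {key}")

    # top_k is documented as an integer; bool is rejected explicitly
    # because bool is a subclass of int in Python.
    top_k = payload["top_k"]
    if isinstance(top_k, bool) or not isinstance(top_k, int):
        raise TypeError("top_k must be an integer")

    for key, (low, high) in RANGES.items():
        if not low <= payload[key] <= high:
            raise ValueError(f"{key}={payload[key]} is outside [{low}, {high}]")

    return payload
```

For example, `build_input(ref_audio="ref.wav", text="hello", speed=1.2)` returns a dictionary with the remaining fields filled from the defaults, while `speed=2.0` raises `ValueError` because it exceeds the listed maximum of 1.65.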

Run time and cost

This model runs on Nvidia A100 (80GB) GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This model doesn't have a readme.