
zsxkib /realistic-voice-cloning:ab6f63c8

Input

string

Link to a song on YouTube or path to a local audio file. Should be enclosed in double quotes for Windows and single quotes for Unix-like systems.

string

Name of folder in rvc_models directory containing your .pth and .index files for a specific voice.

number

Change pitch of AI vocals in octaves. Set to 0 for no change. Generally, use 1 for male to female conversions and -1 for vice-versa.

Default: 0
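As a rough illustration of what an octave shift means numerically, the sketch below converts a shift in octaves to semitones and to a frequency ratio. The function names are hypothetical and are not part of this model; the math is the standard equal-temperament relationship (12 semitones per octave, frequency doubling per octave).

```python
def octaves_to_semitones(octaves: float) -> float:
    """One octave spans 12 semitones."""
    return 12 * octaves


def semitones_to_ratio(semitones: float) -> float:
    """Frequency multiplier for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12)


def octaves_to_ratio(octaves: float) -> float:
    """Frequency multiplier for a shift of the given number of octaves."""
    return semitones_to_ratio(octaves_to_semitones(octaves))
```

Under this convention, a value of 1 doubles every pitch frequency and a value of -1 halves it.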

boolean

Keep all intermediate audio files generated, e.g. isolated AI vocals/instrumentals. Leave disabled to save space.

Default: false

number

Control how much of the AI's accent to leave in the vocals. 0 <= INDEX_RATE <= 1.

Default: 0.5

integer

If >=3: apply median filtering to the harvested pitch results. 0 <= FILTER_RADIUS <= 7.

Default: 3
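To make the median filtering step concrete, here is a minimal sketch of how a median filter removes isolated spikes from a pitch contour. It is an assumption-laden illustration, not the model's actual code: the 3-sample kernel, the truncated window at the edges, and the names `median_filter` and `smooth_pitch` are all hypothetical.

```python
from statistics import median


def median_filter(values, kernel_size=3):
    """Replace each sample with the median of its neighbourhood.

    The window is truncated at the edges (an assumption of this sketch).
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    half = kernel_size // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


def smooth_pitch(f0, filter_radius):
    """Apply the filter only when filter_radius >= 3, as described above."""
    if filter_radius >= 3:
        return median_filter(f0, kernel_size=3)  # kernel size is assumed
    return list(f0)
```

For example, a contour such as `[100, 100, 400, 100, 100]` has its single outlier replaced by the surrounding value when `filter_radius` is 3, and is returned unchanged when `filter_radius` is 2.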

number

Control how much to use the original vocal's loudness (0) or a fixed loudness (1). 0 <= RMS_MIX_RATE <= 1.

Default: 0.25

string

Pitch detection algorithm. Best option is rmvpe (clarity in vocals), then mangio-crepe (smoother vocals).

Default: "rmvpe"

integer

Controls how often it checks for pitch changes, in milliseconds, when using the `mangio-crepe` algorithm specifically. Lower values lead to longer conversions and a higher risk of voice cracks, but better pitch accuracy.

Default: 128

number

Control how much of the original vocals' breath and voiceless consonants to leave in the AI vocals. Set to 0.5 to disable. 0 <= PROTECT <= 0.5.

Default: 0.33

number

Control volume of main AI vocals. Use -3 to decrease the volume by 3 decibels, or 3 to increase the volume by 3 decibels.

Default: 0

number

Control volume of backup AI vocals.

Default: 0

number

Control volume of the background music/instrumentals.

Default: 0
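The three volume controls above are expressed in decibels. As a minimal sketch of what such a value does to the audio, the snippet below converts a decibel change to a linear amplitude gain using the standard formula gain = 10^(dB/20). The function names are hypothetical, and this is not taken from the model's source.

```python
def db_to_gain(db: float) -> float:
    """Convert a decibel change to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def apply_gain(samples, db: float):
    """Scale a sequence of audio samples by the given decibel change."""
    gain = db_to_gain(db)
    return [s * gain for s in samples]
```

With this formula, 0 leaves the signal untouched, -3 scales the amplitude to roughly 71% of the original, and 3 scales it to roughly 141%.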

number

Change pitch/key of background music, backup vocals and AI vocals in semitones. Reduces sound quality slightly.

Default: 0

number
(minimum: 0, maximum: 1)

The larger the room, the longer the reverb time. 0 <= REVERB_SIZE <= 1.

Default: 0.15

number
(minimum: 0, maximum: 1)

Level of AI vocals with reverb. 0 <= REVERB_WETNESS <= 1.

Default: 0.2

number
(minimum: 0, maximum: 1)

Level of AI vocals without reverb. 0 <= REVERB_DRYNESS <= 1.

Default: 0.8

number
(minimum: 0, maximum: 1)

Absorption of high frequencies in the reverb. 0 <= REVERB_DAMPING <= 1.

Default: 0.7
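The wet and dry levels above can be read as the weights of a simple mix between the processed (reverberated) vocals and the unprocessed vocals. The sketch below shows that mix under the assumption that both signals are sample-aligned lists of equal length; `mix_reverb` is a hypothetical helper, and producing the reverberated signal itself (where room size and damping apply) is out of scope here.

```python
def mix_reverb(dry_signal, wet_signal, dryness=0.8, wetness=0.2):
    """Blend unprocessed and reverberated samples by their levels.

    Defaults mirror the documented defaults (dry 0.8, wet 0.2).
    """
    for name, level in (("dryness", dryness), ("wetness", wetness)):
        if not 0.0 <= level <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if len(dry_signal) != len(wet_signal):
        raise ValueError("signals must have the same length")
    return [dryness * d + wetness * w for d, w in zip(dry_signal, wet_signal)]
```

Raising the wet level relative to the dry level makes the vocals sound further away in the room; a wet level of 0 leaves only the scaled unprocessed vocals.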

string

Output audio format: wav for best quality and large file size, mp3 for decent quality and small file size.

Default: "mp3"
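Several inputs above have documented ranges, so it can be useful to check a set of values before submitting a prediction. The sketch below does that with plain Python. The field names are assumptions: they are lowercase versions of the names used in the descriptions (INDEX_RATE, FILTER_RADIUS, and so on), since the actual input keys are not shown on this page.

```python
# Ranges and defaults as documented above; the key names are assumed.
RANGES = {
    "index_rate": (0.0, 1.0),
    "filter_radius": (0, 7),
    "rms_mix_rate": (0.0, 1.0),
    "protect": (0.0, 0.5),
    "reverb_size": (0.0, 1.0),
    "reverb_wetness": (0.0, 1.0),
    "reverb_dryness": (0.0, 1.0),
    "reverb_damping": (0.0, 1.0),
}

DEFAULTS = {
    "index_rate": 0.5,
    "filter_radius": 3,
    "rms_mix_rate": 0.25,
    "protect": 0.33,
    "reverb_size": 0.15,
    "reverb_wetness": 0.2,
    "reverb_dryness": 0.8,
    "reverb_damping": 0.7,
}


def validate_inputs(params):
    """Return a list of messages for values outside their documented range."""
    problems = []
    for name, value in params.items():
        if name not in RANGES:
            continue  # no documented range for this field
        lo, hi = RANGES[name]
        if not lo <= value <= hi:
            problems.append(f"{name}={value} is outside [{lo}, {hi}]")
    return problems
```

Calling `validate_inputs({**DEFAULTS, "protect": 0.6})` returns a single message about `protect`, while the defaults on their own produce an empty list.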

Output
