zsxkib /tortoise-then-rvc:775570f6
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
Field | Type | Default value | Description |
---|---|---|---|
text | string | The expressiveness of autoregressive transformers is literally nuts! I absolutely adore them. | TorToiSe: The text to speak. |
voice_a | string (enum) | random. Options: angie, cond_latent_example, deniro, freeman, halle, lj, myself, pat2, snakes, tom, train_daws, train_dreams, train_grace, train_lescault, weaver, applejack, daniel, emma, geralt, jlaw, mol, pat, rainbow, tim_reynolds, train_atkins, train_dotrice, train_empire, train_kennard, train_mouse, william, random, custom_voice, disabled | TorToiSe: Selects the voice to use for generation. Use `random` to select a random voice. Use `custom_voice` to use a custom voice. |
custom_voice | string | | TorToiSe: (Optional) Create a custom voice based on an mp3 file of a speaker. Audio should be at least 15 seconds long, contain only one speaker, and be in mp3 format. Overrides the `voice_a` input. |
voice_b | string (enum) | disabled. Options: angie, cond_latent_example, deniro, freeman, halle, lj, myself, pat2, snakes, tom, train_daws, train_dreams, train_grace, train_lescault, weaver, applejack, daniel, emma, geralt, jlaw, mol, pat, rainbow, tim_reynolds, train_atkins, train_dotrice, train_empire, train_kennard, train_mouse, william, random, custom_voice, disabled | TorToiSe: (Optional) Create a new voice by averaging the latents for `voice_a`, `voice_b` and `voice_c`. Use `disabled` to disable voice mixing. |
voice_c | string (enum) | disabled. Options: angie, cond_latent_example, deniro, freeman, halle, lj, myself, pat2, snakes, tom, train_daws, train_dreams, train_grace, train_lescault, weaver, applejack, daniel, emma, geralt, jlaw, mol, pat, rainbow, tim_reynolds, train_atkins, train_dotrice, train_empire, train_kennard, train_mouse, william, random, custom_voice, disabled | TorToiSe: (Optional) Create a new voice by averaging the latents for `voice_a`, `voice_b` and `voice_c`. Use `disabled` to disable voice mixing. |
preset | string (enum) | fast. Options: ultra_fast, fast, standard, high_quality | Which voice preset to use. See the documentation for more information. |
seed | integer | 0 | TorToiSe: Random seed, which can be used to reproduce results. |
cvvp_amount | number | 0. Max: 1 | TorToiSe: How much the CVVP model should influence the output. Increasing this can in some cases reduce the likelihood of multiple speakers. Defaults to 0 (disabled). |
pre_process_with_rvc | boolean | True | Use Realistic Voice Cloning v2 (RVCv2) to further enhance the voice created by TorToiSe Text-To-Speech. |
rvc_model | string (enum) | CUSTOM. Options: CUSTOM, Squidward, MrKrabs, Plankton, Drake, Vader, Trump, Biden, Obama, Guitar, Voilin | RVC model for a specific voice. If using a custom model, this should match the name of the downloaded model. If a `custom_rvc_model_download_url` is provided, this will be automatically set to the name of the downloaded model. |
custom_rvc_model_download_url | string | | RVC: (When `pre_process_with_rvc=True`) URL to download a custom RVC model. If provided, the model will be downloaded (if it doesn't already exist) and used for prediction, regardless of the `rvc_model` value. |
pitch_change | string (enum) | no-change. Options: no-change, male-to-female, female-to-male | RVC: (When `pre_process_with_rvc=True`) Adjust pitch of AI vocals. Options: `no-change`, `male-to-female`, `female-to-male`. |
index_rate | number | 0.5. Max: 1 | RVC: (When `pre_process_with_rvc=True`) Control how much of the AI's accent to leave in the vocals. |
filter_radius | integer | 3. Max: 7 | RVC: (When `pre_process_with_rvc=True`) If >=3: apply median filtering to the harvested pitch results. |
rms_mix_rate | number | 0.25. Max: 1 | RVC: (When `pre_process_with_rvc=True`) Control how much to use the original vocal's loudness (0) or a fixed loudness (1). |
pitch_detection_algorithm | string (enum) | rmvpe. Options: rmvpe, mangio-crepe | RVC: (When `pre_process_with_rvc=True`) Best option is rmvpe (clarity in vocals), then mangio-crepe (smoother vocals). |
crepe_hop_length | integer | 128 | RVC: (When `pre_process_with_rvc=True`) When `pitch_detection_algorithm` is set to `mangio-crepe`, this controls how often it checks for pitch changes in milliseconds. Lower values lead to longer conversions and higher risk of voice cracks, but better pitch accuracy. |
protect | number | 0.33. Max: 0.5 | RVC: (When `pre_process_with_rvc=True`) Control how much of the original vocals' breath and voiceless consonants to leave in the AI vocals. Set 0.5 to disable. |
main_vocals_volume_change | number | 0 | RVC: (When `pre_process_with_rvc=True`) Control volume of main AI vocals. Use -3 to decrease the volume by 3 decibels, or 3 to increase the volume by 3 decibels. |
backup_vocals_volume_change | number | 0 | RVC: (When `pre_process_with_rvc=True`) Control volume of backup AI vocals. |
instrumental_volume_change | number | 0 | RVC: (When `pre_process_with_rvc=True`) Control volume of the background music/instrumentals. |
pitch_change_all | number | 0 | RVC: (When `pre_process_with_rvc=True`) Change pitch/key of background music, backup vocals and AI vocals in semitones. Reduces sound quality slightly. |
reverb_size | number | 0.15. Max: 1 | RVC: (When `pre_process_with_rvc=True`) The larger the room, the longer the reverb time. |
reverb_wetness | number | 0.2. Max: 1 | RVC: (When `pre_process_with_rvc=True`) Level of AI vocals with reverb. |
reverb_dryness | number | 0.8. Max: 1 | RVC: (When `pre_process_with_rvc=True`) Level of AI vocals without reverb. |
reverb_damping | number | 0.7. Max: 1 | RVC: (When `pre_process_with_rvc=True`) Absorption of high frequencies in the reverb. |
output_format | string (enum) | mp3. Options: mp3, wav | RVC: (When `pre_process_with_rvc=True`) wav for best quality and large file size, mp3 for decent quality and small file size. |
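As a rough illustration of how the fields above fit together, the sketch below assembles an input payload and checks it against a few of the documented `Max` values and enum options before any request is made. The helper `build_input` and the two lookup tables are hypothetical names introduced here, not part of the model or of any client library, and only the upper bounds listed in the schema are checked because no minimums are documented.

```python
# Upper bounds copied from the "Max" entries in the input schema.
MAX_VALUES = {
    "cvvp_amount": 1,
    "index_rate": 1,
    "filter_radius": 7,
    "rms_mix_rate": 1,
    "protect": 0.5,
    "reverb_size": 1,
    "reverb_wetness": 1,
    "reverb_dryness": 1,
    "reverb_damping": 1,
}

# Allowed values for a few of the enum fields (voice lists omitted for brevity).
ENUM_OPTIONS = {
    "preset": {"ultra_fast", "fast", "standard", "high_quality"},
    "pitch_change": {"no-change", "male-to-female", "female-to-male"},
    "pitch_detection_algorithm": {"rmvpe", "mangio-crepe"},
    "output_format": {"mp3", "wav"},
}


def build_input(text, **overrides):
    """Hypothetical helper: return an input dict, rejecting out-of-schema values.

    Fields that are not passed are simply left out, so the model falls back
    to its documented defaults for them.
    """
    payload = {"text": text, **overrides}
    for field, value in payload.items():
        if field in MAX_VALUES and value > MAX_VALUES[field]:
            raise ValueError(f"{field}={value} exceeds Max {MAX_VALUES[field]}")
        if field in ENUM_OPTIONS and value not in ENUM_OPTIONS[field]:
            raise ValueError(f"{field}={value!r} is not one of the listed options")
    return payload


payload = build_input(
    "Hello there.",
    preset="fast",
    protect=0.33,
    output_format="wav",
)
```

The resulting `payload` dict is what you would pass as the model input through whichever API client you use; the full version identifier is not shown on this page, so it is left out here.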
Output schema
The shape of the response you’ll get when you run this model with an API.
{'items': {'format': 'uri', 'type': 'string'},
'title': 'Output',
'type': 'array',
'x-cog-array-type': 'iterator'}
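The schema describes the output as an array of URI strings that is produced as an iterator, so items may arrive one at a time. A minimal sketch of consuming such an output is shown below; `collect_output_uris` and `fake_output` are hypothetical names, and the generator merely stands in for a real prediction result using a placeholder URL.

```python
from urllib.parse import urlparse


def collect_output_uris(output):
    """Hypothetical helper: drain an iterator of URI strings into a list."""
    uris = []
    for item in output:
        # The schema declares each item as a string with format "uri".
        if not urlparse(item).scheme:
            raise ValueError(f"not a URI: {item!r}")
        uris.append(item)
    return uris


def fake_output():
    # Stand-in for the model's iterator output; the URL is a placeholder.
    yield "https://example.com/outputs/speech.mp3"


uris = collect_output_uris(fake_output())
```

Collecting into a list is a convenience for small outputs; iterating directly works just as well when you want to handle each file as soon as it is available.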