pseudoram / rvc-v2

Speech-to-speech conversion with any RVC v2-trained AI voice

  • Public
  • 623.1K runs
  • T4
  • GitHub
  • License

Input

file

Upload your audio file here.

string

RVC model for a specific voice. If using a custom model, this should match the name of the downloaded model. If a 'custom_rvc_model_download_url' is provided, this will be automatically set to the name of the downloaded model.

Default: "Obama"

string

URL to download a custom RVC model. If provided, the model will be downloaded (if it doesn't already exist) and used for prediction, regardless of the 'rvc_model' value.

number

Adjust pitch of AI vocals in semitones. Use positive values to increase pitch, negative to decrease.

Default: 0
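For intuition about what a semitone shift means: in twelve-tone equal temperament, shifting by n semitones multiplies frequency by 2^(n/12). The sketch below only illustrates that relationship; it is not taken from the model's code, and the function name `semitone_ratio` is made up for this example.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


# +12 semitones is one octave up (frequency doubles); -12 is one octave down.
print(semitone_ratio(12))   # 2.0
print(semitone_ratio(-12))  # 0.5
```

A shift of 0 leaves the pitch unchanged, which matches the default.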

number
(minimum: 0, maximum: 1)

Control how much of the AI's accent to leave in the vocals.

Default: 0.5

integer
(minimum: 0, maximum: 7)

If 3 or greater, apply median filtering to the harvested pitch results.

Default: 3
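As a rough illustration of what median filtering does to a pitch contour, here is a minimal pure-Python sketch. It is an assumption-laden example, not the model's implementation: the window size of 3, the edge handling, and the sample values are all chosen for illustration, and this page does not state how the setting maps to the filter window.

```python
from statistics import median


def median_filter(values, kernel_size=3):
    """Replace each value with the median of a window centred on it.

    Near the edges the window is simply truncated (an illustrative choice).
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    half = kernel_size // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Made-up pitch contour (Hz) with a single octave-jump glitch at index 2.
pitch_hz = [220.0, 221.0, 440.0, 222.0, 223.0]
print(median_filter(pitch_hz))  # [220.5, 221.0, 222.0, 223.0, 222.5]
```

The isolated 440 Hz spike disappears because it is never the middle value of any three-sample window.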

number
(minimum: 0, maximum: 1)

Control the balance between the original vocal's loudness (0) and a fixed loudness (1).

Default: 0.25

string

Pitch detection algorithm. 'rmvpe' for clarity in vocals, 'mangio-crepe' for smoother vocals.

Default: "rmvpe"

integer

When `f0_method` is set to `mangio-crepe`, this controls how often, in milliseconds, it checks for pitch changes.

Default: 128

number
(minimum: 0, maximum: 0.5)

Control how much of the original vocals' breath and voiceless consonants to leave in the AI vocals. Set to 0.5 to disable.

Default: 0.33

string

wav for best quality and large file size, mp3 for decent quality and small file size.

Default: "mp3"
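The inputs above can be collected into a single payload and checked against the documented ranges before a prediction is submitted. The sketch below is a hedged example: only `rvc_model`, `custom_rvc_model_download_url`, and `f0_method` are named on this page, so every other key (`song_input`, `pitch_change`, `index_rate`, `filter_radius`, `rms_mix_rate`, `crepe_hop_length`, `protect`, `output_format`) and the helper `build_rvc_input` are assumptions that should be checked against the model's API schema. The allowed values for the pitch method and output format are taken from the descriptions above.

```python
from typing import Any, Dict, Optional


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError if value falls outside the documented [low, high] range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_rvc_input(
    audio_url: str,
    rvc_model: str = "Obama",
    custom_rvc_model_download_url: Optional[str] = None,
    pitch_change: float = 0,       # semitones; key name is an assumption
    index_rate: float = 0.5,       # 0..1; key name is an assumption
    filter_radius: int = 3,        # 0..7; key name is an assumption
    rms_mix_rate: float = 0.25,    # 0..1; key name is an assumption
    f0_method: str = "rmvpe",
    crepe_hop_length: int = 128,   # key name is an assumption
    protect: float = 0.33,         # 0..0.5; key name is an assumption
    output_format: str = "mp3",    # key name is an assumption
) -> Dict[str, Any]:
    """Build an input payload using the defaults and ranges listed on this page."""
    _check_range("index_rate", index_rate, 0, 1)
    _check_range("filter_radius", filter_radius, 0, 7)
    _check_range("rms_mix_rate", rms_mix_rate, 0, 1)
    _check_range("protect", protect, 0, 0.5)
    if f0_method not in ("rmvpe", "mangio-crepe"):
        raise ValueError("f0_method must be 'rmvpe' or 'mangio-crepe'")
    if output_format not in ("mp3", "wav"):
        raise ValueError("output_format must be 'mp3' or 'wav'")

    payload: Dict[str, Any] = {
        "song_input": audio_url,  # assumed key for the uploaded audio file
        "rvc_model": rvc_model,
        "pitch_change": pitch_change,
        "index_rate": index_rate,
        "filter_radius": filter_radius,
        "rms_mix_rate": rms_mix_rate,
        "f0_method": f0_method,
        "crepe_hop_length": crepe_hop_length,
        "protect": protect,
        "output_format": output_format,
    }
    # Only send the download URL when a custom model is actually requested.
    if custom_rvc_model_download_url:
        payload["custom_rvc_model_download_url"] = custom_rvc_model_download_url
    return payload


payload = build_rvc_input("https://example.com/voice.wav", pitch_change=2)
print(payload["rvc_model"], payload["f0_method"], payload["output_format"])
```

A payload like this would typically be passed as the `input` argument of a Replicate client call; the example stops short of making a network request so that it runs offline.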

Output


Run time and cost

This model costs approximately $0.0029 to run on Replicate, or 344 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
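The runs-per-dollar figure follows directly from the per-run price. The short sketch below reproduces that arithmetic and estimates the cost of a batch; the helper names are made up for illustration, and the price is the approximate figure quoted above, which varies with your inputs.

```python
import math

COST_PER_RUN_USD = 0.0029  # approximate per-run price quoted on this page


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in one dollar."""
    return math.floor(1 / cost_per_run)


def estimated_cost(n_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Rough cost in USD for n_runs predictions at a flat per-run price."""
    return n_runs * cost_per_run


print(runs_per_dollar())                # 344
print(f"{estimated_cost(1000):.2f}")    # 2.90
```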

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 13 seconds.

Readme

RVC Voice Transformer v2

Transform any voice with the power of RVC v2 models!

🌟 Highlights

  • User-friendly
  • Custom RVC v2 model support
  • Fine-tuned voice conversion settings

🔍 Model Hunt

Discover new voices on platforms like Hugging Face. Find, import, and start transforming!

⚠️ Use Responsibly

Avoid using for:

  • Personal attacks
  • Ideological propaganda
  • Unauthorized impersonation
  • Fraudulent activities

Remember: With great power comes great responsibility!

🙏 Credits

Built upon SociallyIneptWeeb’s AICoverGen, inspired by zsxkib, reimagined by PseudoRAM.

📜 License

MIT License - See LICENSE for the fine print.


Disclaimer: Creator not responsible for misuse. Use at your own risk!