gianpaj / cog-orpheus-3b-0.1-ft

Spanish and Italian text-to-speech model from Canopy Labs (3b-es_it-ft-research_release)

  • Public
  • 153 runs
  • L40S

Input

  • Text to convert to speech (string, required)
  • Voice to use (string). Default: "javi"
  • Temperature for generation (number, minimum: 0.1, maximum: 1.5). Default: 0.6
  • Top P for nucleus sampling (number, minimum: 0.1, maximum: 1). Default: 0.95
  • Repetition penalty (number, minimum: 1, maximum: 2). Default: 1.1
  • Maximum number of tokens to generate (integer, minimum: 100, maximum: 2000). Default: 1200
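
The ranges and defaults listed above can be checked client-side before a request is submitted. The following is a minimal sketch, not the model's actual schema: this page gives only descriptions, types, ranges, and defaults, so every field name (text, voice, temperature, top_p, repetition_penalty, max_new_tokens) is a hypothetical placeholder.

```python
from dataclasses import dataclass, asdict

# NOTE: all field names here are hypothetical. The page lists descriptions,
# types, ranges, and defaults, but not the real input keys.
@dataclass
class TTSInput:
    text: str                         # "Text to convert to speech" (required)
    voice: str = "javi"               # "Voice to use"
    temperature: float = 0.6          # "Temperature for generation"
    top_p: float = 0.95               # "Top P for nucleus sampling"
    repetition_penalty: float = 1.1   # "Repetition penalty"
    max_new_tokens: int = 1200        # "Maximum number of tokens to generate"

# Inclusive (minimum, maximum) bounds copied from the Input section.
BOUNDS = {
    "temperature": (0.1, 1.5),
    "top_p": (0.1, 1.0),
    "repetition_penalty": (1.0, 2.0),
    "max_new_tokens": (100, 2000),
}

def build_input(**kwargs) -> dict:
    """Return a validated payload dict; raise ValueError on out-of-range values."""
    params = TTSInput(**kwargs)
    if not params.text.strip():
        raise ValueError("text must be a non-empty string")
    for name, (low, high) in BOUNDS.items():
        value = getattr(params, name)
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return asdict(params)

payload = build_input(text="Hola, ¿qué tal?", temperature=0.8)
```

A dict like this could be passed as the input of a Replicate client call for gianpaj/cog-orpheus-3b-0.1-ft once the keys are renamed to match the model's published schema.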

Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Spanish and Italian model: 3b-es_it-ft-research_release https://huggingface.co/canopylabs/3b-es_it-ft-research_release

Orpheus 3B 0.1 Finetuned

Note on emotional tags:

  • Italian supports sigh, laugh, cough, sniffle, groan, yawn, gemito, gasp
  • Spanish supports groan, chuckle, gasp, resoplido, laugh, yawn, cough
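
Because the two languages support different tag sets, it can help to check a prompt before sending it. The sketch below is illustrative only: the tag lists are copied from the note above, but the inline `<tag>` syntax is an assumption (this page does not show how tags are written), and the helper name unsupported_tags is hypothetical.

```python
import re

# Tags per language, copied from the note on emotional tags.
SUPPORTED_TAGS = {
    "it": {"sigh", "laugh", "cough", "sniffle", "groan", "yawn", "gemito", "gasp"},
    "es": {"groan", "chuckle", "gasp", "resoplido", "laugh", "yawn", "cough"},
}

# Assumption: tags appear inline in the text as <tag>.
TAG_PATTERN = re.compile(r"<([a-z]+)>")

def unsupported_tags(text: str, language: str) -> list[str]:
    """Return the tags found in text that are not listed for the language."""
    allowed = SUPPORTED_TAGS[language]
    return [tag for tag in TAG_PATTERN.findall(text) if tag not in allowed]

# "sigh" is listed for Italian but not for Spanish.
print(unsupported_tags("Qué día tan largo <yawn> ... <sigh>", "es"))  # prints ['sigh']
```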

More info: https://canopylabs.ai/releases/orpheus_can_speak_any_language


Orpheus TTS is a state-of-the-art, Llama-based Speech-LLM designed for high-quality, empathetic text-to-speech generation. This model has been fine-tuned to deliver human-level speech synthesis, with exceptional clarity, expressiveness, and real-time streaming performance.

Model Details

Model Capabilities

  • Human-Like Speech: Natural intonation, emotion, and rhythm that are superior to SOTA closed-source models
  • Zero-Shot Voice Cloning: Clone voices without prior fine-tuning
  • Guided Emotion and Intonation: Control speech and emotion characteristics with simple tags
  • Low Latency: ~200ms streaming latency for real-time applications, reducible to ~100ms with input streaming

Model Sources

Model Misuse

Do not use our models for impersonation without consent, misinformation or deception (including fake news or fraudulent calls), or any illegal or harmful activity. By using this model, you agree to follow all applicable laws and ethical guidelines. We disclaim responsibility for any use.