A state-of-the-art text-to-speech (TTS) system for high-quality Hindi and bilingual (Hindi-English) speech synthesis, built to deliver natural-sounding voices with emotional depth and clarity.
Available Voices
Female Voices
- vinaya_assist: Professional assistant voice (default)
- charu_soft: Gentle storyteller’s voice
- keerti_joy: High-energy, vibrant tone
- mohini_whispers: Whisper-soft, ASMR optimized
- maitri_connect: Expressive narration voice
- soumya_calm: Smooth, soothing voice
Male Voices
- varun_chat: Deep, authoritative voice
- agastya_impact: Bold, confident presence
Parameters
Parameter | Default | Range | Description |
---|---|---|---|
text | (required) | Min 30 chars | Text to convert to speech |
speaker | vinaya_assist | See voices above | Voice selection |
temperature | 0.5 | 0.0-2.0 | Controls randomness |
repetition_penalty | 1.1 | 1.0-2.0 | Reduces repetition |
top_p | 0.9 | 0.0-1.0 | Nucleus sampling |
max_new_tokens | 800 | 100-4096 | Token limit |
normalize_text | true | true/false | Auto text normalization |
seed | null | Any integer | For reproducible output |
output_format | wav | wav/opus/webm | Audio format |
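The defaults and ranges in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model's API: `build_tts_input` is a hypothetical helper name, and it assumes the parameters are passed to the model as a flat dictionary. Only the parameter names, defaults, and ranges are taken from the table.

```python
from typing import Any, Optional

# Voice names as listed under "Available Voices".
VOICES = {
    "vinaya_assist", "charu_soft", "keerti_joy", "mohini_whispers",
    "maitri_connect", "soumya_calm", "varun_chat", "agastya_impact",
}
OUTPUT_FORMATS = {"wav", "opus", "webm"}

# (low, high) bounds copied from the parameter table.
RANGES = {
    "temperature": (0.0, 2.0),
    "repetition_penalty": (1.0, 2.0),
    "top_p": (0.0, 1.0),
    "max_new_tokens": (100, 4096),
}


def build_tts_input(
    text: str,
    speaker: str = "vinaya_assist",
    temperature: float = 0.5,
    repetition_penalty: float = 1.1,
    top_p: float = 0.9,
    max_new_tokens: int = 800,
    normalize_text: bool = True,
    seed: Optional[int] = None,
    output_format: str = "wav",
) -> dict[str, Any]:
    """Hypothetical helper: validate values and return a parameter dict."""
    if len(text) < 30:
        raise ValueError("text must be at least 30 characters")
    if speaker not in VOICES:
        raise ValueError(f"unknown speaker: {speaker!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output_format: {output_format!r}")

    params: dict[str, Any] = {
        "text": text,
        "speaker": speaker,
        "temperature": temperature,
        "repetition_penalty": repetition_penalty,
        "top_p": top_p,
        "max_new_tokens": max_new_tokens,
        "normalize_text": normalize_text,
        "seed": seed,
        "output_format": output_format,
    }
    # Reject any numeric value outside its documented range.
    for name, (low, high) in RANGES.items():
        if not low <= params[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    return params
```

Calling `build_tts_input(text)` with only the text returns the documented defaults. How the resulting dictionary is submitted depends on the client you use and is not shown here.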
Use Cases
- Audiobooks & Narration - Use charu_soft for storytelling
- Virtual Assistants - Use vinaya_assist for professional responses
- News & Announcements - Use varun_chat for authoritative delivery
- ASMR & Meditation - Use mohini_whispers for calming content
- Educational Content - Use keerti_joy for engaging presentations
- Corporate Presentations - Use agastya_impact for impactful delivery
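These recommendations can be kept in a small lookup table so that application code selects a voice by use case. This is an illustrative sketch: the use-case keys and the `pick_voice` function are hypothetical names, while the voice identifiers are the ones listed above.

```python
# Use-case keys are hypothetical labels; the voices come from the list above.
RECOMMENDED_VOICE = {
    "audiobook": "charu_soft",
    "assistant": "vinaya_assist",
    "news": "varun_chat",
    "asmr": "mohini_whispers",
    "education": "keerti_joy",
    "corporate": "agastya_impact",
}

# The documented default speaker, used when no recommendation matches.
DEFAULT_VOICE = "vinaya_assist"


def pick_voice(use_case: str) -> str:
    """Return the recommended voice, falling back to the default."""
    return RECOMMENDED_VOICE.get(use_case.strip().lower(), DEFAULT_VOICE)
```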
Tips
- Text Length: Optimal 50-500 characters per request
- Consistency: Use temperature 0.2-0.4 for consistent output
- Reproducibility: Set a seed value for identical outputs
- Language: Supports pure Hindi, pure English, and code-mixed text
- Numbers: Automatically normalized (e.g., “2024” → “two thousand twenty four”)
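To stay within the suggested 50-500 characters per request, longer passages can be split at sentence boundaries before synthesis. The sketch below is one possible approach, not something the model provides: `split_for_tts` is a hypothetical helper, and it assumes sentences end with `.`, `!`, `?`, or the Devanagari danda (`।`) followed by whitespace.

```python
import re

# Split on whitespace that follows sentence-ending punctuation,
# including the Devanagari danda used in Hindi text.
_SENTENCE_END = re.compile(r"(?<=[.!?।])\s+")


def split_for_tts(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept as one oversized
    chunk rather than being cut mid-sentence.
    """
    chunks: list[str] = []
    current = ""
    for sentence in _SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as a separate request. A short final chunk may fall below the 30-character minimum for `text`, so it may need to be merged with the previous chunk. Reusing the same `seed` and a temperature in the 0.2-0.4 range across chunks follows the consistency tips above.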
Language Features
- ✅ Full Devanagari script support
- ✅ Natural English pronunciation
- ✅ Seamless Hindi-English code-mixing
- ✅ Automatic number/date normalization
- ✅ Proper punctuation handling
Output Formats
- WAV: Uncompressed, highest quality (default)
- Opus: Compressed, excellent for streaming
- WebM: Compressed, web-optimized