mayaresearch/veena-max

VeenaMAX is a high-performance Text-to-Speech API that converts text into natural-sounding, emotionally expressive speech with fast response times. It speaks English and Hindi, the most widely used languages in India. As a signifi…

State-of-the-art TTS system for high-quality Hindi and bilingual (Hindi-English) speech synthesis. Built to deliver natural-sounding voices with emotional depth and clarity.

Available Voices

Female Voices

vinaya_assist - Professional assistant voice (default)
charu_soft - Gentle storyteller’s voice
keerti_joy - High-energy, vibrant tone
mohini_whispers - Whisper-soft, ASMR optimized
maitri_connect - Expressive narration voice
soumya_calm - Smooth, soothing voice

Male Voices

varun_chat - Deep, authoritative voice
agastya_impact - Bold, confident presence

Parameters

| Parameter | Default | Range | Description |
|---|---|---|---|
| `text` | (required) | Min 30 chars | Text to convert to speech |
| `speaker` | `vinaya_assist` | See voices above | Voice selection |
| `temperature` | `0.5` | 0.0-2.0 | Controls randomness |
| `repetition_penalty` | `1.1` | 1.0-2.0 | Reduces repetition |
| `top_p` | `0.9` | 0.0-1.0 | Nucleus sampling |
| `max_new_tokens` | `800` | 100-4096 | Token limit |
| `normalize_text` | `true` | true/false | Auto text normalization |
| `seed` | `null` | Any integer | For reproducible output |
| `output_format` | `wav` | wav/opus/webm | Audio format |
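
The parameter table can be turned into a small client-side check before a request is sent. The following is a minimal sketch, not the official client: `build_input` and `synthesize` are hypothetical helper names, the length check assumes "chars" means Unicode code points, and the `synthesize` call assumes the Replicate Python client is installed, an API token is configured, and the model reference `mayaresearch/veena-max` resolves without an explicit version suffix (it may need one).

```python
# Allowed values copied from the voices list and parameter table in this README.
VOICES = {
    "vinaya_assist", "charu_soft", "keerti_joy", "mohini_whispers",
    "maitri_connect", "soumya_calm", "varun_chat", "agastya_impact",
}
FORMATS = {"wav", "opus", "webm"}


def build_input(text, speaker="vinaya_assist", temperature=0.5,
                repetition_penalty=1.1, top_p=0.9, max_new_tokens=800,
                normalize_text=True, seed=None, output_format="wav"):
    """Validate arguments against the documented ranges and return an input dict."""
    # Assumption: the 30-character minimum is counted in code points.
    if len(text) < 30:
        raise ValueError("text must be at least 30 characters")
    if speaker not in VOICES:
        raise ValueError(f"unknown speaker: {speaker}")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    if not 1.0 <= repetition_penalty <= 2.0:
        raise ValueError("repetition_penalty must be in [1.0, 2.0]")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    if not 100 <= max_new_tokens <= 4096:
        raise ValueError("max_new_tokens must be in [100, 4096]")
    if output_format not in FORMATS:
        raise ValueError(f"unknown output_format: {output_format}")

    payload = {
        "text": text,
        "speaker": speaker,
        "temperature": temperature,
        "repetition_penalty": repetition_penalty,
        "top_p": top_p,
        "max_new_tokens": max_new_tokens,
        "normalize_text": normalize_text,
        "output_format": output_format,
    }
    # Leave seed out entirely when unset so the service default (null) applies.
    if seed is not None:
        payload["seed"] = int(seed)
    return payload


def synthesize(payload, model="mayaresearch/veena-max"):
    """Hypothetical call through the Replicate Python client (token required)."""
    import replicate  # third-party; imported lazily so build_input works without it
    return replicate.run(model, input=payload)
```

For example, `build_input("नमस्ते, this is a short bilingual test sentence.", temperature=0.3, seed=42)` returns a dict that uses the default voice and a fixed seed. Invalid values fail locally with a `ValueError`, before any request is made.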

Use Cases

Audiobooks & Narration - Use charu_soft for storytelling
Virtual Assistants - Use vinaya_assist for professional responses
News & Announcements - Use varun_chat for authoritative delivery
ASMR & Meditation - Use mohini_whispers for calming content
Educational Content - Use keerti_joy for engaging presentations
Corporate Presentations - Use agastya_impact for impactful delivery
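
The recommendations above can be captured in a lookup table so that application code asks for a use case rather than a voice name. This is only a sketch: the short keys (`"audiobook"`, `"news"`, and so on) and the `pick_voice` helper are illustrative labels, not part of the API, while the voice names come from the list above.

```python
# Use-case keys are hypothetical labels; voice names are from this README.
VOICE_FOR_USE_CASE = {
    "audiobook": "charu_soft",
    "assistant": "vinaya_assist",
    "news": "varun_chat",
    "asmr": "mohini_whispers",
    "education": "keerti_joy",
    "corporate": "agastya_impact",
}

DEFAULT_VOICE = "vinaya_assist"  # documented default speaker


def pick_voice(use_case):
    """Return the suggested voice for a use case, or the default voice."""
    return VOICE_FOR_USE_CASE.get(use_case.strip().lower(), DEFAULT_VOICE)
```

Unknown use cases fall back to the default voice instead of raising, so a new content type still produces audio.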

Tips

Text Length: Optimal 50-500 characters per request
Consistency: Use temperature 0.2-0.4 for consistent output
Reproducibility: Set a seed value for identical outputs
Language: Supports pure Hindi, pure English, and code-mixed text
Numbers: Automatically normalized (e.g., “2024” → “two thousand twenty four”)
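
Longer passages can be split into several requests to stay near the suggested length. The sketch below shows one possible approach, not a documented feature: `chunk_text` is a hypothetical helper that splits at sentence boundaries, treating `.`, `!`, `?`, and the Devanagari danda (`।`) as sentence ends, and packs whole sentences into chunks of at most `max_chars` characters.

```python
import re

# Split on whitespace that follows ., !, ?, or the Devanagari danda.
_SENTENCE_END = re.compile(r"(?<=[.!?।])\s+")


def chunk_text(text, max_chars=500):
    """Group whole sentences into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for sentence in _SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_chars is kept whole.
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent as its own request. A chunk can still come out shorter than the 30-character minimum for `text`, for instance a short final sentence, so callers may want to merge such a chunk into its neighbour. For consistent delivery across chunks, the tips above suggest a temperature of 0.2-0.4 and a fixed seed.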

Language Features

✅ Full Devanagari script support
✅ Natural English pronunciation
✅ Seamless Hindi-English code-mixing
✅ Automatic number/date normalization
✅ Proper punctuation handling

Output Formats

WAV: Uncompressed, highest quality (default)
Opus: Compressed, excellent for streaming
WebM: Compressed, web-optimized

Model created 10 months, 1 week ago