⛓️‍💥 Whisper Unchained
Break free from:
- Token requirements → Zero setup
- File size limits → 1 GB supported
- English-only translation → 37 languages
- Multiple API calls → One call. Everything.

Unchained from limitations. Unleashed for production.
⛓️ What You're Chained To (With Other Models)
| Limitation | Them (Chained) | Whisper Unchained |
|---|---|---|
| File Size | ~100 MB max | 1 GB |
| Translation | English only | 37 languages |
| Setup | Token required | Zero setup |
| Formats | JSON only | JSON + Text + SRT + VTT |
| Complexity | 13+ parameters | 3 simple |
| Cost | Multiple APIs | One call |
⛓️‍💥 Breaking Free From BS
Unchained from Token Hell
- Chained models: "Sign up. Accept terms. Get HuggingFace token. Configure permissions."
- Unchained: Paste audio URL. Done.

Unchained from English-Only
- Chained models: "Translate to English!"
- Unchained: Spanish → Japanese. French → Arabic. ANY → ANY (37 languages).

📦 Unchained from Format Juggling
- Chained models: "Here's JSON. Go convert it yourself."
- Unchained: JSON + SRT + VTT auto-generated. Every. Single. Time.

Unchained from File Limits
- Chained models: "Split your 500 MB podcast into 5 parts."
- Unchained: Upload the whole 1 GB file. We handle it.

What You Actually Get
Every single API call returns:
{
  "transcript": "Full text transcription",
  "translation": "Translated to your target language",
  "language": "Auto-detected source language",
  "duration": "Audio length in seconds",
  "json": "Structured data with segments, words, speakers",
  "srt": "Ready-to-use video subtitles",
  "vtt": "Ready-to-use web subtitles"
}
7 outputs. 1 API call. Zero extra work.
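As a minimal sketch of consuming that response in Python: the field names (`transcript`, `srt`, `vtt`) come from the structure above, while the helper name `save_outputs`, the file naming, and the assumption that each of those fields arrives as a plain string are illustrative and not part of the documented API.

```python
from pathlib import Path


def save_outputs(response: dict, out_dir: str, stem: str = "audio") -> dict:
    """Write the text-based outputs of one response to disk.

    Assumes `transcript`, `srt`, and `vtt` are plain strings, as the
    structure above suggests. Missing or empty fields are skipped.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Response field -> file extension (the naming is a hypothetical choice).
    extensions = {"transcript": ".txt", "srt": ".srt", "vtt": ".vtt"}

    written = {}
    for field, ext in extensions.items():
        content = response.get(field)
        if not content:
            continue
        path = target / f"{stem}{ext}"
        path.write_text(content, encoding="utf-8")
        written[field] = path
    return written
```

Calling `save_outputs(response, "out", stem="episode-12")` would leave `episode-12.txt`, `episode-12.srt`, and `episode-12.vtt` in `out/`, provided all three fields are present.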
💪 Real Power Features
True Multi-Language Translation
- Them: "We translate to English!"
- Us: Translate ANY language to ANY of 37 languages (DeepL powered)
- Spanish → Japanese? ✅ French → Arabic? ✅ German → Korean? ✅

Enterprise File Sizes
- Them: Split your 500 MB podcast into 5 parts
- Us: Upload the whole 1 GB file. We handle it.

Unlimited Speaker Detection
- Them: "Specify min/max speakers"
- Us: Auto-detects as many speakers as exist

118 Languages Supported
- Full Whisper language coverage
- Auto-detection when you don't know the language
- Works on ANY audio content

⚡ 3 Parameters. That's It.
{
  "audio_url": "https://your-file.mp3",  # Required
  "language": "Auto-detect",             # Optional
  "translate_to": "Spanish"              # Optional
}
No batch_size. No vad_onset. No temperature. No HuggingFace tokens.
Just the essentials.
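A small helper can make the three-parameter input explicit. This is a sketch only: `build_input` is a hypothetical name, and leaving unset optional fields out of the payload (rather than sending defaults) is an assumption about how the API treats them. The parameter names and the `"Auto-detect"` value are taken from the example above.

```python
from typing import Optional
from urllib.parse import urlparse


def build_input(audio_url: str,
                language: Optional[str] = None,
                translate_to: Optional[str] = None) -> dict:
    """Build the input payload: one required field, two optional ones."""
    parsed = urlparse(audio_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"audio_url must be an http(s) URL, got {audio_url!r}")

    payload = {"audio_url": audio_url}
    # Optional fields are only included when the caller sets them
    # (assumption: the service applies its own defaults otherwise).
    if language is not None:
        payload["language"] = language
    if translate_to is not None:
        payload["translate_to"] = translate_to
    return payload
```

For instance, `build_input("https://your-file.mp3", translate_to="Spanish")` yields a payload with only `audio_url` and `translate_to`, leaving language detection to the service.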
🎯 Perfect For
Content Creators
- Transcribe + translate + subtitle your videos in ONE call
- No more exporting to 3 different services

Podcasters
- 1 GB file support = full episodes, no splitting
- Speaker diarization included, not extra

Businesses
- Meeting transcripts with speaker labels
- Translate to team's languages automatically

Developers
- 1 endpoint replaces 3+ services
- Clean API, zero token management

💰 Stop Paying for Chains
Chained to multiple services:
1. Transcription API: $$
2. Translation API: $$
3. Subtitle converter: $$
4. Large file storage: $$$
5. HuggingFace subscription: $$

Total: 💸💸💸

Unchained:
1. One API call: Everything ✅

Total: 💸
⚡ Break Free. Start Now.
No setup. No tokens. No limits.
{
  "audio_url": "https://your-1gb-file.mp3",
  "language": "Auto-detect",
  "translate_to": "Spanish"
}
Output: Transcription + Translation + JSON + SRT + VTT + Speakers
One call. Unchained.
⛓️‍💥 Break free from limitations | Built by SIΓN Agency