bitflow/kinetic-captions

Generate dynamic, stylish captions and hard-burn them into videos using ASS subtitles and FFmpeg.


Run time and cost

This model runs on CPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

🎬 Kinetic Captions (Cog)

Generate dynamic, stylish captions and hard-burn them into videos using ASS subtitles and FFmpeg.

Designed for both:

  • 🎉 Lively social media content (TikTok / Reels style)
  • 🧊 Gentle informational / documentary-style videos


✨ Features

  • 🎬 Hard-burn captions directly into video (no external subtitle dependency)
  • 🎨 Two caption styles:
      • lively (energetic, social media style)
      • gentle (clean, minimal, editorial style)
  • 📐 Resolution-independent dynamic sizing
  • 🧠 Automatic line breaking and layout handling
  • 🔤 Language-aware font routing (auto, ko, en)
  • ✍️ Inline emphasis support:
      • plain text → Regular
      • *text* → Medium
      • **text** → SemiBold
  • ⚡ ASS-based rendering for rich styling and flexibility
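
The inline emphasis markers can be illustrated with a minimal sketch that rewrites them as ASS override tags. It assumes emphasis is expressed with the ASS \b<weight> tag and that Medium and SemiBold correspond to weights 500 and 600. The function name emphasis_to_ass and the weight constants are hypothetical. The actual project may switch font files instead, and whether a renderer honours intermediate weights depends on the installed font.

```python
import re

# Assumed weights; the README only names Regular / Medium / SemiBold.
WEIGHT_MEDIUM = 500
WEIGHT_SEMIBOLD = 600


def _wrap(weight):
    # Build a replacement callback that surrounds the captured text with
    # {\b<weight>} ... {\b0}; \b0 returns to the style's normal weight.
    return lambda m: "{\\b%d}%s{\\b0}" % (weight, m.group(1))


def emphasis_to_ass(text: str) -> str:
    # Convert the double-asterisk form first so the single-asterisk
    # pattern does not consume half of a ** pair.
    text = re.sub(r"\*\*(.+?)\*\*", _wrap(WEIGHT_SEMIBOLD), text)
    text = re.sub(r"\*(.+?)\*", _wrap(WEIGHT_MEDIUM), text)
    return text


if __name__ == "__main__":
    print(emphasis_to_ass("a *soft* and a **strong** word"))
```

Plain text passes through unchanged, so it keeps the Regular weight defined by the style.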

🧩 Input Parameters

Required

  • video_file (Path): source video file
  • One subtitle source is required:
      • subtitle_file_url (String): public .srt/.vtt URL or local path
      • subtitle_text (String): pasted subtitle content (SRT/VTT)

Optional

  • aspect_ratio (String, default: "9:16"): e.g. "9:16", "16:9", "1:1", "4:5"
  • font_style (Enum, default: "gentle"): "gentle" or "lively"
  • subtitle_language (Enum, default: "auto"): "auto", "ko", "en"
      • In auto mode, a line that contains any Korean character is rendered entirely in the Korean font.
  • text_color (Enum, default: "white"): "white", "yellow", "red"
      • white + black outline (standard)
      • yellow + black outline (highlight look)
      • red + white outline (action look)
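
As an illustration of how the text_color presets could translate into ASS style colours, here is a minimal sketch. ASS writes colours as &HAABBGGRR (alpha first, then blue, green, red). The exact RGB values, the COLOR_PRESETS table, and the function names are assumptions, not taken from the project.

```python
def rgb_to_ass(r: int, g: int, b: int, alpha: int = 0) -> str:
    # ASS colour order is alpha, blue, green, red; alpha 00 is fully opaque.
    return "&H{:02X}{:02X}{:02X}{:02X}".format(alpha, b, g, r)


# Assumed (fill RGB, outline RGB) pairs for each documented preset.
COLOR_PRESETS = {
    "white": ((255, 255, 255), (0, 0, 0)),
    "yellow": ((255, 255, 0), (0, 0, 0)),
    "red": ((255, 0, 0), (255, 255, 255)),
}


def ass_colors(text_color: str = "white") -> dict:
    # Return the two style fields that the preset affects.
    fill, outline = COLOR_PRESETS[text_color]
    return {
        "PrimaryColour": rgb_to_ass(*fill),
        "OutlineColour": rgb_to_ass(*outline),
    }


if __name__ == "__main__":
    for name in COLOR_PRESETS:
        print(name, ass_colors(name))
```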

Replicate UI Note

  • Some Replicate web UIs may not accept direct .srt/.vtt uploads as file inputs.
  • Use subtitle_file_url (recommended) or paste subtitles via subtitle_text.
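
For callers who assemble the inputs in code, a small helper can enforce the "one subtitle source" rule before a request is sent. This is a sketch only: build_input is a hypothetical name, and preferring subtitle_file_url when both sources are given is an assumption based on the recommendation above, not documented behaviour.

```python
def build_input(video_file, subtitle_file_url=None, subtitle_text=None, **options):
    # At least one subtitle source must be present.
    if not (subtitle_file_url or subtitle_text):
        raise ValueError("Provide subtitle_file_url or subtitle_text")

    # Optional parameters (aspect_ratio, font_style, ...) pass through as-is.
    payload = {"video_file": video_file, **options}

    # Assumption: the URL wins when both sources are supplied.
    if subtitle_file_url:
        payload["subtitle_file_url"] = subtitle_file_url
    else:
        payload["subtitle_text"] = subtitle_text
    return payload


if __name__ == "__main__":
    print(build_input(
        "https://example.com/clip.mp4",
        subtitle_file_url="https://example.com/clip.srt",
        font_style="lively",
        text_color="yellow",
    ))
```

The resulting dictionary has the shape expected by the input argument of the Replicate Python client's replicate.run call. The example URLs are placeholders.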

🧠 Core Strategy

Instead of relying on complex FFmpeg filters, this project uses ASS (Advanced SubStation Alpha) as a styling bridge.

Why ASS?

  • 📏 Dynamic font scaling (resolution independent)
  • 🎨 Rich styling (outline, shadow, background box)
  • 📍 Easy alignment (9-grid positioning)
  • 🔤 Advanced text control (line breaks, per-word color)
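
To make these points concrete, the sketch below writes a minimal ASS header. PlayResX/PlayResY define the script's coordinate space so sizes scale with the video, and the Alignment field uses the numpad layout (1-3 bottom, 4-6 middle, 7-9 top). The field order follows the standard V4+ Styles format. All concrete values (font, size, margins, outline width) and the name ass_header are illustrative assumptions.

```python
# Standard field order of a "[V4+ Styles]" Style line.
STYLE_FIELDS = [
    "Name", "Fontname", "Fontsize", "PrimaryColour", "SecondaryColour",
    "OutlineColour", "BackColour", "Bold", "Italic", "Underline", "StrikeOut",
    "ScaleX", "ScaleY", "Spacing", "Angle", "BorderStyle", "Outline", "Shadow",
    "Alignment", "MarginL", "MarginR", "MarginV", "Encoding",
]


def ass_header(width, height, font="Inter", font_size=64, alignment=2):
    # alignment=2 is bottom-center on the 9-grid (numpad) layout.
    style = {
        "Name": "Default", "Fontname": font, "Fontsize": font_size,
        "PrimaryColour": "&H00FFFFFF", "SecondaryColour": "&H00FFFFFF",
        "OutlineColour": "&H00000000", "BackColour": "&H80000000",
        "Bold": 0, "Italic": 0, "Underline": 0, "StrikeOut": 0,
        "ScaleX": 100, "ScaleY": 100, "Spacing": 0, "Angle": 0,
        "BorderStyle": 1,  # 1 = outline + shadow, 3 = opaque background box
        "Outline": 3, "Shadow": 0,
        "Alignment": alignment, "MarginL": 40, "MarginR": 40, "MarginV": 120,
        "Encoding": 1,
    }
    lines = [
        "[Script Info]",
        "ScriptType: v4.00+",
        f"PlayResX: {width}",
        f"PlayResY: {height}",
        "",
        "[V4+ Styles]",
        "Format: " + ", ".join(STYLE_FIELDS),
        "Style: " + ",".join(str(style[name]) for name in STYLE_FIELDS),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(ass_header(1080, 1920))
```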

⚙️ Workflow Overview

Step 1: Subtitle Parsing

  • Read subtitle from subtitle_file_url or subtitle_text
  • Convert into structured subtitle events
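
A minimal sketch of what "structured subtitle events" could look like for SRT input is shown below. SubtitleEvent and parse_srt are hypothetical names. The parser only handles hh:mm:ss timestamps (with a comma or a dot before the milliseconds), so VTT cues that omit the hour field would need extra handling.

```python
import re
from dataclasses import dataclass


@dataclass
class SubtitleEvent:
    start: float  # seconds
    end: float    # seconds
    text: str


_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h, m, s, ms):
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> list:
    events = []
    # Cues are separated by blank lines; normalise Windows line endings first.
    for block in re.split(r"\n\s*\n", content.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the caption text.
                text = "\n".join(lines[i + 1:]).strip()
                if text:
                    events.append(SubtitleEvent(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return events


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:02,500\nHello\n\n2\n00:00:03,000 --> 00:00:04,000\nWorld\n"
    for event in parse_srt(sample):
        print(event)
```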

Step 2: ASS Generation

  • Apply dynamic font sizing based on aspect ratio
  • Insert line breaks automatically
  • Apply style (gentle / lively)
  • Apply language mode (auto / ko / en)
  • Apply emphasis markers (* / **)
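
The sizing and line-breaking steps might be sketched as follows. The 1080-pixel short side, the 6% font-size fraction, and the 18-character line width are assumed values chosen for illustration, and play_res, font_size, and wrap_caption are hypothetical names. Word-based wrapping suits space-separated text and would need adjusting for long runs without spaces.

```python
import textwrap


def play_res(aspect_ratio: str = "9:16", short_side: int = 1080):
    # Derive a (width, height) canvas whose shorter side is fixed.
    w, h = (int(part) for part in aspect_ratio.split(":"))
    if w <= h:
        return short_side, round(short_side * h / w)
    return round(short_side * w / h), short_side


def font_size(width: int, height: int, fraction: float = 0.06) -> int:
    # Scale the font with the shorter side so captions fit narrow frames.
    return round(min(width, height) * fraction)


def wrap_caption(text: str, max_chars: int = 18) -> str:
    # \N is the ASS hard line break.
    return r"\N".join(textwrap.wrap(text, width=max_chars))


if __name__ == "__main__":
    width, height = play_res("9:16")
    print(width, height, font_size(width, height))
    print(wrap_caption("Generate dynamic stylish captions"))
```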

Step 3: Hard-burn with FFmpeg

ffmpeg -i input.mp4 -vf "ass=subtitle.ass" -c:a copy output.mp4
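
The same command can be driven from Python. The sketch below builds the argument list separately from running it, which makes it easy to inspect. The function names are hypothetical, and -y (overwrite the output file) is an addition not present in the command above. Paths containing characters that are special in FFmpeg filter syntax (such as : or ') would need escaping, which this sketch does not attempt.

```python
import subprocess


def build_burn_command(video_in: str, ass_path: str, video_out: str) -> list:
    return [
        "ffmpeg",
        "-y",                      # overwrite the output if it exists (assumed)
        "-i", video_in,
        "-vf", f"ass={ass_path}",  # render the ASS file onto the video frames
        "-c:a", "copy",            # keep the audio stream untouched
        video_out,
    ]


def burn_captions(video_in: str, ass_path: str, video_out: str) -> None:
    # Raises CalledProcessError if FFmpeg exits with a non-zero status.
    subprocess.run(build_burn_command(video_in, ass_path, video_out), check=True)


if __name__ == "__main__":
    print(" ".join(build_burn_command("input.mp4", "subtitle.ass", "output.mp4")))
```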

🎨 Caption Styles

🧊 Gentle (Editorial / Documentary)

  • Font: clean sans-serif (e.g. Pretendard / Inter)
  • Weight: Regular / Medium / SemiBold
  • Subtle background box
  • Minimal animation feel

🎉 Lively (TikTok / Reels)

  • Bold, high-contrast captions
  • Thick outline styling (no zero-outline mode)
  • Strong emphasis on keywords
  • Designed for engagement and readability
  • Suggested color combos:
      • white text + black outline
      • yellow text + black outline
      • red text + white outline

🔤 Font Mapping

  • Gentle KO: Pretendard Variable
  • Gentle EN: Inter Variable
  • Lively KO: Cafe24 Ssurround Bold (Cafe24Ssurround-v2.0.ttf)
  • Lively EN: Cooper Std Black
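
Combined with the auto language rule described under Input Parameters, the mapping above could be applied per line roughly as follows. This is a sketch: FONT_MAP, contains_korean, and pick_font are hypothetical names, the Unicode ranges checked are an assumption about what counts as a Korean character, and the family names a renderer actually resolves may differ from the labels listed here.

```python
FONT_MAP = {
    ("gentle", "ko"): "Pretendard Variable",
    ("gentle", "en"): "Inter Variable",
    ("lively", "ko"): "Cafe24 Ssurround Bold",
    ("lively", "en"): "Cooper Std Black",
}


def contains_korean(text: str) -> bool:
    # Hangul Syllables, Hangul Jamo, and Hangul Compatibility Jamo blocks.
    return any(
        "\uac00" <= ch <= "\ud7a3"
        or "\u1100" <= ch <= "\u11ff"
        or "\u3130" <= ch <= "\u318f"
        for ch in text
    )


def pick_font(line: str, font_style: str = "gentle", subtitle_language: str = "auto") -> str:
    lang = subtitle_language
    if lang == "auto":
        # One Korean character is enough to route the whole line to the Korean font.
        lang = "ko" if contains_korean(line) else "en"
    return FONT_MAP[(font_style, lang)]


if __name__ == "__main__":
    print(pick_font("Hello 안녕"))
    print(pick_font("Hello", font_style="lively"))
```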