
HeyGen Lipsync Speed

Replace or dub audio on an existing video with fast, audio-driven lip sync. Give it a video and new audio, and it re-animates the speaker’s lip movements to match.

Overview

Lipsync Speed is optimized for fast turnaround. It uses audio-only analysis to re-sync lip movements, making it ideal for quick dubbing, voiceover replacement, and content localization workflows where speed matters more than pixel-perfect accuracy.

How it works

1. Upload a source video of a person speaking
2. Upload the replacement audio (new dialogue, translated speech, etc.)
3. The model re-animates the speaker’s lips to match the new audio
4. Get back a video with synchronized lip movements

Use cases

Dubbing: Replace a video’s dialogue with speech in another language
Voiceover replacement: Swap narration while keeping the visual in sync
Content localization: Quickly adapt videos for different markets
Audio correction: Fix audio issues while maintaining lip sync

Options

Dynamic duration: Let the output video length adjust to match the new audio (on by default)
Music removal: Strip background music from the source video
Speech enhancement: Improve speech clarity in the output
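
As a rough illustration of how the two inputs and three options fit together, here is a minimal Python sketch that assembles a request payload. The function name and every field name (video, audio, dynamic_duration, remove_music, enhance_speech) are hypothetical, since this README does not list the actual input schema. Only the "on by default" setting for dynamic duration comes from the text above; the other two defaults are assumed to be off.

```python
from typing import Any, Dict


def build_lipsync_input(
    video: str,
    audio: str,
    *,
    dynamic_duration: bool = True,   # "on by default" per the Options list
    remove_music: bool = False,      # assumed default, not stated in the README
    enhance_speech: bool = False,    # assumed default, not stated in the README
) -> Dict[str, Any]:
    """Assemble a request payload for a lip sync job.

    All keys are placeholder names chosen for this sketch; check the
    model's real input schema before using them.
    """
    if not video:
        raise ValueError("a source video is required")
    if not audio:
        raise ValueError("replacement audio is required")
    return {
        "video": video,
        "audio": audio,
        "dynamic_duration": dynamic_duration,
        "remove_music": remove_music,
        "enhance_speech": enhance_speech,
    }


# Example: dub a clip and strip the original background music.
payload = build_lipsync_input(
    "https://example.com/source.mp4",
    "https://example.com/dub_es.wav",
    remove_music=True,
)
```

Keeping the options keyword-only makes each call site state explicitly which behaviors it turns on, which helps when the same clip is processed with several option combinations.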

Pricing

Billed per second of output video at $0.0333/second.
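
For budgeting, the per-second rate can be turned into a quick estimate. The sketch below is only an approximation: the helper name is made up for illustration, and how the platform rounds fractional seconds or final charges is not stated here, so the four-decimal rounding is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate taken from the pricing line above.
RATE_PER_SECOND = Decimal("0.0333")


def estimate_cost(output_seconds: float) -> Decimal:
    """Estimate the charge in USD for a given output video length.

    Rounding to four decimal places is an assumption made for this
    sketch; actual billing may round differently.
    """
    if output_seconds < 0:
        raise ValueError("output_seconds must be non-negative")
    cost = Decimal(str(output_seconds)) * RATE_PER_SECOND
    return cost.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


print(estimate_cost(60))  # one minute of output -> 1.9980
```

With dynamic duration enabled, the output length follows the new audio, so the audio duration is the number to plug in.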

For higher-quality lip sync with avatar inference, see heygen/lipsync-precision.

Model created 2 months, 3 weeks ago

Examples