HeyGen Lipsync Precision
Replace or dub the audio on an existing video with high-accuracy lip sync driven by avatar inference. Give it a video and new audio, and it re-animates the speaker’s lips to closely match the new audio.
Overview
Lipsync Precision uses avatar inference to achieve higher-quality lip synchronization than the speed mode (heygen/lipsync-speed). It analyzes the speaker’s face in detail and generates more natural, accurate lip movements. Best for final production content where quality matters.
How it works
- Upload a source video containing a speaking person
- Upload replacement audio (new dialogue, translated speech, etc.)
- The model uses avatar inference to precisely re-animate the speaker’s lips
- Get back a video with high-fidelity synchronized lip movements
Use cases
- Professional dubbing: High-quality lip sync for film, TV, and streaming content
- Marketing videos: Polished localized videos for brand campaigns
- Training content: Professional-grade translated training materials
- Product demos: Re-voice product videos with precise lip sync
Options
- Dynamic duration: Let the output video length adjust to match the new audio (on by default)
- Music removal: Strip background music from the source video
- Speech enhancement: Improve speech clarity in the output
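As a rough sketch of how the options above might be bundled with the two inputs, the helper below builds a plain input dictionary. Every field name here (`video`, `audio`, `dynamic_duration`, `music_removal`, `speech_enhancement`) is a hypothetical placeholder, not the model's confirmed schema. Only the dynamic duration default comes from this README; the other two defaults are assumed to be off.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class LipsyncOptions:
    # Hypothetical option names; check the model's input schema for the real ones.
    dynamic_duration: bool = True      # on by default, per this README
    music_removal: bool = False        # assumed default (not stated in the README)
    speech_enhancement: bool = False   # assumed default (not stated in the README)


def build_input(video_url: str, audio_url: str,
                options: Optional[LipsyncOptions] = None) -> dict:
    """Combine the source video, replacement audio, and options into one dict."""
    opts = options if options is not None else LipsyncOptions()
    # "video" and "audio" are placeholder keys for the two required uploads.
    return {"video": video_url, "audio": audio_url, **asdict(opts)}


if __name__ == "__main__":
    payload = build_input(
        "https://example.com/source.mp4",
        "https://example.com/dub.wav",
        LipsyncOptions(music_removal=True),
    )
    print(payload)
```

Keeping the options in a small dataclass makes the defaults explicit in one place, so a caller only names the settings they want to change.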
Pricing
Billed at $0.0667 per second of output video.
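To make the per-second rate concrete, here is a minimal cost estimator. The function name `estimate_cost` is illustrative, and the result is left unrounded because this README does not say how billing rounds partial seconds or cents.

```python
from decimal import Decimal

# Rate quoted in this README: dollars per second of output video.
RATE_PER_SECOND = Decimal("0.0667")


def estimate_cost(output_seconds) -> Decimal:
    """Estimate the charge in dollars for a given output video length."""
    seconds = Decimal(str(output_seconds))
    if seconds < 0:
        raise ValueError("output_seconds must be non-negative")
    # No rounding applied: the actual billing granularity is not documented here.
    return seconds * RATE_PER_SECOND


if __name__ == "__main__":
    for length in (30, 60, 600):
        print(f"{length} s of output -> ${estimate_cost(length)}")
```

A 60-second output works out to 60 × 0.0667 = $4.002. With dynamic duration on, the output length follows the new audio, so the length of the replacement audio is a reasonable stand-in for `output_seconds`.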
For faster processing with audio-only lip sync, see heygen/lipsync-speed.