HeyGen Lipsync Precision

Replace or dub the audio on an existing video with high-accuracy lip sync driven by avatar inference. Given a video and new audio, it re-animates the speaker’s lips so that they closely match the new audio.

Overview

Lipsync Precision uses avatar inference to achieve higher-quality lip synchronization than the speed mode. It analyzes the speaker’s face in detail and generates more natural, accurate lip movements. It is best suited to final production content, where quality matters most.

How it works

1. Upload a source video containing a speaking person.
2. Upload replacement audio (new dialogue, translated speech, etc.).
3. The model uses avatar inference to precisely re-animate the speaker’s lips.
4. Get back a video with high-fidelity synchronized lip movements.

Use cases

Professional dubbing: High-quality lip sync for film, TV, and streaming content
Marketing videos: Polished localized videos for brand campaigns
Training content: Professional-grade translated training materials
Product demos: Re-voice product videos with precise lip sync

Options

Dynamic duration: Let the output video length adjust to match the new audio (on by default)
Music removal: Strip background music from the source video
Speech enhancement: Improve speech clarity in the output
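
As a rough illustration, the two inputs and three options above could be collected into a single input payload like this. This is only a sketch: every field name here (`video_url`, `audio_url`, `dynamic_duration`, `music_removal`, `speech_enhancement`) is hypothetical and not taken from the model's input schema, so check the schema before relying on any of them. Only the "on by default" setting for dynamic duration comes from the list above; the other two defaults are assumed to be off.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LipsyncOptions:
    # Field names are hypothetical; check the model's actual input schema.
    dynamic_duration: bool = True      # on by default, per the option list
    music_removal: bool = False        # default assumed, not documented here
    speech_enhancement: bool = False   # default assumed, not documented here


def build_input(video_url: str, audio_url: str,
                options: LipsyncOptions = LipsyncOptions()) -> dict:
    """Assemble a plain dict holding the source video, new audio, and options."""
    if not video_url or not audio_url:
        raise ValueError("both a source video and replacement audio are required")
    payload = {"video_url": video_url, "audio_url": audio_url}
    payload.update(asdict(options))  # merge the three option flags
    return payload


# Example: keep dynamic duration on, and also strip background music.
payload = build_input(
    "https://example.com/source.mp4",
    "https://example.com/dub.mp3",
    LipsyncOptions(music_removal=True),
)
```

Keeping the options in a small frozen dataclass makes the defaults explicit and keeps the payload-building step separate from whatever client actually submits the request.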

Pricing

Billed per second of output video at $0.0667/second.
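
To estimate a job's cost before running it, multiply the expected output length by the per-second rate. The sketch below does this with `Decimal` to avoid floating-point drift. Rounding to the nearest cent is an assumption made for display only; the billing granularity is not stated here. The function name `estimate_cost` is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

# USD per second of output video, from the pricing line above.
RATE_PER_SECOND = Decimal("0.0667")


def estimate_cost(output_seconds) -> Decimal:
    """Estimated charge in USD, rounded to the nearest cent (rounding is assumed)."""
    seconds = Decimal(str(output_seconds))
    if seconds < 0:
        raise ValueError("duration must be non-negative")
    total = seconds * RATE_PER_SECOND
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(estimate_cost(60))    # one minute of output
print(estimate_cost(90.5))  # a 90.5-second clip
```

One minute of output comes to about $4.00. With dynamic duration enabled, the output length follows the new audio, so the audio's duration is the number to plug in.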

For faster processing with audio-only lip sync, see heygen/lipsync-speed.
