HeyGen Avatar V

Generate realistic talking-avatar videos from text using Avatar V, the newest generation of HeyGen’s avatar engine. Avatar V uses cross-reference-driven animation, analyzing the avatar and the audio together, to produce more natural motion and lip-sync than Avatar IV.

When to use Avatar V vs Avatar IV

Avatar V (this model) — Newer engine, better motion quality, more coherent expressions, especially on longer scripts. Only works with avatars trained for Avatar V.
Avatar IV — Works with any HeyGen avatar including arbitrary photos. Use this if your avatar isn’t Avatar V-eligible, or if you need motion_prompt/expressiveness controls.
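
The guidance above can be sketched as a small selection helper. This is only an illustration: the engine labels "avatar_v" and "avatar_iv" and the function name choose_engine are hypothetical, since this page does not show the actual strings HeyGen returns in supported_api_engines.

```python
from typing import Iterable

# Hypothetical engine labels; the real values listed in
# supported_api_engines are not documented on this page.
AVATAR_V = "avatar_v"
AVATAR_IV = "avatar_iv"


def choose_engine(supported_engines: Iterable[str],
                  needs_motion_controls: bool = False) -> str:
    """Pick an engine for an avatar, following the guidance above."""
    engines = {engine.lower() for engine in supported_engines}
    # motion_prompt/expressiveness controls are described as Avatar IV features.
    if needs_motion_controls:
        return AVATAR_IV
    # Avatar V only works with avatars trained for it.
    if AVATAR_V in engines:
        return AVATAR_V
    # Avatar IV works with any HeyGen avatar, so it is the fallback.
    return AVATAR_IV
```

For example, choose_engine(["avatar_iv"]) falls back to Avatar IV, while an avatar that lists both engines gets Avatar V unless motion controls are requested.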

Inputs

avatar_id — The avatar to use. Must support Avatar V — not every avatar does. Inspect supported_api_engines on the look via GET /v3/avatars/looks/{id} before requesting it.
input_text — The script the avatar will speak (up to 5,000 characters).
voice_id — The voice to use. Get IDs from HeyGen’s List All Voices API.
resolution — Output resolution: 720p, 1080p, or 4k.
aspect_ratio — 16:9 (landscape) or 9:16 (portrait).
voice_speed — Speech speed from 0.5× to 1.5×.
caption — Whether to burn captions into the video.
title — Optional video title shown in your HeyGen dashboard.
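
As a rough sketch, the constraints listed above can be checked client-side before a request is submitted. The class name AvatarVInput, the default values, and the rejection of empty strings are assumptions made for illustration; only the field names, the 5,000-character limit, the resolution and aspect-ratio options, and the 0.5 to 1.5 speed range come from the list above.

```python
from dataclasses import asdict, dataclass
from typing import Optional

MAX_TEXT_CHARS = 5000
RESOLUTIONS = {"720p", "1080p", "4k"}
ASPECT_RATIOS = {"16:9", "9:16"}
MIN_SPEED, MAX_SPEED = 0.5, 1.5


@dataclass
class AvatarVInput:
    avatar_id: str
    input_text: str
    voice_id: str
    # The defaults below are assumptions, not documented values.
    resolution: str = "1080p"
    aspect_ratio: str = "16:9"
    voice_speed: float = 1.0
    caption: bool = False
    title: Optional[str] = None

    def validate(self) -> None:
        """Raise ValueError if any field violates the documented limits."""
        if not self.avatar_id or not self.voice_id:
            raise ValueError("avatar_id and voice_id are required")
        if not self.input_text or len(self.input_text) > MAX_TEXT_CHARS:
            raise ValueError(f"input_text must be 1 to {MAX_TEXT_CHARS} characters")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
        if not MIN_SPEED <= self.voice_speed <= MAX_SPEED:
            raise ValueError(f"voice_speed must be between {MIN_SPEED} and {MAX_SPEED}")

    def to_payload(self) -> dict:
        """Validate, then return a dict with unset optional fields omitted."""
        self.validate()
        return {key: value for key, value in asdict(self).items() if value is not None}
```

Calling AvatarVInput(avatar_id="...", input_text="Hello", voice_id="...").to_payload() yields a plain dict that can be passed to whichever client is used to run the model. The check does not confirm that the avatar supports Avatar V; that still requires inspecting supported_api_engines.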

Output

Returns an MP4 video of the avatar speaking the provided text.

Use cases

Marketing — Personalized video ads with high-quality avatars.
Education — Multilingual course videos with consistent presenters.
Sales — Personalized outreach videos at scale.
Social media — Talking-head content without a camera.
