PixVerse V6
PixVerse V6 is the latest video generation model from PixVerse, ranking among the top video models globally alongside Seedance 2.0. It generates up to 15 seconds of 1080p video with synchronized audio.
What V6 is good at
- Cinematic camera control. Smooth pans, tilts, zooms, and tracking shots with natural perspective shifts.
- Character emotion. Nuanced facial expressions and body language with consistent character-environment interaction.
- Dynamic physics. Object interactions adhere to real-world physics with believable collision feedback.
- Native text generation. Integrates English, Chinese, and other languages into scenes with sharp typography and high-precision positioning.
- First-person POV. High-speed motion from immersive first-person perspectives.
- Multi-shot short films. Generate multi-shot sequences in a single run using generate_multi_clip_switch.
Modes
- Text-to-video: Provide a prompt and an aspect_ratio.
- Image-to-video: Provide a prompt and an image.
- Transition (first/last frame): Provide a prompt, image (first frame), and last_frame_image (last frame).
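The three modes differ only in which inputs are supplied, which can be sketched as a small helper that assembles the input payload. This is an illustration, not the official client: build_input is a hypothetical function, and the example values ("16:9", the image URLs) are assumptions. Only the field names prompt, aspect_ratio, image, and last_frame_image come from the list above.

```python
from typing import Optional


def build_input(
    prompt: str,
    aspect_ratio: Optional[str] = None,
    image: Optional[str] = None,
    last_frame_image: Optional[str] = None,
) -> dict:
    """Assemble an input payload for one of the three modes (hypothetical helper)."""
    if last_frame_image is not None and image is None:
        raise ValueError("transition mode needs image (first frame) as well")

    payload = {"prompt": prompt}
    if image is None:
        # Text-to-video: prompt plus aspect_ratio.
        if aspect_ratio is None:
            raise ValueError("text-to-video needs an aspect_ratio")
        payload["aspect_ratio"] = aspect_ratio
    else:
        # Image-to-video: prompt plus image (used as the first frame).
        payload["image"] = image
        if last_frame_image is not None:
            # Transition: the last frame is supplied too.
            payload["last_frame_image"] = last_frame_image
    return payload


# Example values below are placeholders, not documented defaults.
text_to_video = build_input("a fox running through snow", aspect_ratio="16:9")
transition = build_input(
    "day turns to night over a city",
    image="https://example.com/first.png",
    last_frame_image="https://example.com/last.png",
)
```

The resulting dictionary would then be passed as the model input by whichever client you use to run the model.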
Audio
Set generate_audio_switch: true to generate synchronized audio alongside the video, including background music, sound effects, and character dialogue. Audio generation adds to the per-second price.
Multi-shot mode
Set generate_multi_clip_switch: true to generate cinematic sequences with multiple shots and scene transitions. Works with text-to-video and image-to-video. Write your prompt as a series of shots:
Shot 1, wide shot of a hiker on a snowy mountain. Shot 2, close-up of their determined face. Shot 3, aerial view of the summit.
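If your shots live in a list, the numbered prompt format above can be produced programmatically. The following is a minimal sketch: multi_shot_prompt is a hypothetical helper, and the "Shot N, description." pattern is inferred from the example prompt rather than from a documented grammar.

```python
from typing import List


def multi_shot_prompt(shots: List[str]) -> str:
    """Join shot descriptions into a single 'Shot 1, ... Shot 2, ...' prompt."""
    parts = []
    for number, description in enumerate(shots, start=1):
        # Normalize trailing punctuation so every shot ends with one period.
        text = description.strip().rstrip(".")
        parts.append(f"Shot {number}, {text}.")
    return " ".join(parts)


shots = [
    "wide shot of a hiker on a snowy mountain",
    "close-up of their determined face",
    "aerial view of the summit",
]

model_input = {
    "prompt": multi_shot_prompt(shots),
    "aspect_ratio": "16:9",  # assumed example value
    "generate_multi_clip_switch": True,
}
```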
Pricing
Billing is per output second, tiered by resolution and audio.
| Resolution | No audio | With audio |
|---|---|---|
| 360p | $0.05/s | $0.07/s |
| 540p | $0.07/s | $0.09/s |
| 720p | $0.09/s | $0.12/s |
| 1080p | $0.18/s | $0.23/s |
A 5-second 720p video with audio costs $0.60. A 10-second 1080p video without audio costs $1.80.
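To estimate the cost of other combinations, the table can be turned into a lookup. This is a sketch that hard-codes the rates listed above; estimate_cost is a hypothetical helper, and Decimal is used so that the totals are exact rather than subject to floating-point rounding.

```python
from decimal import Decimal

# Price per output second as (no audio, with audio), copied from the table above.
PRICE_PER_SECOND = {
    "360p": (Decimal("0.05"), Decimal("0.07")),
    "540p": (Decimal("0.07"), Decimal("0.09")),
    "720p": (Decimal("0.09"), Decimal("0.12")),
    "1080p": (Decimal("0.18"), Decimal("0.23")),
}


def estimate_cost(resolution: str, seconds: int, audio: bool = False) -> Decimal:
    """Return the estimated price in USD for a video of the given length."""
    no_audio_rate, audio_rate = PRICE_PER_SECOND[resolution]
    rate = audio_rate if audio else no_audio_rate
    return rate * seconds


print(estimate_cost("720p", 5, audio=True))   # 0.60
print(estimate_cost("1080p", 10))             # 1.80
```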