Vidu Q3 Turbo

Vidu Q3 Turbo generates video from text prompts, images, or a combination of both. It’s the faster variant of the Q3 series, optimized for quick iteration while still producing high-quality clips of up to 16 seconds at resolutions up to 1080p, with optional synchronized audio.

For maximum visual fidelity, use Vidu Q3 Pro. For faster generation at a lower price, use Q3 Turbo.

What it does

Vidu Q3 Turbo creates video in three modes, chosen automatically based on your inputs:

Text to video: Describe a scene and the model generates it
Image to video: Upload a starting image and a prompt describing the motion
Start-end to video: Upload both a starting and ending frame, and the model creates a smooth transition between them
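
The mode selection described above can be sketched in a few lines. This is only an illustration of the documented rule, not the model's actual implementation; the function name infer_mode and the mode labels it returns are hypothetical.

```python
from typing import Optional


def infer_mode(start_image: Optional[str] = None,
               end_image: Optional[str] = None) -> str:
    """Return the mode implied by which image inputs are present.

    Hypothetical helper: the name and the returned labels are
    illustrative and not part of the model's API.
    """
    if end_image is not None and start_image is None:
        # The parameter list states that end_image requires start_image.
        raise ValueError("end_image requires start_image")
    if start_image is None:
        return "text-to-video"
    if end_image is None:
        return "image-to-video"
    return "start-end-to-video"


print(infer_mode())                        # no images supplied
print(infer_mode("first.png"))             # start image only
print(infer_mode("first.png", "last.png")) # both frames
```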

The model produces natural-looking motion and camera movements with good temporal consistency. When audio is enabled, it generates synchronized sound effects, dialogue, and ambient audio.

How to use it

Text to video

Provide a prompt describing your scene. Use aspect_ratio to control the framing.

Image to video

Upload a start_image along with a prompt describing what should happen. The model animates your image into video. Supported formats: PNG, JPEG, WebP.

Start-end to video

Upload both start_image and end_image with a prompt. The model generates a video that transitions smoothly from the first frame to the last. Both images should have similar aspect ratios.

Writing effective prompts

Be specific about motion: “A woman in a red coat walks through falling snow” works better than “a person outside”
Describe camera movement if you want it: “slow dolly shot”, “aerial view pulling back”
For audio, describe sounds explicitly: “birds chirping”, “footsteps on gravel”
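
One way to apply these tips consistently is to assemble the prompt from separate pieces. The sketch below is an assumption about workflow, not a required format: the helper compose_prompt and the "Camera:" and "Sounds:" labels are arbitrary conventions chosen for illustration, and the model accepts free-form text.

```python
from typing import Iterable, Optional

MAX_PROMPT_CHARS = 5000  # prompt limit stated in the parameter list


def compose_prompt(scene: str,
                   camera: Optional[str] = None,
                   sounds: Iterable[str] = ()) -> str:
    """Join scene, camera, and sound descriptions into one prompt string.

    Hypothetical helper: the labels used here are one arbitrary
    convention, not something the model requires.
    """
    parts = [scene.strip().rstrip(".")]
    if camera:
        parts.append(f"Camera: {camera.strip()}")
    sound_list = [s.strip() for s in sounds if s.strip()]
    if sound_list:
        parts.append("Sounds: " + ", ".join(sound_list))
    prompt = ". ".join(parts) + "."
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    return prompt


print(compose_prompt(
    "A woman in a red coat walks through falling snow",
    camera="slow dolly shot",
    sounds=["footsteps on gravel"],
))
```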

Parameters

prompt: Text description of the video (up to 5,000 characters)
start_image: Starting frame image (enables image-to-video mode)
end_image: Ending frame image (requires start_image, enables start-end mode)
duration: Video length in seconds (1–16, default: 5)
resolution: Output resolution — 540p, 720p, or 1080p (default: 720p)
aspect_ratio: 16:9, 9:16, 3:4, 4:3, or 1:1 (text-to-video only, default: 16:9)
audio: Generate synchronized audio (default: true)
seed: Random seed for reproducible results
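
The constraints in this list can be checked on the client before a request is sent. The following is a minimal sketch, assuming the input is passed as a dictionary keyed by the parameter names above and that duration is a whole number of seconds; build_input is a hypothetical helper, and dropping aspect_ratio when a start image is present is an assumption based on the "text-to-video only" note.

```python
from typing import Any, Dict, Optional

RESOLUTIONS = {"540p", "720p", "1080p"}
ASPECT_RATIOS = {"16:9", "9:16", "3:4", "4:3", "1:1"}


def build_input(prompt: str,
                start_image: Optional[str] = None,
                end_image: Optional[str] = None,
                duration: int = 5,
                resolution: str = "720p",
                aspect_ratio: str = "16:9",
                audio: bool = True,
                seed: Optional[int] = None) -> Dict[str, Any]:
    """Validate values against the documented limits and return a dict."""
    if len(prompt) > 5000:
        raise ValueError("prompt is limited to 5,000 characters")
    if end_image is not None and start_image is None:
        raise ValueError("end_image requires start_image")
    if not 1 <= duration <= 16:
        raise ValueError("duration must be between 1 and 16 seconds")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "duration": duration,
        "resolution": resolution,
        "audio": audio,
    }
    if start_image is None:
        # aspect_ratio is documented as text-to-video only.
        if aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
        payload["aspect_ratio"] = aspect_ratio
    else:
        payload["start_image"] = start_image
        if end_image is not None:
            payload["end_image"] = end_image
    if seed is not None:
        payload["seed"] = seed
    return payload


print(build_input("A slow dolly shot through a foggy forest", duration=8))
```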

Pricing

Billed per second of video output, based on resolution:

Resolution	Price per second
540p	$0.04
720p	$0.06
1080p	$0.08

For example, a 5-second video at 720p costs 5 × $0.06 = $0.30.
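
The per-second billing above reduces to a single multiplication. This sketch uses the prices from the table and works in whole cents to avoid floating-point rounding; the function names are hypothetical, and it assumes duration is a whole number of seconds.

```python
# Prices from the table above, in cents per second of output.
PRICE_CENTS_PER_SECOND = {"540p": 4, "720p": 6, "1080p": 8}


def estimate_cost_cents(duration_seconds: int, resolution: str = "720p") -> int:
    """Return the estimated cost of one generation in cents."""
    if not 1 <= duration_seconds <= 16:
        raise ValueError("duration must be between 1 and 16 seconds")
    if resolution not in PRICE_CENTS_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution}")
    return duration_seconds * PRICE_CENTS_PER_SECOND[resolution]


def format_usd(cents: int) -> str:
    """Format a whole number of cents as a dollar amount."""
    return f"${cents / 100:.2f}"


print(format_usd(estimate_cost_cents(5, "720p")))    # $0.30
print(format_usd(estimate_cost_cents(16, "1080p")))  # $1.28
```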

Q3 Pro vs Q3 Turbo

Feature	Q3 Pro	Q3 Turbo
Visual fidelity	Higher	Good
Generation speed	Slower	Faster
Price (720p)	$0.15/sec	$0.06/sec
Best for	Final renders, high-quality content	Rapid prototyping, iteration, high-volume generation
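
To make the price difference concrete, the sketch below compares two hypothetical 720p workflows using only the per-second prices in the table: drafting on Turbo and rendering once on Pro, versus doing everything on Pro. The clip counts and lengths are made-up illustration values, and the helper name is hypothetical.

```python
# 720p prices from the comparison table, in cents per second.
PRICE_720P_CENTS = {"Q3 Pro": 15, "Q3 Turbo": 6}


def batch_cost_cents(variant: str, clips: int, seconds_per_clip: int) -> int:
    """Return the cost in cents of generating several clips at 720p."""
    return PRICE_720P_CENTS[variant] * clips * seconds_per_clip


# Illustration values: 10 drafts and 1 final render, 5 seconds each.
drafts, finals, seconds = 10, 1, 5

mixed = (batch_cost_cents("Q3 Turbo", drafts, seconds)
         + batch_cost_cents("Q3 Pro", finals, seconds))
pro_only = batch_cost_cents("Q3 Pro", drafts + finals, seconds)

print(f"Turbo drafts + Pro final: ${mixed / 100:.2f}")     # $3.75
print(f"Everything on Pro:        ${pro_only / 100:.2f}")  # $8.25
```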

What it’s good for

Rapid prototyping: Quickly test ideas before committing to a Pro render
Social media content: Generate short-form video at scale
Marketing and advertising: Create video content from text or product images
Animation: Bring still images to life with natural motion
Scene transitions: Use start-end mode to create smooth visual bridges between keyframes

Limitations

Maximum 16 seconds per generation
Audio generation adds dialogue and sound effects but doesn’t support background music control
Complex text rendering within the video may not be reliable
Very rapid, fine-grained hand movements can sometimes look unnatural

Model created 4 months, 1 week ago

Model updated 3 months ago
