google/veo-3.1-fast

New and improved version of Veo 3 Fast, with higher-fidelity video, context-aware audio and last frame support


Veo 3.1 Fast

A faster version of Google’s Veo 3.1 video generation model. Veo 3.1 Fast creates high-quality videos with synchronized native audio from text prompts or images, optimized for faster generation times.

For higher-quality output at the cost of longer generation times, see google/veo-3.1.

Key features

Synchronized audio generation – Generates rich native audio automatically, including natural conversations, sound effects, and ambient soundscapes, all synchronized with the video content.

Image-to-video – Transform static images into dynamic videos with strong prompt adherence and visual quality.

Reference image support – Upload up to 3 reference images to guide appearance, style, and character consistency across the generated video.

Frame-to-frame generation – Provide a starting and ending frame, and the model generates smooth transitions between them.

Multiple output formats – Generate videos at 720p or 1080p resolution at 24 FPS, in either landscape (16:9) or portrait (9:16) aspect ratio. Choose a duration of 4, 6, or 8 seconds.

Faster generation – Optimized for speed while maintaining high visual quality, making it a good fit for rapid iteration and experimentation.
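
The limits in the feature list can be checked before a request is sent. The following is a minimal sketch in Python, not the model's actual input schema: the function name, the dictionary keys (prompt, resolution, aspect_ratio, duration, reference_images, first_frame, last_frame), and the default values are all assumptions chosen for illustration. Only the allowed values themselves come from the list above.

```python
# Limits copied from the feature list above.
RESOLUTIONS = ("720p", "1080p")
ASPECT_RATIOS = ("16:9", "9:16")
DURATIONS_SECONDS = (4, 6, 8)
MAX_REFERENCE_IMAGES = 3


def build_video_options(prompt, resolution="720p", aspect_ratio="16:9",
                        duration=8, reference_images=None,
                        first_frame=None, last_frame=None):
    """Validate options against the documented limits and return a dict.

    Every key name and default here is hypothetical, not a confirmed schema.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
    if duration not in DURATIONS_SECONDS:
        raise ValueError(f"duration must be one of {DURATIONS_SECONDS}")

    reference_images = list(reference_images or [])
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"at most {MAX_REFERENCE_IMAGES} reference images")

    # Assumption: frame-to-frame generation needs both a starting and an
    # ending frame, so an ending frame alone is rejected.
    if last_frame is not None and first_frame is None:
        raise ValueError("last_frame requires first_frame")

    options = {
        "prompt": prompt.strip(),
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    # Optional inputs are included only when they were provided.
    if reference_images:
        options["reference_images"] = reference_images
    if first_frame is not None:
        options["first_frame"] = first_frame
    if last_frame is not None:
        options["last_frame"] = last_frame
    return options
```

For example, build_video_options("An owl at night", resolution="1080p", aspect_ratio="9:16", duration=6) returns a four-key dictionary, while duration=5 or a fourth reference image raises ValueError. The key names would need to be mapped to whatever the hosting API actually expects.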

Tips

Be specific in your prompts – Include details about camera angles, lighting, mood, and any audio elements you want. For example: “A medium shot of a wise owl circling above a moonlit forest clearing, with the sound of flapping wings and a gentle orchestral score.”

Use reference images – For character or style consistency, choose clear, well-lit images that show the subject from the desired angle.

Image-to-video – Use high-quality input images with clear subjects. Describe the motion and action you want, not just what’s already in the image.

Audio guidance – Describe the sounds you want in your prompt, using phrases like “with bird songs and wind rustling” or “accompanied by upbeat music.”
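
One way to apply these tips consistently is to assemble prompts from separate visual and audio parts. The sketch below is illustrative only: compose_prompt and join_with_and are hypothetical helpers, not part of any Veo or hosting-platform API, and the sentence pattern is modeled on the owl example in the first tip.

```python
def join_with_and(items):
    """Join phrases as 'a', 'a and b', or 'a, b, and c'; skip empty ones."""
    items = [item.strip() for item in items if item and item.strip()]
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def compose_prompt(shot, subject, details=(), audio=()):
    """Build one prompt sentence: shot and subject, extra details, then audio."""
    parts = [f"{shot.strip()} of {subject.strip()}"]
    # Visual details such as lighting or mood, in the order given.
    parts.extend(d.strip() for d in details if d and d.strip())
    # Audio cues go last, introduced by "with".
    sounds = join_with_and(audio)
    if sounds:
        parts.append(f"with {sounds}")
    return ", ".join(parts) + "."


print(compose_prompt(
    "A medium shot",
    "a wise owl circling above a moonlit forest clearing",
    audio=["the sound of flapping wings", "a gentle orchestral score"],
))
# prints: A medium shot of a wise owl circling above a moonlit forest clearing, with the sound of flapping wings and a gentle orchestral score.
```

Keeping the audio cues in their own argument makes it easy to vary the soundtrack while holding the visual description fixed during iteration.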

About Veo 3.1

Veo 3.1 builds on Google’s Veo 3 foundation with improvements in prompt adherence and audiovisual quality, particularly for image-to-video generation. All videos are marked with SynthID, Google’s watermarking technology for identifying AI-generated content.

Learn more

For detailed API information, see Google’s Gemini API documentation.
