How to prompt Veo 3 with images

Posted August 1, 2025 by

Image input in Veo 3 has been highly anticipated, and it’s now on Replicate. Here are some of the coolest and most useful tricks we discovered.

Style preservation

The biggest appeal of image input with Veo 3 is being able to animate your images while preserving their unique visual style. Whether it’s a cartoon, painting, or photograph, Veo 3 maintains the artistic integrity of your original image throughout the video.

The fire in the room begins to burn. Maintain the style of the image.

Input image: “This is fine” meme
Generated animation

In this example, we just told Veo 3 to keep the style the same with no constraints on the actual action or direction of the video. The model still does a great job at creating interesting motion and camera movement while preserving your image’s style:

Keep the style the same

Stylized Tiger
Input image
Motion and style preserved in animation.

Here, we gave some explicit action and camera directions. Veo 3 still retains the style reasonably well, even while handling the other dynamic elements of the prompt.

The man is running intensely away from a threat through wild, alien-like shrubbery. He says to his microphone, “This is Echo 1. I’m being pursued.” The camera swivels out from the man to reveal the jungle terrain. Maintain the animation style of the original image.

Running input image
Input image
Motion and style preserved in animation.

This goes beyond animation — check out how Veo 3 retains the filtering and color grading of the input image here.

The man rows the boat. Maintain the vintage feel of the image.

Rowing input image
Input image
Color graded video output

The quality of style preservation in Veo 3 is remarkable — from cartoon aesthetics to photographic color grading, the model maintains the visual characteristics that make your original image unique.

You might occasionally need to explicitly prompt Veo 3 to “maintain the style of the input image” or describe specific style qualities in your prompt, but for the most part it preserves visual style exceptionally well without additional guidance.
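If you call the model from code, one way to apply the style instruction consistently is to append it to every prompt. The following is a minimal sketch, not an official recipe. It assumes the Replicate Python client (`pip install replicate`), a `REPLICATE_API_TOKEN` in your environment, a model identifier of `google/veo-3`, and input fields named `prompt` and `image`; check the model page for the actual schema. The helper names `with_style_hint` and `animate_image` are ours, for illustration only.

```python
STYLE_HINT = "Maintain the style of the input image."


def with_style_hint(prompt: str, hint: str = STYLE_HINT) -> str:
    """Append a style-preservation instruction unless it is already present."""
    prompt = prompt.strip()
    if hint.lower().rstrip(".") in prompt.lower():
        return prompt  # the prompt already carries the instruction
    if prompt and prompt[-1] not in ".!?":
        prompt += "."  # close the last sentence before adding the hint
    return f"{prompt} {hint}".strip()


def animate_image(image_path: str, prompt: str, model: str = "google/veo-3"):
    """Send an image and a prompt to the video model.

    The model name and the input field names are assumptions.
    """
    import replicate  # imported lazily so with_style_hint works without the client

    with open(image_path, "rb") as image_file:
        return replicate.run(
            model,
            input={"prompt": with_style_hint(prompt), "image": image_file},
        )


if __name__ == "__main__":
    print(with_style_hint("The fire in the room begins to burn"))
```

Keeping the prompt helper separate from the API call makes it easy to try different style hints, such as “Maintain the vintage feel of the image”, without touching the request code.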

Typography

Veo 3 is really good at handling text and typography in images, and it comes up with some pretty wild animations. This makes it perfect for creating eye-catching ads. Check out this example where the text stays intact and gets animated smoothly:

The text swirls in as crème-colored ribbons, beautifully spelling out “Build with Replicate”

Input image: Build with Replicate
Generated animation

Note that the input image is always the starting frame of your video. The example above asks for free-flowing ribbons to form the “Build with Replicate” text, yet the video still begins with the finished input image on screen. You can simply trim these initial frames from the clip if needed.
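One way to drop the opening frames is to cut the start of the clip with ffmpeg. This is a minimal sketch, assuming ffmpeg is installed and on your PATH. The file names and the half-second offset are placeholders, and `build_trim_command` and `trim_start` are hypothetical helpers, not part of any library.

```python
import subprocess


def build_trim_command(src: str, dst: str, skip_seconds: float) -> list[str]:
    """Build an ffmpeg command that drops the first `skip_seconds` of `src`."""
    if skip_seconds < 0:
        raise ValueError("skip_seconds must be non-negative")
    return [
        "ffmpeg",
        "-y",                           # overwrite dst if it already exists
        "-ss", f"{skip_seconds:.3f}",   # seek to this offset in the input
        "-i", src,
        # No "-c copy" here: ffmpeg re-encodes, so the cut does not have to
        # land on a keyframe.
        dst,
    ]


def trim_start(src: str, dst: str, skip_seconds: float = 0.5) -> None:
    """Run ffmpeg and raise CalledProcessError if it fails."""
    subprocess.run(build_trim_command(src, dst, skip_seconds), check=True)


if __name__ == "__main__":
    print(" ".join(build_trim_command("veo_output.mp4", "trimmed.mp4", 0.5)))
```

How much to trim depends on how long the input image lingers in your particular generation, so it is worth previewing the clip before picking an offset.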

Even when the background is complex and dynamic, Veo 3 keeps text sharp and readable. Here’s an example where the typography stays crisp despite the intricate animation happening behind it:

A gradual zoom-out reveals the girl is standing next to a carbon copy of herself. Maintain the film quality of the original image. Maintain the text “replicate” on the screen for the first few seconds.

A “replicated” girl.

You can even combine complex videography with text transitions, as we did with this swirling animation of “Run Veo 3” over a cinematic running shot. We made this first image with Ideogram 3.0, by the way.

A grainy, vintage-style action shot of a man sprinting through a dense urban area. The scene is shot with a handheld camera feel—shaky, cinematic motion blur, and soft film grain. The words “Run VEO 3” swirl dynamically out of the environment. The lighting is warm, low-contrast golden hour with visible lens flares and deep shadows. Maintain the old indie film or VHS tape aesthetic.

Run Veo 3 input image
Input image
Text animated + style preserved

The model keeps the text legible and integrates it naturally into the animation.

More creative control

Although Veo 3 can generate impressive videos from text prompts alone, using an image as input gives you significantly more creative control. When you generate a video directly from a text prompt, you might not get exactly the style, composition, or mood you’re looking for. By first creating an image with your desired aesthetic using a specialized image model, then feeding that to Veo 3, you can achieve much more precise results.

For example, we used Ideogram 3.0 to generate an image with a specific Studio Ghibli aesthetic, capturing that distinctive art style perfectly.

Elephant input image
Input image

Then we fed this carefully crafted image to Veo 3, which preserved the style while bringing it to life with animation.

All we had to tell Veo 3 was…

Make him run!

“Studio Ghibli” style preserved

This means you can offload the stylistic work to your image generation step, leaving Veo 3 to handle action and camera movement so you get precisely the video you need.

A gradual zoom-out of a bee, elegantly floating above the metal pipe, fiddling at it with its arms. The bee then flies off camera.

Bee input image
Input image
Style of bee preserved.
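The two-step workflow described above (style from an image model, motion from Veo 3) can be chained in code. This is a rough sketch under several assumptions: the Replicate Python client is installed, `REPLICATE_API_TOKEN` is set, the model identifiers `ideogram-ai/ideogram-v3-turbo` and `google/veo-3` are correct, and Veo 3 accepts an image URL in an `image` field. The function names are hypothetical, and output shapes can differ between models and client versions, which is why `first_url` handles several cases.

```python
def first_url(output) -> str:
    """Return a URL string from a model output.

    Handles a list of outputs, a file-like object exposing `.url`,
    or a plain URL string.
    """
    if isinstance(output, (list, tuple)):
        output = output[0]
    return str(getattr(output, "url", output))


def image_then_video(
    style_prompt: str,
    motion_prompt: str,
    image_model: str = "ideogram-ai/ideogram-v3-turbo",  # assumed identifier
    video_model: str = "google/veo-3",                   # assumed identifier
):
    """Generate a styled still image, then animate it with the video model."""
    import replicate  # imported lazily so first_url can be used on its own

    # Step 1: put all of the style description into the image prompt.
    image = replicate.run(image_model, input={"prompt": style_prompt})

    # Step 2: keep the video prompt about action and camera movement only.
    return replicate.run(
        video_model,
        input={"prompt": motion_prompt, "image": first_url(image)},
    )
```

A call might look like `image_then_video("An elephant in a Studio Ghibli style meadow", "Make him run!")`, where the first prompt is an invented stand-in for whatever produced your still image.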

One of the most exciting features is the ability to animate only certain parts of an image, leaving the rest untouched. This allows for subtle, cinematic effects. For example, see how only the shoe is animated while the background remains static:

Rotate the shoe, keep everything else still.

MacPaint shoe input image
Input image
“Rotate the shoe only”

You can use this technique to draw attention to specific elements or create dynamic scenes from still images.

Veo 3 can also generate hyperrealistic, nature-documentary-style footage from scratch. We had a lot of fun creating these cinematic shots that look like they could be straight from a National Geographic special:

The chameleon looks around.

Hyperrealistic chameleon

The tiger emerges from the mossy pond, treading through the water to find something to eat.

Documentary-style tiger footage

We upscaled these videos with Topaz’s model.

The level of detail and realism in these text-to-video generations is remarkable: from the chameleon’s subtle eye movements to the tiger’s natural breathing patterns, Veo 3 can breathe life into your favorite wildlife shots.

It’s your turn!

Here are a few ideas to get your juices flowing with Veo 3:

  • Create a surreal portrait where the subject’s tattoos come alive and flow across their skin
  • Transform a vintage photograph into a cinemagraph where only the city lights twinkle
  • Bring a Renaissance painting to life with floating cherubs and swirling celestial objects
  • Animate your company logo with complex typography transitions
  • Create a Wes Anderson-esque landscape film
  • Turn photos of your pets into rockstars, like this:
Who knew dogs could drum? 🥁
Ready to start creating?

Try Veo 3


Share your creations with us on Discord or X – we’d love to see what you make!