How to prompt Grok Imagine Video 1.5
We’re particularly excited about Grok Imagine Video 1.5. It’s a big jump with better aesthetic precision and physics adherence. We ran a bunch of prompts through it to see what it could actually do, and we also put together a prompting guide to help you understand how to get the most out of this model.
Video examples
Hong Kong, 2am
The woman’s gaze drops slowly from the camera to the pavement in front of her. Rain continues to fall heavily all around, streaking past the neon signs. The red and green neon reflections ripple in the wet street. The hem of her dress shifts slightly in a warm gust. Sound: heavy rain drumming on pavement and corrugated metal awnings, the low buzz of neon signs, a distant scooter engine fading away, the hiss of tires on a wet road.
Cloud forest, Costa Rica
The panther stands completely still on the mossy log, its amber eyes locked directly on the camera. Its tail sweeps slowly once to the right. The shaft of light from above shifts slightly, the mist swirling around it. The panther’s nostrils flare once, then its ears rotate forward. Sound: deep rainforest quiet, the distant drip of water on broad leaves, a barely-audible rumble from deep in the cat’s chest, a single bird call high in the canopy.
Close-up, firelight
A slow warm breeze moves through the frame, lifting several strands of hair and drifting them across her cheek, then settling. The firelight on her skin breathes and flickers, casting shifting shadows across her brow. Her expression stays completely still. Sound: the soft crackle of burning wood just out of frame, a slow exhale, the distant low moan of wind outside.
Candlelight
The three candle flames flutter in a slow breath of air that passes through the room, bending left then settling upright again. The warm amber light on the linen shifts and pulses with each flicker. A thin trail of smoke rises from the wick of the left candle. Wax begins to pool and run slowly down the side of the center candle. The wine glasses catch each flicker as a tiny moving reflection. Sound: the deep silence of a formal room, the faint tick of a clock somewhere off-frame, the soft crackle of burning wick, a barely audible breath.
Iceland from above
Slow aerial push-in toward the tiny red figure standing at the edge of the turquoise glacial river. The braided water flows in slow swirls across the black sand. The camera drifts gently to the left as it descends. The figure raises a hand to shield their eyes against the sun. A thin veil of mist drifts past the lens. Sound: the muffled chop of distant helicopter rotors heard from inside a cabin, the rush of high-altitude glacial wind, the faint roar of meltwater rivers below, the headset breathing of a pilot.
Person on a motorcycle
The superbike continues leaning hard through the corner at full speed, the knee slider scraping asphalt and throwing a long trail of orange sparks behind it. The stone walls blur past in a torrent of motion. The bike’s exhaust pops twice as the rider shifts mid-corner. The rider’s helmet stays locked on the apex. Sound: the screaming high-pitched wail of a 1000cc superbike engine at 13,000 RPM, the metallic scrape of titanium slider on asphalt, the doppler-shift roar as the bike rockets past the camera, the throaty pop of an aftermarket exhaust on overrun.
Breaking wave
The wave crests fully and pitches forward, the translucent green crest folding and crashing down onto the dark rocks with tremendous force. White foam explodes upward and outward, hanging for a moment before collapsing back. Sea spray drifts across the frame in the dawn wind. The water rushes back off the rocks in white rivulets. A second smaller wave rises behind. Sound: the deep boom of a heavy swell hitting rock, the hiss and rush of water pulling back across stone, the low moan of wind across an open coastline, sea spray on a microphone.
Dawn run, Bangkok
The camera tracks alongside the runners in Bangkok as they continue sprinting, all four pumping their arms and legs in perfect lockstep, breath visible in the cool morning air. The lead runner glances briefly at the camera, then snaps his focus back ahead. The shopfront shutters, parked motorbikes, and pedestrians on the curb streak past in heavy horizontal motion blur. Sound: the rhythmic slap of running shoes on pavement, heavy synchronized breathing, the rumble of a distant scooter, the muffled chatter of an early morning street market.
Quiet afternoon
The figure slowly lowers the phone from their ear, exhales, and lets their hand fall to their side. They turn their head almost imperceptibly toward the room. Dust motes drift through the shaft of golden light. The cat lifts its head, ears swivelling. The CRT television flickers once. The sheer curtains stir in a slow breeze. Sound: distant city traffic muffled through glass, a kitchen tap dripping somewhere off-screen, the soft hum of the old TV, the creak of a wood floor.
Hopefully, these give a good sense of just how far you can push Grok Imagine 1.5.
How to prompt it
After experimenting with Grok Imagine 1.5 quite a bit, we came up with the following prompting tips that can really elevate your outputs.
Write the Sound: section like a sound designer
Every example above has an explicit Sound: section. Signaling this to the model and describing how you want the sound designed can make or break the final result.
Vague: Sound: city sounds, rain.
Specific: Sound: heavy rain drumming on corrugated metal awnings, the low buzz of neon sign transformers, a distant scooter fading away, the hiss of tires on wet road.
It knows the difference between rain on pavement and rain on metal. You can be as granular as you want, and it will keep up.
A few things that work particularly well: “heard from inside a cabin,” “sea spray on a microphone,” “the headset breathing of a pilot,” “muffled through glass.” These are spatial and material cues: they tell the model where the sound is heard from and what surfaces it interacts with.
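If you generate prompts in code, it can help to keep sound cues as a list and join them into the Sound: section at the end. The helper below is a minimal sketch; `sound_section` is a hypothetical name of ours, not part of any library, and the output format simply mirrors the examples above.

```python
def sound_section(cues):
    """Join individual sound cues into a single 'Sound:' clause.

    Hypothetical helper: the format mirrors the example prompts in this post.
    """
    # Drop empty entries and trailing periods so cues join cleanly.
    cleaned = [c.strip().rstrip(".") for c in cues if c.strip()]
    if not cleaned:
        raise ValueError("provide at least one sound cue")
    return "Sound: " + ", ".join(cleaned) + "."


cues = [
    "heavy rain drumming on corrugated metal awnings",
    "the low buzz of neon sign transformers",
    "a distant scooter fading away",
    "the hiss of tires on wet road",
]
print(sound_section(cues))
```

Keeping one cue per list entry makes it easy to swap a vague cue ("rain") for a specific one without rewriting the whole prompt.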
Use intensity modifiers
Without them, the model picks its own interpretation of scale. “The wave crests” is ambiguous. “The wave crests fully and pitches forward, crashing down with tremendous force” tells the model how big the moment should be.
The motorcycle scene works, for instance, because of “screaming high-pitched wail,” “long trail of orange sparks,” and “rockets past the camera.” Remove those words and you get a duller clip.
Describe camera movement
The model holds the camera static if you don’t ask for movement, which is generally the right call. A locked camera with patient motion reads as more cinematic than unnecessary moves. But when you want a specific camera move, say so explicitly.
Things that work: slow push-in, aerial push-in toward, camera drifts gently to the left, tracking shot alongside, locked, static. The Iceland clip asks for “slow aerial push-in” and “camera drifts gently to the left as it descends.”
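One way to make the camera direction an explicit choice is to keep it as a separate argument and prepend it only when you want movement. This is a rough sketch; `with_camera` is a hypothetical helper, and putting the camera sentence first is just our convention here, borrowed from how the Iceland prompt opens.

```python
def with_camera(action, move=None):
    """Prepend an explicit camera direction to a prompt.

    Hypothetical helper. With no move given, the prompt is returned unchanged,
    leaving the model to hold the camera static.
    """
    move = (move or "").strip().rstrip(".")
    if not move:
        return action
    # Capitalize only the first letter so proper nouns in the move survive.
    return f"{move[0].upper()}{move[1:]}. {action}"


action = "The figure raises a hand to shield their eyes against the sun."
print(with_camera(action))
print(with_camera(action, "slow aerial push-in toward the tiny red figure"))
```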
Keep it focused
The model handles focused prompts better than sprawling ones. The firelight close-up is three sentences: breeze moves hair, light flickers, expression stays still. The candle scene gives each candle its own micro-action. You can home in on specific objects while keeping the rest of your composition still or faded out.
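A quick way to check whether a prompt is still focused is to count the visual sentences that come before the Sound: section. The sketch below assumes the prompt uses the literal marker "Sound:" as in the examples above; `split_prompt` is a hypothetical helper, and the sentence splitting is deliberately naive.

```python
import re


def split_prompt(prompt):
    """Split a prompt into its visual sentences and its sound description.

    Hypothetical helper: assumes the literal marker 'Sound:' separates the two.
    """
    visual, _, sound = prompt.partition("Sound:")
    # Naive split: break on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", visual.strip()) if s]
    return sentences, sound.strip()


firelight = (
    "A slow warm breeze moves through the frame, lifting several strands of hair "
    "and drifting them across her cheek, then settling. The firelight on her skin "
    "breathes and flickers, casting shifting shadows across her brow. "
    "Her expression stays completely still. "
    "Sound: the soft crackle of burning wood just out of frame, a slow exhale, "
    "the distant low moan of wind outside."
)

sentences, sound = split_prompt(firelight)
print(len(sentences))  # 3
for s in sentences:
    print("-", s)
```

If the count keeps climbing as you iterate, that is a hint to cut actions rather than add more.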
Starting with the image
The best way to use Video 1.5 is to start with a still you’ve already dialed in. Use any image generator, like Grok Imagine Image, or your own photo to nail the composition and lighting first. Once the frame looks right, the video prompt only needs to say what changes.
Iridescent form
Starting image:
Abstract 3D render of a large glossy morphic form — smooth curved surfaces of transparent glass or liquid chrome, refracting prismatic iridescent color bands of cyan, magenta, gold, and electric blue against a pure black background. Hyperreal studio lighting, physically accurate reflections and refractions.

Then passed to Video 1.5:
The glossy morphic form slowly undulates and breathes, its surfaces shifting like liquid mercury. The prismatic iridescent bands — cyan, magenta, gold, electric blue — flow and ripple across the curves as the shape subtly deforms and reforms. The light refracts differently as the surface tension shifts. The form rotates almost imperceptibly. Sound: a deep resonant hum, like the inside of a seashell, the faint crystalline ring of glass under tension, slow and meditative.
Wabi-sabi interior
Starting image:
A minimalist Belgian wabi-sabi interior. A long low linen-upholstered sofa in a sandy oatmeal tone sits against a tactile cream lime-plaster wall. A single rough-hewn dark walnut coffee table sits in front of it on a polished concrete floor. On a built-in concrete plinth: a squat ceramic table lamp with a dark earth-brown clay base and a soft cream linen shade, casting a low warm glow. A heavy linen throw drapes asymmetrically across the sofa. No decoration, no clutter, no pattern. The architectural photography of Vincent Van Duysen and Axel Vervoordt.

Then passed to Video 1.5:
The afternoon sunlight coming through an unseen window slowly shifts and dims as time passes. The shaft of warm golden light that falls across the linen sofa and concrete floor moves gradually to the right and narrows, the color shifting from warm amber to cooler blue as the hour advances toward evening. The lamp’s warm glow becomes more pronounced as the room darkens. Shadows deepen in the corners. Sound: deep interior quiet, the barely audible ambient hum of the city outside, a building settling in the cooling air.
The still handles composition and color while the video prompt handles motion. Keeping them separate can make both easier to iterate on.
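In code, that separation can look like the sketch below: the still is passed by URL and the prompt describes only motion and sound. `video_input` is a hypothetical helper, the image URL is a placeholder, and the input keys and default values (`prompt`, `image`, `duration`, `resolution`) are taken from the Replicate example in this post rather than from a full schema.

```python
def video_input(still_url, motion, sound_cues, duration=8, resolution="720p"):
    """Build an input payload for the video step from an existing still.

    Hypothetical helper: the keys and defaults mirror the Replicate example
    in this post and may not cover every option the model accepts.
    """
    prompt = f"{motion.strip()} Sound: {', '.join(sound_cues)}."
    return {
        "prompt": prompt,      # only what changes: motion and sound
        "image": still_url,    # the still carries composition and color
        "duration": duration,
        "resolution": resolution,
    }


payload = video_input(
    "https://example.com/iridescent-form.png",  # placeholder URL
    "The glossy morphic form slowly undulates and breathes, its surfaces "
    "shifting like liquid mercury. The form rotates almost imperceptibly.",
    [
        "a deep resonant hum, like the inside of a seashell",
        "the faint crystalline ring of glass under tension",
    ],
)
print(payload["prompt"])
# The payload could then be passed on, for example:
# replicate.run("xai/grok-imagine-video-1.5", input=payload)
```

Because the still is fixed, you can rerun the video step with a different motion sentence or sound list without touching the composition.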
Run it on Replicate
# Python (the client reads REPLICATE_API_TOKEN from the environment)
import replicate

output = replicate.run(
    "xai/grok-imagine-video-1.5",
    input={
        "prompt": "The fisherman slowly turns his head to look out across the open Atlantic. Sound: the lap of cold ocean water against the wooden hull, distant gulls, the creak of a wooden boat.",
        "image": "https://example.com/fisherman.png",  # starting still to animate
        "duration": 8,
        "resolution": "720p",
    },
)
print(output)

// JavaScript (the client reads REPLICATE_API_TOKEN from the environment)
import Replicate from "replicate";

const replicate = new Replicate();
const output = await replicate.run(
  "xai/grok-imagine-video-1.5",
  {
    input: {
      prompt: "The fisherman slowly turns his head to look out across the open Atlantic. Sound: the lap of cold ocean water against the wooden hull, distant gulls, the creak of a wooden boat.",
      image: "https://example.com/fisherman.png", // starting still to animate
      duration: 8,
      resolution: "720p",
    }
  }
);
console.log(output);