Wan 2.7 VideoEdit
Wan 2.7 VideoEdit is an instruction-based video editing model from Alibaba’s Wan family. Give it a video and describe what you want changed — swap backgrounds, shift lighting, adjust colors, change clothing, or apply style transfers — and it edits the video while preserving the original motion and structure.
How it works
Unlike text-to-video generation, which creates footage from scratch, VideoEdit modifies an existing clip based on your instructions. The model processes your source video alongside a natural language editing command and outputs a new video with the requested changes applied.
This makes it useful for iterating on existing footage without starting over — you can refine a clip step by step instead of regenerating it each time.
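The step-by-step workflow above can be sketched as a small helper that feeds each edit's output back in as the next source clip. This is an illustrative sketch, not an official API: `run_edit` is a hypothetical stand-in for whatever client call actually invokes the model.

```python
def refine(video, prompts, run_edit):
    """Apply a list of editing prompts to a video, one at a time.

    Each edit uses the previous output as its new source, so changes
    accumulate incrementally instead of being regenerated from scratch.
    `run_edit(video=..., prompt=...)` is a placeholder for your client
    call and should return the edited video.
    """
    current = video
    for prompt in prompts:
        current = run_edit(video=current, prompt=prompt)
    return current
```

Because each call only sees the latest clip, you can keep every prompt focused on a single change, which matches the tips further down.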
Inputs
- video — Source video to edit (mp4 or mov, 2–10 seconds)
- prompt — Editing instructions describing what to change (e.g. “change the background to a snowy mountain landscape” or “make the person wear a red dress”)
- reference_image — Optional reference image to guide the edit (e.g. a target style or look)
- resolution — Output resolution: 720p or 1080p (default: 1080p)
- aspect_ratio — Output aspect ratio: auto, 16:9, 9:16, 1:1, 4:3, or 3:4 (default: auto, matches input)
- audio_setting — Audio behavior: “auto” lets the model decide whether to regenerate audio, “origin” keeps the original audio track
- duration — Output duration in seconds (2–10). If not set, matches the input video length
- seed — Random seed for reproducible results
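A minimal sketch of assembling an input payload from these parameters. The field names and allowed values mirror the list above; the validation logic is my own illustration based on the documented ranges, not part of any official client library.

```python
# Allowed values taken from the input descriptions above.
VALID_RESOLUTIONS = {"720p", "1080p"}
VALID_ASPECT_RATIOS = {"auto", "16:9", "9:16", "1:1", "4:3", "3:4"}
VALID_AUDIO_SETTINGS = {"auto", "origin"}


def build_input(video, prompt, *, reference_image=None, resolution="1080p",
                aspect_ratio="auto", audio_setting="auto", duration=None,
                seed=None):
    """Build and sanity-check an input dict for the model.

    Optional fields (reference_image, duration, seed) are omitted from
    the payload when unset, so the model falls back to its defaults
    (e.g. duration matches the input video length).
    """
    if resolution not in VALID_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(VALID_RESOLUTIONS)}")
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(VALID_ASPECT_RATIOS)}")
    if audio_setting not in VALID_AUDIO_SETTINGS:
        raise ValueError(f"audio_setting must be one of {sorted(VALID_AUDIO_SETTINGS)}")
    if duration is not None and not 2 <= duration <= 10:
        raise ValueError("duration must be between 2 and 10 seconds")

    payload = {
        "video": video,
        "prompt": prompt,
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "audio_setting": audio_setting,
    }
    if reference_image is not None:
        payload["reference_image"] = reference_image
    if duration is not None:
        payload["duration"] = duration
    if seed is not None:
        payload["seed"] = seed
    return payload
```

The resulting dict would typically be passed as the `input` argument to a client call such as `replicate.run(...)`; the exact model identifier depends on where the model is deployed, so it is not shown here.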
What you can edit
- Backgrounds — swap indoor to outdoor, change scenery, add weather effects
- Lighting — shift from golden hour to blue hour, add dramatic lighting
- Colors and styles — change color palettes, apply artistic styles
- Clothing and appearance — modify outfits, accessories, or visual details
- Scene elements — add or remove objects, adjust the environment
Tips for good results
- Be specific. “Change the jacket to a leather vest” works better than “change the outfit.”
- Keep edits focused. One clear change per instruction produces cleaner results than trying to change everything at once.
- Use short clips. 2–5 second clips tend to produce the cleanest edits.
- Use reference images when you have a specific target look in mind — they give the model a concrete visual anchor.
Limitations
- Complex spatial rearrangements (moving objects to different positions) may not work well
- Detailed facial feature changes can be inconsistent
- Physics-based changes (like changing the direction of gravity) aren’t supported
- Input videos longer than 10 seconds need to be trimmed first
Try it out on the Replicate playground.