wan-video/wan-2.7-t2v

Generate videos with audio from text prompts using Alibaba's Wan 2.7 model. 1080p, up to 15 seconds, with audio synchronization.

33 runs

Wan 2.7 text-to-video

Wan 2.7 is a text-to-video model from Alibaba’s Wan family. Describe a scene in natural language and it generates a video with coherent motion, lighting, and synchronized audio.

The model is built on a 27-billion-parameter Mixture-of-Experts architecture and generates video at up to 1080p, with durations from 2 to 15 seconds. It auto-generates matching audio (sound effects, ambient noise), or you can provide your own audio file for voice or music synchronization.

Inputs

  • prompt — Text description of the video to generate (required)
  • negative_prompt — Describes content that should not appear in the video
  • audio — Optional audio file (wav/mp3, 3–30s, ≤15 MB) for voice or music synchronization. If omitted, the model generates matching audio automatically.
  • resolution — 720p or 1080p (default: 1080p)
  • aspect_ratio — 16:9, 9:16, 1:1, 4:3, or 3:4 (default: 16:9)
  • duration — Length in seconds, 2–15 (default: 5)
  • enable_prompt_expansion — Automatically expand short prompts for better results. Improves quality but adds latency (default: true)
  • seed — Random seed for reproducible results
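The parameters above can be collected into a single input payload and sanity-checked client-side before submission. The sketch below is a hypothetical helper (not part of any official client); the field names and ranges come from the input list above, and the defaults match the documented ones.

```python
# Hypothetical helper: build an input dict for the model and reject
# values outside the documented ranges before sending a request.

VALID_RESOLUTIONS = {"720p", "1080p"}
VALID_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}

def build_input(prompt, negative_prompt=None, resolution="1080p",
                aspect_ratio="16:9", duration=5,
                enable_prompt_expansion=True, seed=None):
    """Validate parameters against the documented constraints and
    return the input payload as a dict."""
    if not prompt:
        raise ValueError("prompt is required")
    if resolution not in VALID_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(VALID_RESOLUTIONS)}")
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(VALID_ASPECT_RATIOS)}")
    if not 2 <= duration <= 15:
        raise ValueError("duration must be between 2 and 15 seconds")
    payload = {
        "prompt": prompt,
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "enable_prompt_expansion": enable_prompt_expansion,
    }
    # Optional fields are included only when explicitly set.
    if negative_prompt is not None:
        payload["negative_prompt"] = negative_prompt
    if seed is not None:
        payload["seed"] = seed
    return payload
```

The resulting dict can be passed as the `input` of whatever client you use to call the model.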

Tips

  • Be descriptive. Include details about the scene, lighting, camera movement, and action. “A golden retriever running through autumn leaves in a park, camera tracking from the side, warm afternoon light” works much better than “a dog in a park.”
  • Keep durations short. 2–5 second clips tend to produce the most coherent motion and scene consistency.
  • Use negative prompts to reduce common artifacts — try “blurry, distorted, low quality, static.”
  • Enable prompt expansion for short prompts. It fills in visual details that improve generation quality.
  • Pick the right aspect ratio for your use case — 9:16 for vertical/mobile content, 16:9 for widescreen, 1:1 for social.
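A simple way to follow the first tip consistently is to assemble prompts from named parts. This is a hypothetical sketch (`compose_prompt` is not part of any official tooling); it just joins the scene, action, camera, and lighting details the tips recommend into one descriptive string.

```python
# Hypothetical sketch: compose a descriptive prompt from the elements
# the tips recommend (scene, action, camera movement, lighting).

def compose_prompt(scene, action=None, camera=None, lighting=None):
    """Join the supplied details into a comma-separated prompt,
    skipping any that were not provided."""
    parts = [scene] + [p for p in (action, camera, lighting) if p]
    return ", ".join(parts)

prompt = compose_prompt(
    scene="A golden retriever running through autumn leaves in a park",
    camera="camera tracking from the side",
    lighting="warm afternoon light",
)
# Pair with a negative prompt targeting common artifacts.
negative_prompt = "blurry, distorted, low quality, static"
```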

Limitations

  • Complex multi-character scenes with specific interactions can be inconsistent.
  • Text rendering within generated videos is unreliable.
  • Longer durations (10+ seconds) may show motion degradation or scene drift.
  • Precise spatial relationships (“object A is to the left of object B”) are not always followed exactly.