minimax/hailuo-2.3

A high-fidelity video generation model optimized for realistic human motion, cinematic VFX, expressive characters, and strong prompt and style adherence across both text-to-video and image-to-video workflows.

MiniMax-Hailuo-2.3

MiniMax-Hailuo-2.3 is a video generation model family designed for realistic motion, improved visual consistency, and high-fidelity stylization. The family covers both text-to-video and image-to-video workflows and comes in two variants: a standard model with broader capabilities, and a fast, image-to-video-only variant optimized for reduced latency.

Model Variants

MiniMax-Hailuo-2.3

The core model supports both text and image as input. It is intended for cinematic workflows, realistic animation, and high-detail scenes.

  • Input: Text and Image
  • Resolution: 768p and 1080p
    1080p videos are limited to a 6-second duration
  • Duration Options: 6 seconds or 10 seconds
  • Aspect Ratio:
      • Image-to-Video: follows the source image
      • Text-to-Video: defaults to 16:9
  • Last Frame Handling: not supported

This model provides the highest visual quality and the broadest feature coverage within the family.
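
The constraints listed above interact: 10-second clips are only possible at 768p, because 1080p is capped at 6 seconds. A minimal sketch of a pre-flight check for those rules follows. The `VideoRequest` class, its field names, and both helper functions are hypothetical illustrations, not part of any official MiniMax or hosting-provider API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Values taken from the capability list above.
ALLOWED_RESOLUTIONS = {"768p", "1080p"}
ALLOWED_DURATIONS = {6, 10}  # seconds


@dataclass
class VideoRequest:
    """Hypothetical request shape; field names are assumptions."""
    prompt: str
    resolution: str = "768p"
    duration: int = 6
    first_frame_image: Optional[str] = None  # path or URL of a source image


def validate_request(req: VideoRequest) -> List[str]:
    """Return a list of human-readable problems; empty means the request is valid."""
    errors = []
    if req.resolution not in ALLOWED_RESOLUTIONS:
        errors.append(f"unsupported resolution: {req.resolution}")
    if req.duration not in ALLOWED_DURATIONS:
        errors.append(f"unsupported duration: {req.duration}s")
    # 1080p output is limited to 6-second clips.
    if req.resolution == "1080p" and req.duration != 6:
        errors.append("1080p videos are limited to 6 seconds")
    return errors


def effective_aspect_ratio(req: VideoRequest) -> str:
    """Image-to-video follows the source image; text-to-video defaults to 16:9."""
    return "source image" if req.first_frame_image else "16:9"
```

Under these assumptions, `validate_request(VideoRequest(prompt="a city at dusk", resolution="1080p", duration=10))` reports the 1080p duration limit, while the same request at 768p passes.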

MiniMax-Hailuo-2.3-Fast

The fast variant is optimized for turnaround time and computational efficiency.

  • Input: Image only
  • Resolution: 768p and 1080p
    1080p videos are limited to a 6-second duration
  • Duration Options: 6 seconds or 10 seconds
  • Aspect Ratio:
      • Image-to-Video: follows the source image
  • Last Frame Handling: not supported

This version offers quicker feedback during iteration or prototyping cycles.
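
Because the fast variant accepts image input only, the choice between the two models depends on whether a source image is available and whether latency matters more than feature coverage. One way this decision could be sketched is shown below. The `pick_variant` function and its parameters are hypothetical; only the two model names come from this page.

```python
STANDARD = "MiniMax-Hailuo-2.3"
FAST = "MiniMax-Hailuo-2.3-Fast"


def pick_variant(has_image: bool, prefer_speed: bool) -> str:
    """Choose a model variant from the two described on this page.

    has_image:    True when a source image is supplied (image-to-video).
    prefer_speed: True when quick turnaround matters more than feature coverage.
    """
    if not has_image:
        # Text-to-video is only available on the standard model.
        return STANDARD
    return FAST if prefer_speed else STANDARD
```

For example, a text-only prompt resolves to the standard model even when `prefer_speed` is set, since the fast variant cannot take text-only input.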