zsxkib / hunyuan-video-lora

Hunyuan-Video LoRA Explorer + Trainer

  • Public
  • 22.1K runs
  • H100
  • GitHub
  • Weights
  • Paper
  • License

Input

string

The text prompt describing your video scene.

Default: ""

string

A URL pointing to your LoRA .safetensors file or a Hugging Face repo (e.g. 'user/repo' - uses the first .safetensors file).

Default: ""

number
(minimum: -10, maximum: 10)

Scale/strength for your LoRA.

Default: 1

string

Diffusion scheduler (sampling algorithm) used to generate the video frames.

Default: "DPMSolverMultistepScheduler"

integer
(minimum: 1, maximum: 150)

Number of diffusion steps.

Default: 50

integer
(minimum: 64, maximum: 1536)

Width for the generated video.

Default: 640

integer
(minimum: 64, maximum: 1024)

Height for the generated video.

Default: 360

number
(minimum: 0, maximum: 2)

Strength of the video enhancement effect.

Default: 0.3

boolean

Apply enhancement to individual frames.

Default: true

boolean

Apply enhancement across frame pairs.

Default: true

number
(minimum: 0, maximum: 1)

When to start enhancement in the video. Must be less than enhance_end.

Default: 0

number
(minimum: 0, maximum: 1)

When to end enhancement in the video. Must be greater than enhance_start.

Default: 1

integer

Set a seed for reproducibility. Random by default.

number
(minimum: 0, maximum: 30)

Guidance scale: how strongly generation follows your text prompt (higher values stick closer to the prompt).

Default: 6

integer
(minimum: 0, maximum: 20)

Video continuity factor (flow).

Default: 9

integer
(minimum: 1, maximum: 1440)

How many frames (duration) in the resulting video.

Default: 33

number
(minimum: 0, maximum: 2)

Controls how strongly noise is applied each step.

Default: 1

boolean

Whether to force model layers to be offloaded to the CPU to save GPU memory.

Default: true

integer
(minimum: 1, maximum: 60)

Video frame rate.

Default: 16

integer
(minimum: 0, maximum: 51)

CRF (quality) for H264 encoding. Lower values = higher quality.

Default: 19

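For reference, here is a minimal sketch of calling this model with the Replicate Python client. The input key names below (prompt, width, num_frames, and so on) are inferred from the parameter descriptions above and may not match the actual schema, so check the model's API tab for the exact names.

import replicate

# Minimal sketch using the Replicate Python client (pip install replicate).
# NOTE: the input keys are guesses based on the descriptions above; the
# authoritative names and types are listed on the model's API tab. You may
# also need to pin a version, e.g. "zsxkib/hunyuan-video-lora:<version>".
output = replicate.run(
    "zsxkib/hunyuan-video-lora",
    input={
        "prompt": "a corgi surfing a wave at sunset, cinematic lighting",
        "width": 640,           # 64-1536
        "height": 360,          # 64-1024
        "num_frames": 33,       # assumed name for the frame-count input
        "frame_rate": 16,       # assumed name for the fps input
        "steps": 50,            # assumed name for the diffusion-steps input
        "guidance_scale": 6,    # influence of the text prompt, 0-30
        "seed": 42,             # omit for a random seed
    },
)
print(output)  # URL(s) of the generated video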

Run time and cost

This model costs approximately $0.11 to run on Replicate, or 9 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia H100 GPU hardware. Predictions typically complete within 69 seconds. The predict time for this model varies significantly based on the inputs.
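
If you run the published Docker image locally (see the "Run with Docker" instructions on the model page), the container serves the standard Cog HTTP API. A rough sketch of sending a prediction to it, assuming it is listening on port 5000:

import requests

# Sketch only: assumes the model's Docker image is already running locally
# and serving Cog's HTTP API on port 5000. Input keys are guesses, as above.
resp = requests.post(
    "http://localhost:5000/predictions",
    json={"input": {"prompt": "a timelapse of clouds rolling over mountains"}},
    timeout=600,  # video generation can take a while
)
resp.raise_for_status()
print(resp.json().get("output"))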

Readme

HunyuanVideo with LoRA Support 🎬

⚠️ This model supports LoRA training (click on the train tab)

Turn your text descriptions into videos using HunyuanVideo, now with support for custom LoRA files! LoRAs are like style plugins that help you customize how your videos look without changing the main model.

What’s This All About? ✨

This is a text-to-video AI model that lets you:

  • Create videos just by describing what you want to see

  • Use custom LoRA files to add your own style or characters

  • Train your own custom models that will be saved as destination models

  • Control various aspects of how your video turns out

Think of it like having an AI video creator that you can teach new styles!

How It Works 🎥

Under the hood, this uses HunyuanVideo - a powerful AI model that turns text into videos. We’ve added support for LoRA files, which are like special add-ons that can make the videos look more like what you want. For example, you could use a LoRA trained on anime art to make your videos look more animated!

The cool part is that even if your LoRA was only trained on still images, it can still create smooth-moving videos. You can also train your own models using your dataset, and once training is complete, it’ll automatically create a destination model ready for use. Pretty neat, right?
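
For example, pointing the model at a LoRA hosted on Hugging Face might look roughly like the sketch below. The key names lora_url and lora_strength are assumptions based on the input descriptions above, and the repo name is a placeholder.

import replicate

# Sketch of using a custom LoRA; key names are assumptions, and
# "some-user/some-hunyuan-lora" is a placeholder Hugging Face repo.
output = replicate.run(
    "zsxkib/hunyuan-video-lora",
    input={
        "prompt": "a portrait in the LoRA's art style, soft window light",
        "lora_url": "some-user/some-hunyuan-lora",  # repo: first .safetensors file is used
        "lora_strength": 1.0,                       # -10 to 10; higher = stronger effect
    },
)
print(output)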

What You Can Control 🎮

When creating your video, you can adjust things like:

  • Your text description of what you want to see

  • Your custom LoRA file (must be .safetensors format)

  • How strongly your LoRA affects the final video

  • Video size (width and height)

  • How many frames you want

  • Video speed (frames per second)

  • Video quality settings

  • Training parameters when creating your own models

Current Limits ⚠️

Since this is a work in progress, there are some limitations:

  • Videos can’t be bigger than 1536x1024

  • Maximum length is 300 frames

  • You need to use .safetensors format for LoRA files

  • Bigger videos take longer to make

  • Might need a beefy computer for larger videos

Coming Soon! 🚀

I’m (zsxkib) working on adding LoRA training directly to Replicate! This means you’ll be able to:

  1. Train your own LoRAs right here on Replicate

  2. Use them immediately for video generation

  3. Share them with others

  4. Get an automatically created destination model after training completes

Stay tuned for updates!
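
When training goes live, kicking off a run from the Python client would look roughly like the sketch below. Everything in it is an assumption: the version id, the training input names, and the destination model are placeholders you would replace with values from the train tab.

import replicate

# Hypothetical sketch of starting a LoRA training run. The version id,
# input names, and destination are placeholders, not the real schema.
training = replicate.trainings.create(
    version="zsxkib/hunyuan-video-lora:<version-id>",  # copy from the train tab
    input={
        "input_images": "https://example.com/my-style-dataset.zip",  # hypothetical key
        "trigger_word": "MYSTYLE",                                    # hypothetical key
    },
    destination="your-username/your-hunyuan-lora",  # model that will hold the trained LoRA
)
print(training.id, training.status)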

Credits and Thanks 📚

This builds on the amazing work by Tencent’s HunyuanVideo team:

@misc{kong2024hunyuanvideo,
      title={HunyuanVideo: A Systematic Framework For Large Video Generative Models}, 
      author={Weijie Kong and others},
      year={2024},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Special thanks to Jukka Seppänen (@Kijaidesign) for creating the fantastic ComfyUI implementation that makes this all possible. His ComfyUI nodes are the backbone of this project!


Follow me on Twitter/X @zsakib_ for updates on LoRA training and other cool features!