zsxkib / hunyuan-video-lora

Hunyuan-Video LoRA Explorer + Trainer

  • Public
  • 22.1K runs
  • H100
  • GitHub
  • Weights
  • Paper
  • License

Input

string

The text prompt describing your video scene.

Default: ""

string

A URL pointing to your LoRA .safetensors file or a Hugging Face repo (e.g. 'user/repo' - uses the first .safetensors file).

Default: ""

number
(minimum: -10, maximum: 10)

Scale/strength for your LoRA.

Default: 1

string

Diffusion scheduler (sampling algorithm) used to generate the video frames.

Default: "DPMSolverMultistepScheduler"

integer
(minimum: 1, maximum: 150)

Number of diffusion steps.

Default: 50

integer
(minimum: 64, maximum: 1536)

Width for the generated video.

Default: 640

integer
(minimum: 64, maximum: 1024)

Height for the generated video.

Default: 360

number
(minimum: 0, maximum: 2)

Strength of the video enhancement effect.

Default: 0.3

boolean

Apply enhancement to individual frames.

Default: true

boolean

Apply enhancement across frame pairs.

Default: true

number
(minimum: 0, maximum: 1)

When to start enhancement in the video. Must be less than enhance_end.

Default: 0

number
(minimum: 0, maximum: 1)

When to end enhancement in the video. Must be greater than enhance_start.

Default: 1

integer

Set a seed for reproducibility. Random by default.

number
(minimum: 0, maximum: 30)

Guidance scale: how strongly generation follows your text prompt (higher values stick closer to the prompt).

Default: 6

integer
(minimum: 0, maximum: 20)

Video continuity factor (flow).

Default: 9

integer
(minimum: 1, maximum: 1440)

How many frames (duration) in the resulting video.

Default: 33

number
(minimum: 0, maximum: 2)

Controls how strongly noise is applied each step.

Default: 1

boolean

Whether to force model layers to be offloaded to the CPU to save GPU memory.

Default: true

integer
(minimum: 1, maximum: 60)

Video frame rate.

Default: 16

integer
(minimum: 0, maximum: 51)

CRF (quality) for H264 encoding. Lower values = higher quality.

Default: 19

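For reference, here is a minimal sketch of calling this model with the Replicate Python client. The input key names below (prompt, width, num_frames, and so on) are inferred from the parameter descriptions above and may not match the actual schema, so check the model's API tab for the exact names.

import replicate

# Minimal sketch using the Replicate Python client (pip install replicate).
# NOTE: the input keys are guesses based on the descriptions above; the
# authoritative names and types are listed on the model's API tab. You may
# also need to pin a version, e.g. "zsxkib/hunyuan-video-lora:<version>".
output = replicate.run(
    "zsxkib/hunyuan-video-lora",
    input={
        "prompt": "a corgi surfing a wave at sunset, cinematic lighting",
        "width": 640,           # 64-1536
        "height": 360,          # 64-1024
        "num_frames": 33,       # assumed name for the frame-count input
        "frame_rate": 16,       # assumed name for the fps input
        "steps": 50,            # assumed name for the diffusion-steps input
        "guidance_scale": 6,    # influence of the text prompt, 0-30
        "seed": 42,             # omit for a random seed
    },
)
print(output)  # URL(s) of the generated video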

Run time and cost

This model costs approximately $0.11 to run on Replicate, or 9 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia H100 GPU hardware. Predictions typically complete within 69 seconds. The predict time for this model varies significantly based on the inputs.
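
If you run the published Docker image locally (see the "Run with Docker" instructions on the model page), the container serves the standard Cog HTTP API. A rough sketch of sending a prediction to it, assuming it is listening on port 5000:

import requests

# Sketch only: assumes the model's Docker image is already running locally
# and serving Cog's HTTP API on port 5000. Input keys are guesses, as above.
resp = requests.post(
    "http://localhost:5000/predictions",
    json={"input": {"prompt": "a timelapse of clouds rolling over mountains"}},
    timeout=600,  # video generation can take a while
)
resp.raise_for_status()
print(resp.json().get("output"))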

Readme

HunyuanVideo with LoRA Support 🎬

⚠️ This model supports LoRA training (click on the train tab)

Turn your text descriptions into videos using HunyuanVideo, now with support for custom LoRA files! LoRAs are like style plugins that help you customize how your videos look without changing the main model.

What’s This All About? ✨

This is a text-to-video AI model that lets you:

  • Create videos just by describing what you want to see

  • Use custom LoRA files to add your own style or characters

  • Train your own custom models that will be saved as destination models

  • Control various aspects of how your video turns out

Think of it like having an AI video creator that you can teach new styles!

How It Works 🎥

Under the hood, this uses HunyuanVideo - a powerful AI model that turns text into videos. We’ve added support for LoRA files, which are like special add-ons that can make the videos look more like what you want. For example, you could use a LoRA trained on anime art to make your videos look more animated!

The cool part is that even if your LoRA was only trained on still images, it can still create smooth-moving videos. You can also train your own models using your dataset, and once training is complete, it’ll automatically create a destination model ready for use. Pretty neat, right?
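
For example, pointing the model at a LoRA hosted on Hugging Face might look roughly like the sketch below. The key names lora_url and lora_strength are assumptions based on the input descriptions above, and the repo name is a placeholder.

import replicate

# Sketch of using a custom LoRA; key names are assumptions, and
# "some-user/some-hunyuan-lora" is a placeholder Hugging Face repo.
output = replicate.run(
    "zsxkib/hunyuan-video-lora",
    input={
        "prompt": "a portrait in the LoRA's art style, soft window light",
        "lora_url": "some-user/some-hunyuan-lora",  # repo: first .safetensors file is used
        "lora_strength": 1.0,                       # -10 to 10; higher = stronger effect
    },
)
print(output)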

What You Can Control 🎮

When creating your video, you can adjust things like:

  • Your text description of what you want to see

  • Your custom LoRA file (must be .safetensors format)

  • How strongly your LoRA affects the final video

  • Video size (width and height)

  • How many frames you want

  • Video speed (frames per second)

  • Video quality settings

  • Training parameters when creating your own models

Current Limits ⚠️

Since this is a work in progress, there are some limitations:

  • Videos can’t be bigger than 1536x1024

  • Maximum length is 300 frames

  • You need to use .safetensors format for LoRA files

  • Bigger videos take longer to make

  • Might need a beefy computer for larger videos

Coming Soon! 🚀

I’m (zsxkib) working on adding LoRA training directly to Replicate! This means you’ll be able to:

  1. Train your own LoRAs right here on Replicate

  2. Use them immediately for video generation

  3. Share them with others

  4. Get an automatically created destination model after training completes

Stay tuned for updates!
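
When training goes live, kicking off a run from the Python client would look roughly like the sketch below. Everything in it is an assumption: the version id, the training input names, and the destination model are placeholders you would replace with values from the train tab.

import replicate

# Hypothetical sketch of starting a LoRA training run. The version id,
# input names, and destination are placeholders, not the real schema.
training = replicate.trainings.create(
    version="zsxkib/hunyuan-video-lora:<version-id>",  # copy from the train tab
    input={
        "input_images": "https://example.com/my-style-dataset.zip",  # hypothetical key
        "trigger_word": "MYSTYLE",                                    # hypothetical key
    },
    destination="your-username/your-hunyuan-lora",  # model that will hold the trained LoRA
)
print(training.id, training.status)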

Credits and Thanks 📚

This builds on the amazing work by Tencent’s HunyuanVideo team:

@misc{kong2024hunyuanvideo,
      title={HunyuanVideo: A Systematic Framework For Large Video Generative Models}, 
      author={Weijie Kong and others},
      year={2024},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Special thanks to Jukka Seppänen (@Kijaidesign) for creating the fantastic ComfyUI implementation that makes this all possible. His ComfyUI nodes are the backbone of this project!


Follow me on Twitter/X @zsakib_ for updates on LoRA training and other cool features!