zsxkib / wan-lora-trainer

📽️ Fine-tune the Wan2.1 video models (both 14B and 1.3B) on your own video datasets 🎥

  • Public
  • 262 runs
  • Commercial use
  • GitHub
  • Weights
  • License

Create training

Trainings for this model run on Nvidia H100 GPU hardware, which costs $0.001525 per second. Upon creation, you will be redirected to the training detail page where you can monitor your training's progress, and eventually download the weights and run the trained model.
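The per-second rate above makes it easy to estimate cost before launching. A minimal sketch, assuming an example run length (the one-hour figure below is illustrative, not a measured training time):

```python
# Hypothetical cost estimate for a training run on H100 hardware.
# The $0.001525/second rate comes from the page above; the run
# duration is an assumed example.
RATE_PER_SECOND = 0.001525  # USD per second, Nvidia H100

def estimate_cost(seconds: float) -> float:
    """Return the estimated hardware cost in USD for a run of the given length."""
    return round(seconds * RATE_PER_SECOND, 2)

# A one-hour run:
print(estimate_cost(3600))  # 5.49
```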

Note: versions of this model with fast booting use the hardware set by the base model they were trained from.

Learn more about training

If you haven't yet trained a model on Replicate, Replicate's training guides are a good place to start.

*string

Select a model on Replicate that will be the destination for the trained version. If the model does not exist, select the "Create model" option and a field will appear to enter the name of the new model. We'll create the model for you when you create the training.
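The form fields on this page map to a programmatic call with the Replicate Python client. The sketch below is hedged: the version ID is a placeholder, the destination model name is an example, and every key in `training_input` other than `max_training_steps` is an assumed field name for illustration, not the trainer's confirmed schema.

```python
# Sketch of creating a training run with the Replicate Python client.
# Field names other than max_training_steps are assumptions; check the
# model page for the current schema and trainer version ID.
training_input = {
    "input_videos": "https://example.com/my-dataset.zip",  # assumed field name
    "trigger_word": "TOK",
    "max_training_steps": 1000,
    "learning_rate": 2e-5,
    "lora_rank": 32,  # assumed field name
}

SUBMIT = False  # flip to True to actually create the training
if SUBMIT:
    import replicate
    training = replicate.trainings.create(
        version="zsxkib/wan-lora-trainer:<version-id>",  # placeholder version
        input=training_input,
        destination="your-username/my-wan-lora",  # example destination model
    )
    print(training.status)
```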

*file

A zip file containing video and caption data for training. The zip should contain at least one video file (e.g., mp4, mov) and optionally caption files (.txt).
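One way to assemble such a zip, sketched with Python's standard `zipfile` module. The convention assumed here (and common for LoRA trainers) is that each caption `.txt` shares its video's file stem; the file names are illustrative placeholders:

```python
# Minimal sketch of assembling a training zip: each video may have a
# matching .txt caption file with the same stem. File names below are
# illustrative stand-ins, not real training data.
import pathlib
import tempfile
import zipfile

def build_dataset_zip(src_dir: pathlib.Path, zip_path: pathlib.Path) -> list[str]:
    """Zip every video and caption file in src_dir; return the archived names."""
    names = []
    with zipfile.ZipFile(zip_path, "w") as zf:
        for f in sorted(src_dir.iterdir()):
            if f.suffix.lower() in {".mp4", ".mov", ".txt"}:
                zf.write(f, arcname=f.name)
                names.append(f.name)
    return names

# Demo with placeholder files:
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "clip01.mp4").write_bytes(b"")  # stand-in for a real video
(tmp / "clip01.txt").write_text("a video of TOK riding a bike")
names = build_dataset_zip(tmp, tmp / "dataset.zip")
print(names)  # ['clip01.mp4', 'clip01.txt']
```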

string

The trigger word to be associated with all videos during training. This word will help activate the LoRA when used in prompts.

Default: "TOK"

boolean

Automatically caption videos that don't have matching caption files, using QWEN-VL.

Default: true

string

Optional: Text you want to appear at the beginning of all your generated captions; for example, 'a video of TOK, '. You can include your trigger word in the prefix.

Default: ""

string

Optional: Text you want to appear at the end of all your generated captions; for example, ' in the style of TOK'. You can include your trigger word in the suffix.

Default: ""
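The prefix and suffix simply wrap each generated caption. An illustrative helper (not part of the trainer itself) showing the composition:

```python
# Illustrative only: shows how a caption prefix and suffix wrap an
# auto-generated caption. The trainer performs this internally.
def compose_caption(caption: str, prefix: str = "", suffix: str = "") -> str:
    """Concatenate prefix, caption, and suffix into the final caption."""
    return f"{prefix}{caption}{suffix}"

caption = compose_caption(
    "a person walking on a beach",
    prefix="a video of TOK, ",
    suffix=" in the style of TOK",
)
print(caption)  # a video of TOK, a person walking on a beach in the style of TOK
```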

integer
(minimum: 10, maximum: 20000)

Total number of training steps, including warmup. For example, if max_training_steps=1000 and warmup_steps_budget=100, then you have 1000 steps total, with the first 100 for warmup.

Default: 1000

number
(minimum: 0.00001, maximum: 0.01)

Learning rate for training. Higher values may lead to faster convergence but potential instability.

Default: 0.00002

integer
(minimum: 16, maximum: 256)

LoRA rank for training. Higher ranks can capture more complex features but require more training time.

Default: 32

integer
(minimum: -1, maximum: 2000)

If not provided or set to -1, defaults to 10% of max_training_steps. These steps ramp from a lower LR to the configured LR. They are included within max_training_steps, not added on top.

Default: -1
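The documented warmup rule can be sketched as a small resolver: `-1` (the default) becomes 10% of `max_training_steps`, and the warmup steps count toward the total rather than being added on top:

```python
# Sketch of the documented warmup rule: -1 resolves to 10% of
# max_training_steps; warmup steps are included in the total.
def resolve_warmup(max_training_steps: int, warmup_steps_budget: int = -1) -> int:
    """Return the effective number of warmup steps."""
    if warmup_steps_budget == -1:
        return max_training_steps // 10
    return warmup_steps_budget

print(resolve_warmup(1000))       # 100 warmup steps, 900 remaining
print(resolve_warmup(1000, 250))  # explicit budget is used as-is
```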

number
(minimum: 0, maximum: 0.1)

Weight decay for regularization. Controls overfitting: lower values allow the model to memorize more detail.

Default: 0.0001

integer

Random seed for training reproducibility. Use -1 for a random seed.

Default: -1

string

Hugging Face repository ID, if you'd like to upload the trained LoRA to Hugging Face. For example, username/wan-lora. If the given repo does not exist, a new public repo will be created.

secret

A secret has its value redacted after being sent to the model.

Hugging Face token, if you'd like to upload the trained LoRA to Hugging Face.

secret

A secret has its value redacted after being sent to the model.

Weights and Biases API key for experiment tracking. If provided, training progress will be logged to W&B.

string

Weights and Biases project name. A new project will be created if it doesn't exist.

Default: "wan_train_replicate"

string

Weights and Biases run name. If not provided, a random name will be generated.

string

Weights and Biases entity (username or organization). If not provided, will use your default entity.

string

Newline-separated list of prompts to use for sample generation. These will be logged to W&B with your trigger word.
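Parsing the newline-separated list is straightforward; blank lines are presumably ignored. A minimal sketch (the exact handling of blank lines is an assumption, and the prompts below are examples):

```python
# Illustrative parsing of the newline-separated sample prompt list.
# Skipping blank lines is an assumption about the trainer's behavior.
def parse_sample_prompts(raw: str) -> list[str]:
    """Split the raw textarea value into one prompt per non-empty line."""
    return [line.strip() for line in raw.splitlines() if line.strip()]

prompts = parse_sample_prompts("TOK dancing in the rain\n\nTOK surfing at sunset")
print(prompts)  # ['TOK dancing in the rain', 'TOK surfing at sunset']
```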

integer
(minimum: 1)

Step interval for logging sample output videos to W&B.

Default: 250

integer
(minimum: 1)

Step interval for saving checkpoints to W&B.

Default: 500