
nsfw-api/hunyuan-character-lora-trainer:b1af78df

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.

input_videos
string
A zip file containing videos and/or images, optionally with matching .txt caption files. Each caption file must share the base name of its media file: video.mp4 pairs with video.txt, and the same applies to images (.jpg, .jpeg, or .png).

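For reference, a dataset zip with this layout could be assembled with Python's zipfile module; the file names below are placeholders:

import zipfile

# Hypothetical dataset: each media file is paired with a same-named .txt caption.
files = [
    "dance.mp4", "dance.txt",        # video + its caption
    "portrait.jpg", "portrait.txt",  # image + its caption
]

with zipfile.ZipFile("training_data.zip", "w") as zf:
    for name in files:
        zf.write(name)
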
trigger_word
string (default: TOK)
The trigger word refers to the object, style, or concept you are training on. Pick a string that isn't a real word, like TOK, or something related to what's being trained, like STYLE3D. The trigger word you specify here will be associated with all videos during training; when you use your LoRA, include the trigger word in prompts to help activate it.

autocaption
boolean (default: true)
Automatically caption videos using Qwen-VL.

autocaption_prefix
string
Optional: text to prepend to every generated caption; for example, 'a video of TOK, '. You can include your trigger word in the prefix. Prefixes help set the right context for your captions.

autocaption_suffix
string
Optional: text to append to every generated caption; for example, ' in the style of TOK'. You can include your trigger word in the suffix. Suffixes help set the right concept for your captions.

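When both options are set, the final caption is presumably the prefix, the auto-generated text, and the suffix concatenated in order; a minimal illustration (the generated caption here is made up):

prefix = "a video of TOK, "
generated = "a person dancing in a park"  # produced by autocaptioning
suffix = " in the style of TOK"

print(prefix + generated + suffix)
# a video of TOK, a person dancing in a park in the style of TOK
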
epochs
integer (default: 16, min: 1, max: 2000)
Number of training epochs. Each epoch processes all your videos once. Note: if max_train_steps is set, training may end before completing all epochs.

max_train_steps
integer (default: -1, min: -1, max: 1000000)
Maximum number of training steps to perform. Each step processes one batch of frames. Set to -1 to train for the full number of epochs. If positive, training stops after this many steps even if all epochs aren't complete.

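Assuming one step per batch, as described above, you can estimate a run's length before launching it; the sample count below is hypothetical and depends on how many clips frame extraction yields:

import math

num_training_samples = 120  # hypothetical: clips after frame extraction
batch_size = 4
epochs = 16

steps_per_epoch = math.ceil(num_training_samples / batch_size)  # 30
total_steps = epochs * steps_per_epoch                          # 480
print(total_steps)  # set max_train_steps below this to stop early
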
rank
integer (default: 32, min: 1, max: 128)
LoRA rank for training. Higher ranks take longer to train but can capture more complex features. Caption quality is more important at higher ranks.

batch_size
integer (default: 4, min: 1, max: 8)
Batch size for training. Lower values use less memory but train more slowly.

learning_rate
number (default: 0.001, min: 0.00001, max: 1)
Learning rate for training. If you're new to training, you probably don't need to change this.

optimizer
enum (default: adamw8bit)
Optimizer type for training. If you're unsure, leave the default.

timestep_sampling
enum (default: sigmoid)
Controls how timesteps are sampled during training: 'sigmoid' (the default) concentrates samples in the middle of the diffusion process; 'uniform' samples evenly across all timesteps; 'sigma' samples based on the noise schedule; 'shift' uses shifted sampling with a discrete flow shift. If unsure, use 'sigmoid'.

consecutive_target_frames
enum (default: [1, 25, 45])
The lengths of the consecutive-frame clips to extract from each video.

frame_extraction_method
enum (default: head)
Method used to extract frames from videos during training.

frame_stride
integer (default: 10, min: 1, max: 100)
Frame stride for the 'slide' extraction method.

frame_sample
integer (default: 4, min: 1, max: 20)
Number of samples for the 'uniform' extraction method.

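A rough sketch of how the three extraction methods plausibly pick clips; the actual trainer's logic may differ, and all numbers here are illustrative:

def extract_clip_starts(num_frames, target_len, method="head",
                        frame_stride=10, frame_sample=4):
    """Illustrative only: start indices of clips of length target_len."""
    last_start = num_frames - target_len
    if last_start < 0:
        return []
    if method == "head":
        return [0]  # one clip from the start of the video
    if method == "slide":
        return list(range(0, last_start + 1, frame_stride))  # every frame_stride frames
    if method == "uniform":
        if frame_sample == 1:
            return [0]
        step = last_start / (frame_sample - 1)
        return [round(i * step) for i in range(frame_sample)]  # evenly spaced
    raise ValueError(f"unknown method: {method}")

print(extract_clip_starts(100, 25, "slide"))    # [0, 10, 20, 30, 40, 50, 60, 70]
print(extract_clip_starts(100, 25, "uniform"))  # [0, 25, 50, 75]
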
seed
integer (default: 0)
Random seed for training. Use a value <= 0 for a random seed.

hf_repo_id
string
Hugging Face repository ID, if you'd like to upload the trained LoRA to Hugging Face. For example, username/my-video-lora. If the given repo does not exist, a new public repo will be created.

hf_token
string
Hugging Face token, if you'd like to upload the trained LoRA to Hugging Face.
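
Putting the fields together, a training call via the Replicate Python client might look like the following; every input value is illustrative, and the Hugging Face fields are optional:

import replicate

output = replicate.run(
    "nsfw-api/hunyuan-character-lora-trainer:b1af78df",
    input={
        "input_videos": open("training_data.zip", "rb"),  # placeholder local file
        "trigger_word": "TOK",
        "autocaption": True,
        "autocaption_prefix": "a video of TOK, ",
        "epochs": 16,
        "rank": 32,
        "batch_size": 4,
        # Optional upload to Hugging Face (placeholder values):
        # "hf_repo_id": "username/my-video-lora",
        # "hf_token": "hf_...",
    },
)
print(output)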

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{
  "properties": {
    "weights": {
      "format": "uri",
      "title": "Weights",
      "type": "string"
    }
  },
  "required": ["weights"],
  "title": "Output",
  "type": "object"
}
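
Per this schema, a successful run returns an object whose weights field holds a URI pointing at the trained LoRA. A minimal sketch of saving it locally (the URI and file name are placeholders):

import urllib.request

output = {"weights": "https://example.com/lora-weights.safetensors"}  # placeholder URI
urllib.request.urlretrieve(output["weights"], "lora-weights.safetensors")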