zsxkib / create-video-dataset

Easily create video datasets with auto-captioning for Hunyuan-Video LoRA finetuning


Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.


Create Video Dataset

A tool to easily prepare video datasets with automatic captioning for AI training. This tool processes videos (from URLs or local files), generates high-quality captions using QWEN-VL, and packages everything into a training-ready format.

Features

  • πŸŽ₯ Process YouTube URLs or local video files
  • πŸ€– Automatic video captioning using QWEN-VL
  • ✍️ Support for custom captions
  • 🏷️ Configurable trigger words for training
  • πŸ“ Prefix/suffix support for caption formatting
  • πŸ—ƒοΈ Clean output in zip format

Input Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| `video_url` | YouTube/video URL to process | None |
| `video_file` | Local video file to process | None |
| `trigger_word` | Training trigger word (e.g., `TOK`, `STYLE3D`) | `"TOK"` |
| `autocaption` | Use AI to generate captions | `True` |
| `custom_caption` | Your custom caption (required if `autocaption=False`) | None |
| `autocaption_prefix` | Text to add before captions | None |
| `autocaption_suffix` | Text to add after captions | None |
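One plausible way the trigger word, prefix, and suffix combine into a final caption is sketched below. The exact concatenation the tool uses is not documented here, so this is only an illustrative guess.

```python
def format_caption(caption, trigger_word="TOK", prefix=None, suffix=None):
    # Hypothetical formatting: the tool's exact concatenation rules
    # are an assumption; this only illustrates the role of each field.
    parts = [prefix, f"{trigger_word}, {caption}", suffix]
    return " ".join(p for p in parts if p)

print(format_caption("a cat jumps off a ledge",
                     trigger_word="STYLE3D",
                     prefix="In the style of"))
# -> In the style of STYLE3D, a cat jumps off a ledge
```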

Output

The tool produces a zip file containing:

  • Processed video file
  • Caption files (.txt) for each video
  • Proper directory structure for training
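To inspect the resulting archive, Python's stdlib `zipfile` is enough. The member names below are illustrative stand-ins, not the tool's guaranteed layout:

```python
import io
import zipfile

# Build a stand-in archive mirroring the layout described above;
# the actual file names produced by the tool may differ.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("videos/clip_00.mp4", b"\x00")             # processed video
    zf.writestr("videos/clip_00.txt", "TOK, a cat jumps")  # matching caption

# List each caption file alongside its text, as you would for a real
# archive downloaded from a run of this model.
with zipfile.ZipFile(buf) as zf:
    captions = [n for n in zf.namelist() if n.endswith(".txt")]
    for name in captions:
        print(name, "->", zf.read(name).decode())
```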