zsxkib / create-video-dataset

Easily create video datasets with auto-captioning for Hunyuan-Video LoRA finetuning

  • Public
  • 523 runs
  • L40S
  • GitHub
  • License

Input

  • string - YouTube/video URL to process. Leave empty if uploading a file. Note: URL takes precedence if both URL and file are provided.
  • file - Video file to process. Leave empty if using URL. Ignored if URL is provided.
  • string - Scene detection method: 'content' (fast cuts), 'adaptive' (camera movement), or 'threshold' (fades). Default: "content"
  • number - Minimum scene length in seconds. Default: 1
  • number - Maximum scene length in seconds. Default: 10
  • integer - Number of scenes to extract (0 = all detected scenes). Default: 4
  • number - Target frame rate (e.g. 24, 30). Set to -1 to keep the original fps. Default: 24
  • number - Start time in seconds for video processing. Default: 0
  • number - End time in seconds for video processing. Set to 0 to process until the end. Default: 0
  • boolean - Automatically skip the first 10 seconds (typical intro). Default: false
  • boolean - Generate scene previews without creating the full dataset. Default: false
  • string - Video quality preset: 'fast' (lower quality, smaller files), 'balanced', or 'high' (best quality, larger files). Default: "balanced"
  • boolean - Let the AI generate a caption for your video. If false, you must provide custom_caption. Default: true
  • string - Caption style: 'minimal' (short), 'detailed' (longer descriptions), or 'custom'. Default: "detailed"
  • string - Your custom caption. Required if caption_style is 'custom' or autocaption is false.
  • string - Trigger word to include in captions (e.g., TOK, STYLE3D). Added at the start of the caption. Default: "TOK"
  • string - Text to add BEFORE the caption. Example: 'a video of'
  • string - Text to add AFTER the caption. Example: 'in a cinematic style'
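The sketch below shows one way to invoke the model from the Replicate Python client. It is a minimal example, not the canonical usage: the input keys are taken from the README's parameter table (video_url, autocaption, trigger_word, autocaption_prefix, autocaption_suffix), the placeholder URL is hypothetical, and the shape of the returned output (assumed here to be a link to the generated dataset zip) may differ.

```python
# Minimal sketch using the Replicate Python client (pip install replicate).
# Input keys follow the README's parameter table; the output is assumed to be
# a reference (e.g. a URL) to the generated dataset zip.
import replicate

output = replicate.run(
    "zsxkib/create-video-dataset",
    input={
        "video_url": "https://example.com/clip.mp4",  # or pass "video_file" instead
        "autocaption": True,
        "trigger_word": "TOK",
        "autocaption_prefix": "a video of",
        "autocaption_suffix": "in a cinematic style",
    },
)
print(output)  # inspect the returned dataset artifact
```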

Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Create Video Dataset

A tool to easily prepare video datasets with automatic captioning for AI training. It processes videos from URLs or local files, generates captions using QWEN-VL, and packages everything into a training-ready format.

Features

  • 🎥 Process YouTube URLs or local video files
  • 🤖 Automatic video captioning using QWEN-VL
  • ✍️ Support for custom captions
  • 🏷️ Configurable trigger words for training
  • 📝 Prefix/suffix support for caption formatting
  • 🗃️ Clean output in zip format

Input Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| video_url | YouTube/video URL to process | None |
| video_file | Local video file to process | None |
| trigger_word | Training trigger word (e.g., TOK, STYLE3D) | "TOK" |
| autocaption | Use AI to generate captions | True |
| custom_caption | Your custom caption (required if autocaption=False) | None |
| autocaption_prefix | Text to add before captions | None |
| autocaption_suffix | Text to add after captions | None |
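The trigger word, prefix, and suffix are combined with the generated (or custom) caption into a single caption line per clip. The sketch below illustrates one plausible ordering inferred from the parameter descriptions (trigger word first, then prefix, caption, suffix); the helper name and the example caption are hypothetical, not the tool's actual implementation.

```python
# Hypothetical illustration of how a caption line might be assembled from
# trigger_word, autocaption_prefix, the caption, and autocaption_suffix.
def build_caption(caption: str,
                  trigger_word: str = "TOK",
                  prefix: str = "",
                  suffix: str = "") -> str:
    parts = [trigger_word, prefix, caption, suffix]
    return " ".join(p.strip() for p in parts if p.strip())

print(build_caption("a man walking through a neon-lit city at night",
                    prefix="a video of",
                    suffix="in a cinematic style"))
# -> "TOK a video of a man walking through a neon-lit city at night in a cinematic style"
```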

Output

The tool produces a zip file containing:

  • Processed video file
  • Caption files (.txt) for each video
  • Proper directory structure for training
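
Once downloaded, the archive can be inspected with the standard library, as in the sketch below. The file name "dataset.zip" and the pairing mentioned in the comment are illustrative assumptions; the exact layout depends on how many scenes were detected.

```python
# Unpack and list the dataset zip; names in the comment are illustrative.
import zipfile

with zipfile.ZipFile("dataset.zip") as zf:
    for name in zf.namelist():
        print(name)  # e.g. a scene .mp4 next to its matching .txt caption
    zf.extractall("dataset/")
```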