pipeline-examples/batch-image-captioner

Batch-caption images for LoRA training (or other uses)

batch-image-captioner

A simple tool that generates detailed captions for multiple images at once using Claude 3.5 Sonnet.

https://replicate.com/pipeline-examples/batch-image-captioner

Features

  • Process multiple images in a single batch via a ZIP file
  • Generate detailed, structured captions for each image
  • Add optional custom prefixes and suffixes to captions
  • Customize the system prompt to control caption style and content
  • Return all images with their captions in a convenient ZIP archive

Models

Under the hood, it uses Claude 3.5 Sonnet to caption each image.

How it works

The tool takes a ZIP file containing images, extracts them, and sends each image to Claude 3.5 Sonnet for captioning. By default, it asks Claude for a four-sentence caption describing the image's style, contents, and composition in language suitable for text-to-image prompting. Images are processed in parallel, and the results are packaged into a ZIP file containing the original images alongside a text file with each image's caption.
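The flow described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual source: `caption_zip`, `caption_fn`, `max_workers`, and the `IMAGE_EXTENSIONS` set are hypothetical names. The model call is passed in as a plain function so that no specific client API is assumed.

```python
import io
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import PurePosixPath
from typing import Callable

# Assumed set of image extensions; the real tool may accept others.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".gif"}


def caption_zip(
    zip_bytes: bytes,
    caption_fn: Callable[[bytes], str],  # hypothetical: wraps the captioning model call
    max_workers: int = 8,
) -> bytes:
    """Caption every image in a ZIP; return a new ZIP with images plus .txt captions."""
    # Read image entries into memory, skipping directories and non-image files.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        images = {
            name: archive.read(name)
            for name in archive.namelist()
            if PurePosixPath(name).suffix.lower() in IMAGE_EXTENSIONS
        }

    # Caption images concurrently; threads suit I/O-bound API requests.
    names = list(images)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        captions = list(pool.map(caption_fn, (images[n] for n in names)))

    # Write each original image next to a same-named .txt caption file.
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as result:
        for name, caption in zip(names, captions):
            result.writestr(name, images[name])
            result.writestr(str(PurePosixPath(name).with_suffix(".txt")), caption)
    return out.getvalue()
```

In this sketch, an image stored as `photos/cat.png` would come back with a sibling `photos/cat.txt` holding its caption. This naming matches the convention many LoRA trainers expect, though the exact layout of the real output archive is an assumption here.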

You can customize the captions by modifying the system prompt or adding prefixes/suffixes to all generated captions. This makes it especially useful for creating training data or generating prompts for text-to-image models.
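As a rough illustration of the prefix/suffix option, the helper below wraps a generated caption. The function name `finalize_caption` and the choice to join the parts with single spaces are assumptions. The tool's actual joining behavior is not documented here.

```python
def finalize_caption(caption: str, prefix: str = "", suffix: str = "") -> str:
    """Wrap a generated caption with an optional prefix and suffix.

    Assumption: parts are trimmed and joined with single spaces,
    and empty parts are dropped.
    """
    parts = [prefix.strip(), caption.strip(), suffix.strip()]
    return " ".join(part for part in parts if part)


# Example: prepend a (hypothetical) LoRA trigger word to every caption.
print(finalize_caption("A red barn in a snowy field.", prefix="TOK,"))
```

A prefix like this is a common way to attach a trigger word to every caption in a LoRA training set, while the system prompt remains the place to change what the caption itself describes.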