fofr/batch-image-captioning:d0adb15f – Run with an API on Replicate

You're looking at a specific version of this model.

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

Field	Type	Default value	Description
image_zip_archive	string		ZIP archive containing images to process
caption_prefix	string		Optional prefix for image captions
caption_suffix	string		Optional suffix for image captions
resize_images_for_captioning	boolean	True	Whether to resize images before captioning. Resizing makes captioning cheaper.
max_dimension	integer	1024	Maximum dimension (width or height) for resized images
model	string (enum)	gpt-4o-2024-08-06 (options: gpt-4o-2024-08-06, gpt-4o-mini, gpt-4o, gpt-4-turbo, claude-3-5-sonnet-20240620, claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-haiku-20240307, gemini-1.5-pro, gemini-1.5-flash)	AI model to use for captioning. Your OpenAI, Anthropic, or Google account will be charged for usage; see their pricing pages for details.
openai_api_key	string		API key for OpenAI
anthropic_api_key	string		API key for Anthropic
google_generativeai_api_key	string		API key for Google Generative AI
system_prompt	string	Write a four sentence caption for this image. In the first sentence describe the style and type (painting, photo, etc) of the image. Describe in the remaining sentences the contents and composition of the image. Only use language that would be used to prompt a text to image model. Do not include usage. Comma separate keywords rather than using "or". Precise composition is important. Avoid phrases like "conveys a sense of" and "capturing the", just use the terms themselves. Good examples are: "Photo of an alien woman with a glowing halo standing on top of a mountain, wearing a white robe and silver mask in the futuristic style with futuristic design, sky background, soft lighting, dynamic pose, a sense of future technology, a science fiction movie scene rendered in the Unreal Engine." "A scene from the cartoon series Masters of the Universe depicts Man-At-Arms wearing a gray helmet and gray armor with red gloves. He is holding an iron bar above his head while looking down on Orko, a pink blob character. Orko is sitting behind Man-At-Arms facing left on a chair. Both characters are standing near each other, with Orko inside a yellow chestplate over a blue shirt and black pants. The scene is drawn in the style of the Masters of the Universe cartoon series." "An emoji, digital illustration, playful, whimsical. A cartoon zombie character with green skin and tattered clothes reaches forward with two hands, they have green skin, messy hair, an open mouth and gaping teeth, one eye is half closed."	System prompt for image analysis
message_prompt	string	Caption this image please	Message prompt for image captioning
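
As a rough illustration of how these fields fit together, here is a minimal Python sketch that assembles an input payload. The helper names (`key_field_for`, `build_input`) are hypothetical. The sketch assumes that the API key field a model needs can be inferred from the model name's prefix, and that `image_zip_archive` is passed as a URL string; this page does not confirm either point.

```python
# Sketch only: these helpers are hypothetical and not part of the model itself.

# Model names copied from the "model" field's options in the table above.
MODEL_OPTIONS = (
    "gpt-4o-2024-08-06", "gpt-4o-mini", "gpt-4o", "gpt-4-turbo",
    "claude-3-5-sonnet-20240620", "claude-3-opus-20240229",
    "claude-3-sonnet-20240229", "claude-3-haiku-20240307",
    "gemini-1.5-pro", "gemini-1.5-flash",
)

# Assumption: the provider (and so the key field) follows from the name prefix.
KEY_FIELD_BY_PREFIX = {
    "gpt-": "openai_api_key",
    "claude-": "anthropic_api_key",
    "gemini-": "google_generativeai_api_key",
}

# Optional fields from the table that a caller may override.
OPTIONAL_FIELDS = {
    "caption_prefix", "caption_suffix", "resize_images_for_captioning",
    "max_dimension", "system_prompt", "message_prompt",
}


def key_field_for(model):
    """Return the name of the API key field assumed to match the model."""
    if model not in MODEL_OPTIONS:
        raise ValueError(f"unknown model: {model!r}")
    for prefix, field in KEY_FIELD_BY_PREFIX.items():
        if model.startswith(prefix):
            return field
    raise ValueError(f"no key field known for model: {model!r}")


def build_input(image_zip_archive, api_key, model="gpt-4o-2024-08-06", **overrides):
    """Build an input dict; fields left out fall back to the model's defaults."""
    unknown = set(overrides) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {
        "image_zip_archive": image_zip_archive,
        "model": model,
        key_field_for(model): api_key,
    }
    payload.update(overrides)
    return payload
```

A payload built this way could then be passed as the `input` argument of the Replicate Python client's `replicate.run`, together with the full version identifier of this model (only the short form `d0adb15f` appears on this page).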

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema

{"format": "uri", "title": "Output", "type": "string"}
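
Since the schema says only that the output is a single URI string, handling it mostly comes down to fetching that URI. The sketch below is one possible way to do that with the Python standard library. The function names are hypothetical, and nothing here assumes what the downloaded file contains, because this page does not say.

```python
import os
from pathlib import PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlretrieve


def output_filename(output_uri, default="output"):
    """Derive a local file name from the last path segment of the output URI."""
    if not isinstance(output_uri, str):
        raise TypeError("the output schema specifies a string")
    parsed = urlparse(output_uri)
    if not parsed.scheme:
        raise ValueError(f"not a URI: {output_uri!r}")
    # Fall back to a default name when the URI path has no final segment.
    return PurePosixPath(parsed.path).name or default


def download_output(output_uri, directory="."):
    """Download the output URI into the directory and return the local path."""
    target = os.path.join(directory, output_filename(output_uri))
    urlretrieve(output_uri, target)
    return target
```

For example, `download_output(result)` would save the file next to your script, where `result` is the string returned by a run of the model.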