lucataco / clip-interrogator

CLIP Interrogator (for faster inference)

  • Public
  • 122.2K runs
  • L40S

Input

image (file, required)

Input image

clip_model_name (string)

Choose ViT-L for Stable Diffusion 1, ViT-H for Stable Diffusion 2, or ViT-bigG for Stable Diffusion XL.

Default: "ViT-L-14/openai"

mode (string)

Prompt mode (best takes 10-20 seconds, fast takes 1-2 seconds).

Default: "best"
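A minimal sketch of calling this model through the Replicate Python client (pip install replicate, with REPLICATE_API_TOKEN set). The parameter names "clip_model_name" and "mode" are assumptions inferred from the fields above; check the model's API tab for the exact schema and version hash before use.

```python
def build_input(image, clip_model_name="ViT-L-14/openai", mode="best"):
    """Assemble the input payload for this model.

    `image` is an open binary file object; the other two parameter
    names are assumptions based on the fields shown above.
    """
    return {"image": image, "clip_model_name": clip_model_name, "mode": mode}


# Live call (requires network, an API token, and a local image):
# import replicate
# with open("photo.jpg", "rb") as f:
#     prompt = replicate.run("lucataco/clip-interrogator",
#                            input=build_input(f, mode="fast"))
# print(prompt)
```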

Output

painting of a turtle swimming in the ocean with a blue sky in the background, illustrative art, turtle, michael angelo inspired, world-bearing turtle, highly detailed illustration.”, 4k artwork, realistic illustration, highly detailed digital painting, vibrant digital painting, [ 4 k digital art, 4k art, hypperrealistic illustration, high detail illustration, vibrant realistic

Run time and cost

This model costs approximately $0.030 to run on Replicate, or 33 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 32 seconds. The predict time for this model varies significantly based on the inputs.

Readme

This is an attempt to replicate the model pharmapsychotic/clip-interrogator, adapted to run on an A40 GPU for faster inference times.

The CLIP Interrogator is a prompt-engineering tool that combines OpenAI's CLIP and Salesforce's BLIP to optimize text prompts to match a given image. Use the resulting prompts with text-to-image models like Stable Diffusion to create cool art.
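Since the model is open source, it can also run locally; a hedged sketch using the `clip-interrogator` pip package is below. The `Config`/`Interrogator` names follow that package's API, and the heavy imports are deferred because model weights download on first use.

```python
def interrogate_file(path, clip_model_name="ViT-L-14/openai", mode="best"):
    """Generate a prompt describing the image at `path`.

    Mirrors the model's inputs: mode "best" is slower but more
    thorough, "fast" trades quality for speed.
    """
    if mode not in ("best", "fast"):
        raise ValueError(f"mode must be 'best' or 'fast', got {mode!r}")
    # Deferred imports: pip install pillow clip-interrogator
    from PIL import Image
    from clip_interrogator import Config, Interrogator

    ci = Interrogator(Config(clip_model_name=clip_model_name))
    image = Image.open(path).convert("RGB")
    return ci.interrogate_fast(image) if mode == "fast" else ci.interrogate(image)


# Example (requires the package, its model weights, and a local image):
# print(interrogate_file("photo.jpg", mode="fast"))
```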

Based on the original cog: https://replicate.com/pharmapsychotic/clip-interrogator

Give me a follow if you like my work! @lucataco93