methexis-inc/img2prompt

Get an approximate text prompt, with style, matching an image. Optimized for Stable Diffusion (CLIP ViT-L/14).

Run time and cost

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 31 seconds. The prediction time for this model varies significantly based on the inputs.

Readme

Model description

Provides approximate text prompts that can be used with Stable Diffusion to re-create similar-looking versions of the image or painting. Try it by copying the text prompts into Stable Diffusion!

A slightly adapted version of the CLIP Interrogator notebook by @pharmapsychotic.

If this notebook is helpful to you, please consider buying @pharmapsychotic a coffee via ko-fi or following @AIMindFlow and @pharmapsychotic on Twitter for more cool AI stuff.

The CLIP Interrogator uses the OpenAI CLIP models to test a given image against a variety of artists, mediums, and styles to study how the different models see the content of the image. It also combines the results with a BLIP caption to suggest a text prompt for creating more images similar to the one given.
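
The ranking-and-combining idea described above can be illustrated with a minimal sketch. It is not the notebook's actual code: the real tool obtains embeddings from CLIP and the caption from BLIP, whereas here the vectors, the caption, the candidate phrases, and the function names (`cosine`, `top_match`, `build_prompt`) are all made-up assumptions, and the real prompt layout may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_match(image_vec, candidates):
    """Return the candidate phrase whose embedding is closest to the image."""
    return max(candidates, key=lambda phrase: cosine(image_vec, candidates[phrase]))


def build_prompt(caption, image_vec, categories):
    """Append the best phrase from each category to the caption."""
    parts = [caption]
    for candidates in categories.values():
        parts.append(top_match(image_vec, candidates))
    return ", ".join(parts)


# Toy 3-dimensional stand-ins for CLIP embeddings (assumed values).
image_vec = [0.9, 0.1, 0.3]
categories = {
    "medium": {
        "oil painting": [0.8, 0.2, 0.3],
        "photograph": [0.1, 0.9, 0.2],
    },
    "artist": {
        "by Vincent van Gogh": [0.9, 0.0, 0.4],
        "by Ansel Adams": [0.0, 1.0, 0.1],
    },
    "style": {
        "impressionism": [0.7, 0.1, 0.5],
        "minimalism": [0.2, 0.3, 0.9],
    },
}

# Stand-in for a BLIP-generated caption (assumed text).
caption = "a painting of a starry night over a village"
prompt = build_prompt(caption, image_vec, categories)
print(prompt)
```

In the real tool the candidate lists are much larger and the embeddings come from the CLIP image and text encoders, but the core step is the same: score each phrase against the image and keep the best matches.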