methexis-inc / img2prompt

Get an approximate text prompt, with style, matching an image. Optimized for Stable Diffusion (CLIP ViT-L/14).

  • Public
  • 2.6M runs

Run time and cost

This model costs approximately $0.0051 per run on Replicate (about 196 runs per $1), though this varies depending on your inputs. It is also open source, and you can run it on your own computer with Docker.
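
As a rough illustration of the pricing arithmetic above, the following sketch converts the quoted per-run cost into runs per dollar and a batch estimate. The function names are hypothetical, and the per-run figure is only the approximate average quoted here; actual cost varies with inputs.

```python
# Approximate per-run cost quoted on this page (USD); real runs vary.
COST_PER_RUN_USD = 0.0051


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in one dollar at the given cost."""
    return int(1 / cost_per_run)


def estimated_cost(n_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Rough total cost in USD for n_runs, rounded to the nearest cent."""
    return round(n_runs * cost_per_run, 2)


if __name__ == "__main__":
    print(runs_per_dollar())      # runs per $1 at the quoted rate
    print(estimated_cost(1000))   # rough cost of 1,000 runs
```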

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 23 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Model description

Provides approximate text prompts that can be used with Stable Diffusion to re-create similar-looking versions of the image or painting. Try it by copying the text prompts into Stable Diffusion!

A slightly adapted version of the CLIP Interrogator notebook by @pharmapsychotic.

If this notebook is helpful to you, please consider buying @pharmapsychotic a coffee via ko-fi or following @AIMindFlow and @pharmapsychotic on Twitter for more cool AI stuff.

The CLIP Interrogator uses the OpenAI CLIP models to test a given image against a variety of artists, mediums, and styles to study how the different models see the content of the image. It also combines the results with a BLIP caption to suggest a text prompt for creating more images similar to the one given.
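
The ranking-and-combining idea described above can be sketched in a few lines. This is a minimal illustration, not the notebook's actual code: the 3-dimensional vectors are made-up stand-ins for CLIP embeddings, the caption is a stand-in for BLIP output, and `cosine`, `top_matches`, and `build_prompt` are hypothetical helper names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_matches(image_vec, candidates, k=1):
    """Return the k labels whose embeddings are most similar to image_vec."""
    ranked = sorted(
        candidates,
        key=lambda label: cosine(image_vec, candidates[label]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(caption, image_vec, categories, k=1):
    """Append the best label(s) from each category to the caption."""
    parts = [caption]
    for candidates in categories.values():
        parts.extend(top_matches(image_vec, candidates, k))
    return ", ".join(parts)


# Toy data (assumption): in practice these would be CLIP image/text
# embeddings and a caption generated by BLIP.
image_vec = [0.9, 0.1, 0.3]
categories = {
    "medium": {"oil on canvas": [0.8, 0.2, 0.3], "3d render": [0.1, 0.9, 0.2]},
    "style": {"impressionism": [0.9, 0.0, 0.4], "cyberpunk": [0.0, 0.8, 0.6]},
}
caption = "a painting of a boat on a lake"

prompt = build_prompt(caption, image_vec, categories)
print(prompt)
```

With the toy vectors above, the image is closest to "oil on canvas" and "impressionism", so those labels are appended to the caption. A real pipeline would differ in how candidates are scored and how many terms are kept per category.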