joyanujoy / analog_diffusion

Some scrappy experiments 🫣

  • Public
  • 719 runs
  • A100 (80GB)
  • GitHub
  • License

Input

string

Input prompt. Use <1>, <2>, <3>, etc., to specify LoRA concepts

Default: "a photo of <1> riding a horse on mars"

string

Specify things you do not want to see in the output

Default: ""

integer

Width of output image. Maximum size is 1024x768 or 768x1024 because of memory limits

Default: 512

integer

Height of output image. Maximum size is 1024x768 or 768x1024 because of memory limits

Default: 512

integer
(minimum: 1, maximum: 4)

Number of images to output.

Default: 1

integer
(minimum: 1, maximum: 500)

Number of denoising steps

Default: 50

number
(minimum: 1, maximum: 20)

Scale for classifier-free guidance

Default: 7.5

file

(Img2Img) Initial image to generate variations of. If provided, Img2Img will be invoked.

boolean

Automatically remove the background from the Img2Img initial image above.

Default: true

boolean

Generate a prompt from the init image. The generated prompt overrides any manually entered text prompt.

Default: true

number

(Img2Img) Prompt strength when using an init image. 1.0 corresponds to full destruction of information in the init image.

Default: 0.8

string

Choose a scheduler.

Default: "DPMSolverMultistep"

string

List of URLs of LoRA model safetensors files, separated with |.

Default: ""

string

List of scales for the LoRA models, separated with |.

Default: "0.5"

integer

Random seed. Leave blank to randomize the seed

file

(T2I-Adapter) Adapter condition image for extra control over generation. If provided, the T2I-Adapter will be invoked.

string

(T2I-Adapter) Choose an adapter type for the additional condition.

Default: "sketch"

integer

Upscaling factor

boolean

Return a response object with detailed info (prompt, seed used, etc.).

Default: false
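
Putting the inputs above together, a minimal text-to-image call through the Replicate Python client might look like the sketch below. The input field names are assumptions inferred from the parameter descriptions above, not confirmed schema names; check the model's API tab for the exact keys.

```python
import replicate

# Hedged sketch: the field names below are guesses from the parameter
# descriptions above, not the model's confirmed input schema.
output = replicate.run(
    "joyanujoy/analog_diffusion",  # pin a specific version hash in practice
    input={
        "prompt": "analog style, a photo of <1> riding a horse on mars",
        "negative_prompt": "blurry, low quality",
        "width": 512,
        "height": 512,
        "num_inference_steps": 50,  # denoising steps
        "guidance_scale": 7.5,      # classifier-free guidance
        "scheduler": "DPMSolverMultistep",
    },
)
print(output)  # URL(s) of the generated image(s)
```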

Output


Run time and cost

This model costs approximately $0.012 to run on Replicate, or 83 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 9 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Model description

In your prompt, use the activation token: analog style

Based on wavymulder/Analog-Diffusion. The Replicate deployment is adapted from the original lora-inference repo by cloneofsimo. I'm using this to run experiments easily without having to spin up a GPU server for the AUTOMATIC1111 WebUI.

Main modifications are:

  • Add Real-ESRGAN for upscaling.
  • Add CLIP Interrogator for auto-generating the initial img2img prompt.
  • Add rembg for background removal from the init image (see the sketch after this list).
  • Automatically resize the adapter condition image to match the output image.
  • Modify the Replicate output schema to return additional info (seed, CLIP Interrogator-generated prompt, etc.). The API works, but this breaks the Replicate web UI explorer.
  • Modify the download-weights, test, and deploy scripts to download and cache large model files. Bake the cached files into the container image to avoid a long wait for model downloads every time the container warms up.
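
A minimal sketch of the init-image preprocessing these changes describe, assuming the rembg and Pillow packages; the function and its wiring are illustrative, not the model's actual code:

```python
from PIL import Image
from rembg import remove  # pip install rembg

def preprocess_init_image(path: str, width: int, height: int) -> Image.Image:
    """Remove the background and resize to the requested output size."""
    image = Image.open(path).convert("RGB")
    image = remove(image)        # rembg returns an RGBA image with background cut out
    image = image.convert("RGB") # drop the alpha channel for the diffusion pipeline
    return image.resize((width, height), Image.LANCZOS)
```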

…

Caveats and recommendations

  1. Set verbose_response: false in the Replicate web UI. Setting it to true breaks the web UI.
  2. To view the output images in the Replicate web UI, use version eff6035c.
  3. To receive a verbose response with additional fields (CLIP Interrogator-generated prompt, seed used, etc.), use version 1924c521.
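
For example, a verbose call through the Python client might look like the sketch below. The version prefix comes from the list above; substitute that version's full hash from the model's versions page. verbose_response is the flag named in caveat 1; the prompt is the default from the inputs above.

```python
import replicate

# Sketch: use the 1924c521 version for the verbose output schema. Replace
# the placeholder with that version's full hash from the versions page.
output = replicate.run(
    "joyanujoy/analog_diffusion:<full 1924c521 version hash>",
    input={
        "prompt": "analog style, a photo of <1> riding a horse on mars",
        "verbose_response": True,  # fine over the API, breaks the web UI
    },
)
print(output)  # includes seed, CLIP Interrogator-generated prompt, etc.
```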

…