cjwbw / t2i-adapter

T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

  • Public
  • 3.9K runs
  • A100 (80GB)
  • GitHub
  • Paper
  • License

Input

string
Input prompt.
Default: "A car with flying wings"

string
Specify things you do not want to see in the output.
Default: "ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, bad anatomy, watermark, signature, cut off, low contrast, underexposed, overexposed, bad art, beginner, amateur, distorted face"

*file
Input image.

string
Choose a model.
Default: "sd-v1-4"

string
Choose the type of your input. When "image" is chosen, the output will include the extracted sketch as well as the generated images.
Default: "image"

boolean
Use PLMS sampling if set to true.
Default: true

boolean
Use dpm_solver sampling if set to true.
Default: false

integer
Width of the output image. Lower the width if you run out of memory.
Default: 512

integer
Height of the output image. Lower the height if you run out of memory.
Default: 512

integer
(minimum: 1, maximum: 4)
Number of images to output.
Default: 1

integer
(minimum: 1, maximum: 500)
Number of denoising steps.
Default: 50

number
(minimum: 1, maximum: 20)
Scale for classifier-free guidance.
Default: 7.5
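The inputs above can be assembled programmatically. Below is a minimal sketch using the Replicate Python client; the field names (prompt, neg_prompt, image, model_name, type_in, plms, dpm_solver, W, H, num_samples, ddim_steps, scale) are assumptions inferred from the schema listed above, so confirm them and the current version hash on the model's API tab before running.

```python
# Assumed input field names for cjwbw/t2i-adapter; defaults mirror the schema above.
DEFAULTS = {
    "model_name": "sd-v1-4",   # which base model to use
    "type_in": "image",        # "image" also returns the extracted sketch
    "plms": True,              # PLMS sampling
    "dpm_solver": False,       # dpm_solver sampling
    "W": 512,                  # output width; lower if out of memory
    "H": 512,                  # output height; lower if out of memory
    "num_samples": 1,          # 1-4 images
    "ddim_steps": 50,          # 1-500 denoising steps
    "scale": 7.5,              # classifier-free guidance, 1-20
}

def build_input(prompt, image, neg_prompt="", **overrides):
    """Merge the documented defaults with caller-supplied values."""
    payload = dict(DEFAULTS, prompt=prompt, image=image, neg_prompt=neg_prompt)
    payload.update(overrides)
    return payload

# Actually running a prediction requires REPLICATE_API_TOKEN and the
# model's version hash from this page, e.g.:
#   import replicate
#   with open("sketch.png", "rb") as f:
#       output = replicate.run("cjwbw/t2i-adapter:<version>",
#                              input=build_input("A car with flying wings", f))
```

Any field can be overridden per call, e.g. `build_input(prompt, f, ddim_steps=30, scale=9.0)` to trade quality for speed.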

Output

output

Run time and cost

This model costs approximately $0.037 to run on Replicate, or 27 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 27 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Official implementation of T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models.

We propose T2I-Adapter, a simple and small network (~70M parameters, ~300MB of storage) that provides extra guidance to pre-trained text-to-image models while keeping the original large text-to-image model frozen.

T2I-Adapter aligns the internal knowledge of T2I models with external control signals. Separate adapters can be trained for different conditions, enabling rich control and editing effects.