cjwbw / prompt-free-diffusion

Prompt-free Diffusion

Input

image

*file

Input image

control

*file

Control input

context_encoder

string

Choose a context encoder

Default: "SeeCoder"

tag_diffuser

string

Choose a diffusion model

Default: "Deliberate-v2.0"

preprocess_type

string

Choose a Preprocess Type

Default: "canny"

control_net

string

Choose ControlNet

Default: "canny"

out_width

integer

(minimum: 512, maximum: 1536)

Width of the output image. Reduce this value if you hit the memory limit.

Default: 512

out_height

integer

(minimum: 512, maximum: 1536)

Height of the output image. Reduce this value if you hit the memory limit.

Default: 512

num_inference_steps

integer

(minimum: 1, maximum: 500)

Number of denoising steps

Default: 50

guidance_scale

number

(minimum: 0, maximum: 10)

Scale for classifier-free guidance

Default: 2

seed

integer

Random seed. Leave blank to randomize the seed
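
The ranges and defaults listed above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions, not part of Replicate's client library: `build_input`, `LIMITS`, and `DEFAULTS` are hypothetical names, and the bounds are copied from the input list on this page.

```python
# Hypothetical helper: validates values against the input list above.
LIMITS = {
    "out_width": (512, 1536),
    "out_height": (512, 1536),
    "num_inference_steps": (1, 500),
    "guidance_scale": (0, 10),
}

DEFAULTS = {
    "context_encoder": "SeeCoder",
    "tag_diffuser": "Deliberate-v2.0",
    "preprocess_type": "canny",
    "control_net": "canny",
    "out_width": 512,
    "out_height": 512,
    "num_inference_steps": 50,
    "guidance_scale": 2,
}


def build_input(image, control, seed=None, **overrides):
    """Return an input dict for the model, raising ValueError on bad values."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input(s): {sorted(unknown)}")
    params = {**DEFAULTS, **overrides, "image": image, "control": control}
    # Check every bounded parameter against its documented range.
    for name, (low, high) in LIMITS.items():
        if not low <= params[name] <= high:
            raise ValueError(f"{name}={params[name]} is outside [{low}, {high}]")
    # Only send a seed when one was given, so the model can randomize otherwise.
    if seed is not None:
        params["seed"] = seed
    return params
```

The returned dict can be passed as the `input` argument of a prediction request. Omitting `seed` leaves the choice of seed to the model, as described above.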

Run this model in Node.js with one line of code:

npx create-replicate --model=cjwbw/prompt-free-diffusion

or set up a project from scratch

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";
import fs from "node:fs";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run cjwbw/prompt-free-diffusion using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "cjwbw/prompt-free-diffusion:8ffe43fe7298a95554ea24047fb144b921ca1a85f661527b34b2c02b2573579f",
  {
    input: {
      image: "https://replicate.delivery/pbxt/IvpLPCeH4QTQomgDkJy4NHla7zk2lSz4Tdv6f9x6vywCsMTs/astronautridinghouse-input.jpg",
      control: "https://replicate.delivery/pbxt/IvpLP71kyA7Zz7BiLnjoskVpbCHnaflLHVMa6DNQWwvacF9u/astronautridinghouse-canny.png",
      out_width: 768,
      out_height: 512,
      control_net: "canny",
      tag_diffuser: "Deliberate-v2.0",
      guidance_scale: 2,
      context_encoder: "SeeCoder",
      preprocess_type: "canny",
      num_inference_steps: 50
    }
  }
);

// To access the file URL:
console.log(output.url()); //=> "http://example.com"

// To write the file to disk:
await fs.promises.writeFile("my-image.png", output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run cjwbw/prompt-free-diffusion using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "cjwbw/prompt-free-diffusion:8ffe43fe7298a95554ea24047fb144b921ca1a85f661527b34b2c02b2573579f",
    input={
        "image": "https://replicate.delivery/pbxt/IvpLPCeH4QTQomgDkJy4NHla7zk2lSz4Tdv6f9x6vywCsMTs/astronautridinghouse-input.jpg",
        "control": "https://replicate.delivery/pbxt/IvpLP71kyA7Zz7BiLnjoskVpbCHnaflLHVMa6DNQWwvacF9u/astronautridinghouse-canny.png",
        "out_width": 768,
        "out_height": 512,
        "control_net": "canny",
        "tag_diffuser": "Deliberate-v2.0",
        "guidance_scale": 2,
        "context_encoder": "SeeCoder",
        "preprocess_type": "canny",
        "num_inference_steps": 50
    }
)

# To access the file URL:
print(output.url)
#=> "http://example.com"

# To write the file to disk:
with open("my-image.png", "wb") as file:
    file.write(output.read())

To learn more, take a look at the guide on getting started with Python.

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run cjwbw/prompt-free-diffusion using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "cjwbw/prompt-free-diffusion:8ffe43fe7298a95554ea24047fb144b921ca1a85f661527b34b2c02b2573579f",
    "input": {
      "image": "https://replicate.delivery/pbxt/IvpLPCeH4QTQomgDkJy4NHla7zk2lSz4Tdv6f9x6vywCsMTs/astronautridinghouse-input.jpg",
      "control": "https://replicate.delivery/pbxt/IvpLP71kyA7Zz7BiLnjoskVpbCHnaflLHVMa6DNQWwvacF9u/astronautridinghouse-canny.png",
      "out_width": 768,
      "out_height": 512,
      "control_net": "canny",
      "tag_diffuser": "Deliberate-v2.0",
      "guidance_scale": 2,
      "context_encoder": "SeeCoder",
      "preprocess_type": "canny",
      "num_inference_steps": 50
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
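
For readers who prefer Python's standard library over curl, the same request can be assembled without any third-party package. This is a sketch only: `build_prediction_request` is a hypothetical helper, it mirrors the URL and headers of the curl example above, and it builds the request without sending it.

```python
import json
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(token, version, model_input):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"version": version, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",  # same header as in the curl example
        },
    )


request = build_prediction_request(
    token="<paste-your-token-here>",
    version="cjwbw/prompt-free-diffusion:8ffe43fe7298a95554ea24047fb144b921ca1a85f661527b34b2c02b2573579f",
    model_input={"out_width": 768, "out_height": 512},
)
```

Passing `request` to `urllib.request.urlopen` would send it. In real use the token would come from the `REPLICATE_API_TOKEN` environment variable, and `model_input` would also carry the `image` and `control` URLs.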

You can run this model locally using Cog. First, install Cog:

brew install cog

If you don’t have Homebrew, there are other installation options available.

Run this to download the model and run it in your local environment:

cog predict r8.im/chenxwh/prompt-free-diffusion@sha256:8ffe43fe7298a95554ea24047fb144b921ca1a85f661527b34b2c02b2573579f \
  -i 'image="https://replicate.delivery/pbxt/IvpLPCeH4QTQomgDkJy4NHla7zk2lSz4Tdv6f9x6vywCsMTs/astronautridinghouse-input.jpg"' \
  -i 'control="https://replicate.delivery/pbxt/IvpLP71kyA7Zz7BiLnjoskVpbCHnaflLHVMa6DNQWwvacF9u/astronautridinghouse-canny.png"' \
  -i 'out_width=768' \
  -i 'out_height=512' \
  -i 'control_net="canny"' \
  -i 'tag_diffuser="Deliberate-v2.0"' \
  -i 'guidance_scale=2' \
  -i 'context_encoder="SeeCoder"' \
  -i 'preprocess_type="canny"' \
  -i 'num_inference_steps=50'

To learn more, take a look at the Cog documentation.

To run the model with Docker directly, start the container and then send it a prediction request:

docker run -d -p 5000:5000 --gpus=all r8.im/chenxwh/prompt-free-diffusion@sha256:8ffe43fe7298a95554ea24047fb144b921ca1a85f661527b34b2c02b2573579f
curl -s -X POST \
  -H "Content-Type: application/json" \
  -d $'{
    "input": {
      "image": "https://replicate.delivery/pbxt/IvpLPCeH4QTQomgDkJy4NHla7zk2lSz4Tdv6f9x6vywCsMTs/astronautridinghouse-input.jpg",
      "control": "https://replicate.delivery/pbxt/IvpLP71kyA7Zz7BiLnjoskVpbCHnaflLHVMa6DNQWwvacF9u/astronautridinghouse-canny.png",
      "out_width": 768,
      "out_height": 512,
      "control_net": "canny",
      "tag_diffuser": "Deliberate-v2.0",
      "guidance_scale": 2,
      "context_encoder": "SeeCoder",
      "preprocess_type": "canny",
      "num_inference_steps": 50
    }
  }' \
  http://localhost:5000/predictions

To learn more, take a look at the Cog documentation.

Output

{
  "completed_at": "2023-06-02T18:26:39.600450Z",
  "created_at": "2023-06-02T18:16:12.528680Z",
  "data_removed": false,
  "error": null,
  "id": "65vmcz4r3jb35lsulawxcx56r4",
  "input": {
    "image": "https://replicate.delivery/pbxt/IvpLPCeH4QTQomgDkJy4NHla7zk2lSz4Tdv6f9x6vywCsMTs/astronautridinghouse-input.jpg",
    "control": "https://replicate.delivery/pbxt/IvpLP71kyA7Zz7BiLnjoskVpbCHnaflLHVMa6DNQWwvacF9u/astronautridinghouse-canny.png",
    "out_width": 768,
    "out_height": 512,
    "control_net": "canny",
    "tag_diffuser": "Deliberate-v2.0",
    "guidance_scale": 2,
    "context_encoder": "SeeCoder",
    "preprocess_type": "canny",
    "num_inference_steps": 50
  },
  "logs": "Using seed: 12119\n#######################\n# Running in eps mode #\n#######################\nmaking attention of type 'vanilla' with 512 in_channels\nWorking with z of shape (1, 4, 32, 32) = 4096 dimensions.\nmaking attention of type 'vanilla' with 512 in_channels\nLoad model from [pretrained/pfd/vae/sd-v2-0-base-autokl.pth] strict [True].\nLoad autoencoderkl with total 83653863 parameters,79145.299 parameter sum.\nLoad swin with total 195201204 parameters,44522.425 parameter sum.\nLoad seecoder_decoder with total 27783168 parameters,13929.608 parameter sum.\nLoad seecoder_query_transformer with total 71130624 parameters,21163.731 parameter sum.\nLoad seecoder with total 294114996 parameters,79615.763 parameter sum.\nLoad openai_unet_2d_next with total 859520964 parameters,100102.951 parameter sum.\nLoad controlnet with total 361279120 parameters,41433.391 parameter sum.\nLoad pfd_with_control with total 1598568943 parameters,300297.405 parameter sum.\nLoad context encoder from [pretrained/pfd/seecoder/seecoder-v1-0.safetensors] strict [True].\nLoad diffuser from [pretrained/pfd/diffuser/Deliberate-v2-0.safetensors] strict [True].\nLoad controlnet from [pretrained/controlnet/control_sd15_canny_slimmed.safetensors] strict [True].\n###################\n# Running in FP16 #\n###################\nData shape for DDIM sampling is [1, 4, 64, 96], eta 0.0\nDDIM Sampler:   0%|          | 0/50 [00:00<?, ?it/s]\nDDIM Sampler:   2%|▏         | 1/50 [00:00<00:33,  1.44it/s]\nDDIM Sampler:   4%|▍         | 2/50 [00:01<00:27,  1.73it/s]\nDDIM Sampler:   6%|▌         | 3/50 [00:01<00:25,  1.85it/s]\nDDIM Sampler:   8%|▊         | 4/50 [00:02<00:24,  1.92it/s]\nDDIM Sampler:  10%|█         | 5/50 [00:02<00:23,  1.95it/s]\nDDIM Sampler:  12%|█▏        | 6/50 [00:03<00:22,  1.97it/s]\nDDIM Sampler:  14%|█▍        | 7/50 [00:03<00:21,  1.98it/s]\nDDIM Sampler:  16%|█▌        | 8/50 [00:04<00:21,  1.99it/s]\nDDIM Sampler:  18%|█▊        | 9/50 [00:04<00:20,  2.00it/s]\nDDIM Sampler:  20%|██        | 10/50 [00:05<00:19,  2.00it/s]\nDDIM Sampler:  22%|██▏       | 11/50 [00:05<00:19,  2.00it/s]\nDDIM Sampler:  24%|██▍       | 12/50 [00:06<00:18,  2.01it/s]\nDDIM Sampler:  26%|██▌       | 13/50 [00:06<00:18,  2.01it/s]\nDDIM Sampler:  28%|██▊       | 14/50 [00:07<00:17,  2.01it/s]\nDDIM Sampler:  30%|███       | 15/50 [00:07<00:17,  2.01it/s]\nDDIM Sampler:  32%|███▏      | 16/50 [00:08<00:16,  2.01it/s]\nDDIM Sampler:  34%|███▍      | 17/50 [00:08<00:16,  2.00it/s]\nDDIM Sampler:  36%|███▌      | 18/50 [00:09<00:15,  2.00it/s]\nDDIM Sampler:  38%|███▊      | 19/50 [00:09<00:15,  2.01it/s]\nDDIM Sampler:  40%|████      | 20/50 [00:10<00:14,  2.01it/s]\nDDIM Sampler:  42%|████▏     | 21/50 [00:10<00:14,  2.01it/s]\nDDIM Sampler:  44%|████▍     | 22/50 [00:11<00:13,  2.01it/s]\nDDIM Sampler:  46%|████▌     | 23/50 [00:11<00:13,  2.01it/s]\nDDIM Sampler:  48%|████▊     | 24/50 [00:12<00:12,  2.01it/s]\nDDIM Sampler:  50%|█████     | 25/50 [00:12<00:12,  2.00it/s]\nDDIM Sampler:  52%|█████▏    | 26/50 [00:13<00:11,  2.00it/s]\nDDIM Sampler:  54%|█████▍    | 27/50 [00:13<00:11,  2.01it/s]\nDDIM Sampler:  56%|█████▌    | 28/50 [00:14<00:10,  2.00it/s]\nDDIM Sampler:  58%|█████▊    | 29/50 [00:14<00:10,  2.00it/s]\nDDIM Sampler:  60%|██████    | 30/50 [00:15<00:09,  2.00it/s]\nDDIM Sampler:  62%|██████▏   | 31/50 [00:15<00:09,  2.00it/s]\nDDIM Sampler:  64%|██████▍   | 32/50 [00:16<00:08,  2.00it/s]\nDDIM Sampler:  66%|██████▌   | 33/50 [00:16<00:08,  2.00it/s]\nDDIM Sampler:  68%|██████▊   | 34/50 [00:17<00:07,  2.00it/s]\nDDIM Sampler:  70%|███████   | 35/50 [00:17<00:07,  2.00it/s]\nDDIM Sampler:  72%|███████▏  | 36/50 [00:18<00:06,  2.01it/s]\nDDIM Sampler:  74%|███████▍  | 37/50 [00:18<00:06,  2.01it/s]\nDDIM Sampler:  76%|███████▌  | 38/50 [00:19<00:05,  2.01it/s]\nDDIM Sampler:  78%|███████▊  | 39/50 [00:19<00:05,  2.00it/s]\nDDIM Sampler:  80%|████████  | 40/50 [00:20<00:04,  2.00it/s]\nDDIM Sampler:  82%|████████▏ | 41/50 [00:20<00:04,  2.00it/s]\nDDIM Sampler:  84%|████████▍ | 42/50 [00:21<00:03,  2.00it/s]\nDDIM Sampler:  86%|████████▌ | 43/50 [00:21<00:03,  2.00it/s]\nDDIM Sampler:  88%|████████▊ | 44/50 [00:22<00:02,  2.00it/s]\nDDIM Sampler:  90%|█████████ | 45/50 [00:22<00:02,  2.00it/s]\nDDIM Sampler:  92%|█████████▏| 46/50 [00:23<00:01,  2.00it/s]\nDDIM Sampler:  94%|█████████▍| 47/50 [00:23<00:01,  2.00it/s]\nDDIM Sampler:  96%|█████████▌| 48/50 [00:24<00:01,  2.00it/s]\nDDIM Sampler:  98%|█████████▊| 49/50 [00:24<00:00,  2.00it/s]\nDDIM Sampler: 100%|██████████| 50/50 [00:25<00:00,  2.00it/s]\nDDIM Sampler: 100%|██████████| 50/50 [00:25<00:00,  1.99it/s]",
  "metrics": {
    "predict_time": 85.661872,
    "total_time": 627.07177
  },
  "output": "https://replicate.delivery/pbxt/J3nCFqcUvg4dD92f7KoduC9hZ0qhzntY9QBTrIygypcvMfBRA/out.png",
  "started_at": "2023-06-02T18:25:13.938578Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/65vmcz4r3jb35lsulawxcx56r4",
    "cancel": "https://api.replicate.com/v1/predictions/65vmcz4r3jb35lsulawxcx56r4/cancel"
  },
  "version": "8ffe43fe7298a95554ea24047fb144b921ca1a85f661527b34b2c02b2573579f"
}
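
The three timestamps in this response account for the gap between `predict_time` (about 85.7 s) and `total_time` (about 627.1 s). The sketch below is illustrative: `parse_ts` and `timing_breakdown` are hypothetical helpers, and the labels "waiting" and "running" are this example's own names. The response does not say what the waiting time was spent on.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp with a trailing 'Z' (UTC)."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def timing_breakdown(prediction):
    """Split a prediction's wall-clock time into waiting and running seconds."""
    created = parse_ts(prediction["created_at"])
    started = parse_ts(prediction["started_at"])
    completed = parse_ts(prediction["completed_at"])
    return {
        "waiting": (started - created).total_seconds(),
        "running": (completed - started).total_seconds(),
    }


# Timestamps copied from the response above.
example = {
    "created_at": "2023-06-02T18:16:12.528680Z",
    "started_at": "2023-06-02T18:25:13.938578Z",
    "completed_at": "2023-06-02T18:26:39.600450Z",
}
breakdown = timing_breakdown(example)
print(breakdown)
```

For this prediction, roughly 541 seconds passed before the run started and roughly 86 seconds were spent running. The two together match the reported `total_time`.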

Generated in 85.7 seconds

Using seed: 12119
#######################
# Running in eps mode #
#######################
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Load model from [pretrained/pfd/vae/sd-v2-0-base-autokl.pth] strict [True].
Load autoencoderkl with total 83653863 parameters,79145.299 parameter sum.
Load swin with total 195201204 parameters,44522.425 parameter sum.
Load seecoder_decoder with total 27783168 parameters,13929.608 parameter sum.
Load seecoder_query_transformer with total 71130624 parameters,21163.731 parameter sum.
Load seecoder with total 294114996 parameters,79615.763 parameter sum.
Load openai_unet_2d_next with total 859520964 parameters,100102.951 parameter sum.
Load controlnet with total 361279120 parameters,41433.391 parameter sum.
Load pfd_with_control with total 1598568943 parameters,300297.405 parameter sum.
Load context encoder from [pretrained/pfd/seecoder/seecoder-v1-0.safetensors] strict [True].
Load diffuser from [pretrained/pfd/diffuser/Deliberate-v2-0.safetensors] strict [True].
Load controlnet from [pretrained/controlnet/control_sd15_canny_slimmed.safetensors] strict [True].
###################
# Running in FP16 #
###################
Data shape for DDIM sampling is [1, 4, 64, 96], eta 0.0
DDIM Sampler:   0%|          | 0/50 [00:00<?, ?it/s]
DDIM Sampler:   2%|▏         | 1/50 [00:00<00:33,  1.44it/s]
DDIM Sampler:   4%|▍         | 2/50 [00:01<00:27,  1.73it/s]
DDIM Sampler:   6%|▌         | 3/50 [00:01<00:25,  1.85it/s]
DDIM Sampler:   8%|▊         | 4/50 [00:02<00:24,  1.92it/s]
DDIM Sampler:  10%|█         | 5/50 [00:02<00:23,  1.95it/s]
DDIM Sampler:  12%|█▏        | 6/50 [00:03<00:22,  1.97it/s]
DDIM Sampler:  14%|█▍        | 7/50 [00:03<00:21,  1.98it/s]
DDIM Sampler:  16%|█▌        | 8/50 [00:04<00:21,  1.99it/s]
DDIM Sampler:  18%|█▊        | 9/50 [00:04<00:20,  2.00it/s]
DDIM Sampler:  20%|██        | 10/50 [00:05<00:19,  2.00it/s]
DDIM Sampler:  22%|██▏       | 11/50 [00:05<00:19,  2.00it/s]
DDIM Sampler:  24%|██▍       | 12/50 [00:06<00:18,  2.01it/s]
DDIM Sampler:  26%|██▌       | 13/50 [00:06<00:18,  2.01it/s]
DDIM Sampler:  28%|██▊       | 14/50 [00:07<00:17,  2.01it/s]
DDIM Sampler:  30%|███       | 15/50 [00:07<00:17,  2.01it/s]
DDIM Sampler:  32%|███▏      | 16/50 [00:08<00:16,  2.01it/s]
DDIM Sampler:  34%|███▍      | 17/50 [00:08<00:16,  2.00it/s]
DDIM Sampler:  36%|███▌      | 18/50 [00:09<00:15,  2.00it/s]
DDIM Sampler:  38%|███▊      | 19/50 [00:09<00:15,  2.01it/s]
DDIM Sampler:  40%|████      | 20/50 [00:10<00:14,  2.01it/s]
DDIM Sampler:  42%|████▏     | 21/50 [00:10<00:14,  2.01it/s]
DDIM Sampler:  44%|████▍     | 22/50 [00:11<00:13,  2.01it/s]
DDIM Sampler:  46%|████▌     | 23/50 [00:11<00:13,  2.01it/s]
DDIM Sampler:  48%|████▊     | 24/50 [00:12<00:12,  2.01it/s]
DDIM Sampler:  50%|█████     | 25/50 [00:12<00:12,  2.00it/s]
DDIM Sampler:  52%|█████▏    | 26/50 [00:13<00:11,  2.00it/s]
DDIM Sampler:  54%|█████▍    | 27/50 [00:13<00:11,  2.01it/s]
DDIM Sampler:  56%|█████▌    | 28/50 [00:14<00:10,  2.00it/s]
DDIM Sampler:  58%|█████▊    | 29/50 [00:14<00:10,  2.00it/s]
DDIM Sampler:  60%|██████    | 30/50 [00:15<00:09,  2.00it/s]
DDIM Sampler:  62%|██████▏   | 31/50 [00:15<00:09,  2.00it/s]
DDIM Sampler:  64%|██████▍   | 32/50 [00:16<00:08,  2.00it/s]
DDIM Sampler:  66%|██████▌   | 33/50 [00:16<00:08,  2.00it/s]
DDIM Sampler:  68%|██████▊   | 34/50 [00:17<00:07,  2.00it/s]
DDIM Sampler:  70%|███████   | 35/50 [00:17<00:07,  2.00it/s]
DDIM Sampler:  72%|███████▏  | 36/50 [00:18<00:06,  2.01it/s]
DDIM Sampler:  74%|███████▍  | 37/50 [00:18<00:06,  2.01it/s]
DDIM Sampler:  76%|███████▌  | 38/50 [00:19<00:05,  2.01it/s]
DDIM Sampler:  78%|███████▊  | 39/50 [00:19<00:05,  2.00it/s]
DDIM Sampler:  80%|████████  | 40/50 [00:20<00:04,  2.00it/s]
DDIM Sampler:  82%|████████▏ | 41/50 [00:20<00:04,  2.00it/s]
DDIM Sampler:  84%|████████▍ | 42/50 [00:21<00:03,  2.00it/s]
DDIM Sampler:  86%|████████▌ | 43/50 [00:21<00:03,  2.00it/s]
DDIM Sampler:  88%|████████▊ | 44/50 [00:22<00:02,  2.00it/s]
DDIM Sampler:  90%|█████████ | 45/50 [00:22<00:02,  2.00it/s]
DDIM Sampler:  92%|█████████▏| 46/50 [00:23<00:01,  2.00it/s]
DDIM Sampler:  94%|█████████▍| 47/50 [00:23<00:01,  2.00it/s]
DDIM Sampler:  96%|█████████▌| 48/50 [00:24<00:01,  2.00it/s]
DDIM Sampler:  98%|█████████▊| 49/50 [00:24<00:00,  2.00it/s]
DDIM Sampler: 100%|██████████| 50/50 [00:25<00:00,  2.00it/s]
DDIM Sampler: 100%|██████████| 50/50 [00:25<00:00,  1.99it/s]
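
Two details in this log are useful when reproducing a run: the seed that was picked, and the latent shape used for DDIM sampling. The sketch below extracts both with regular expressions. `parse_log` and `expected_latent_shape` are hypothetical helpers. The downsampling factor of 8 is an assumption, inferred from the logged shape `[1, 4, 64, 96]` for a 768x512 output. It is not stated anywhere on this page.

```python
import re


def parse_log(log_text):
    """Extract the seed and the DDIM latent shape from the model's log output."""
    seed = re.search(r"Using seed: (\d+)", log_text)
    shape = re.search(r"Data shape for DDIM sampling is \[([\d, ]+)\]", log_text)
    return {
        "seed": int(seed.group(1)) if seed else None,
        "latent_shape": [int(n) for n in shape.group(1).split(",")] if shape else None,
    }


def expected_latent_shape(out_width, out_height, factor=8, channels=4):
    """Assumed relation: latent height and width are the output size // factor."""
    return [1, channels, out_height // factor, out_width // factor]


# Two lines copied from the log above.
log = "Using seed: 12119\nData shape for DDIM sampling is [1, 4, 64, 96], eta 0.0\n"
info = parse_log(log)
print(info)
print(expected_latent_shape(out_width=768, out_height=512))
```

Passing the extracted seed back through the `seed` input should make a run repeatable, provided the other inputs are unchanged.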

Run time and cost

This model costs approximately $0.069 to run on Replicate, or 14 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
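
The "14 runs per $1" figure follows from the per-run cost by simple division. A small sketch of the arithmetic, using only the approximate price quoted above (the helper names are hypothetical, and the real cost varies with the inputs):

```python
import math

COST_PER_RUN_USD = 0.069  # approximate figure quoted above


def runs_per_dollar(cost_per_run=COST_PER_RUN_USD):
    """Whole runs that fit in one dollar."""
    return math.floor(1 / cost_per_run)


def estimated_cost(num_runs, cost_per_run=COST_PER_RUN_USD):
    """Rough total in USD for a batch of runs."""
    return round(num_runs * cost_per_run, 2)


print(runs_per_dollar())    # 14
print(estimated_cost(100))  # 6.9
```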

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 6 minutes. The predict time for this model varies significantly based on the inputs.

Readme

Prompt-Free Diffusion

Introduction

Prompt-Free Diffusion is a diffusion model that relies only on visual inputs to generate new images. The visual input is handled by the Semantic Context Encoder (SeeCoder), which replaces the commonly used CLIP-based text encoder. SeeCoder can be reused with most public T2I models as well as with adaptive layers such as ControlNet, LoRA, and T2I-Adapter. Just drop in and play!

Citation

@article{xu2023prompt,
  title={Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models},
  author={Xu, Xingqian and Guo, Jiayi and Wang, Zhangyang and Huang, Gao and Essa, Irfan and Shi, Humphrey},
  journal={arXiv preprint arXiv:2305.16223},
  year={2023}
}