jagilley/stable-diffusion-depth2img | Run with an API on Replicate

Input

prompt

string

wanderer above a cyberpunk city, 4k digital art rendering by caspar david friedrich

The prompt to guide the image generation.

Default: "Wanderer above the sea of fog, digital art"

negative_prompt

string

Keywords to exclude from the resulting image

input_image

file (required)

Input image to be used as the starting point

prompt_strength

number

Prompt strength when an input image is provided. A value of 1.0 corresponds to full destruction of the information in the input image.

Default: 0.8
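
As a rough illustration of how prompt_strength can interact with num_inference_steps: in common image-to-image diffusion pipelines, the number of denoising steps actually run is about num_inference_steps multiplied by prompt_strength. That this model behaves the same way is an assumption, not something its schema states, although the example prediction's logs do show 40 steps for the inputs 50 and 0.8. The helper name effective_steps and the 0.0 to 1.0 range check are hypothetical.

```python
def effective_steps(num_inference_steps: int, prompt_strength: float) -> int:
    """Estimate how many denoising steps run for an img2img-style call.

    Assumption: the pipeline skips the first (1 - prompt_strength) fraction
    of the schedule, as common img2img implementations do.
    """
    if not 0.0 <= prompt_strength <= 1.0:
        raise ValueError("prompt_strength must be between 0.0 and 1.0")
    return min(int(num_inference_steps * prompt_strength), num_inference_steps)


# Defaults from the schema: 50 steps at strength 0.8.
print(effective_steps(50, 0.8))
```

Under this assumption, lowering prompt_strength both preserves more of the input image and shortens the run.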

num_outputs

integer

(minimum: 1, maximum: 8)

Number of images to generate

Default: 1

num_inference_steps

integer

(minimum: 1, maximum: 500)

The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.

Default: 50

guidance_scale

number

(minimum: 1, maximum: 20)

Scale for classifier-free guidance. A higher guidance scale encourages the model to generate images that are closely linked to the text prompt, usually at the expense of image quality.

Default: 7.5
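
To make the role of guidance_scale more concrete, here is a minimal sketch of the usual classifier-free guidance combination, in which the final noise prediction is the unconditional prediction plus the scaled difference between the text-conditioned and unconditional predictions. This is the standard formulation, assumed rather than confirmed for this model; the function name apply_guidance and the toy two-element vectors are purely illustrative.

```python
def apply_guidance(uncond, cond, guidance_scale):
    """Combine unconditional and text-conditioned noise predictions.

    Standard classifier-free guidance: uncond + scale * (cond - uncond).
    """
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 0.0]  # toy prediction without the prompt
cond = [1.0, 2.0]    # toy prediction with the prompt

# A scale of 1.0 simply returns the conditioned prediction.
print(apply_guidance(uncond, cond, 1.0))
# Larger scales push the result further along the prompt direction.
print(apply_guidance(uncond, cond, 7.5))
```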

scheduler

string

Choose a scheduler

Default: "DPMSolverMultistep"

seed

integer

Random seed. Leave blank to randomize the seed

depth_image

file

Depth image (optional). Specifies the depth of each pixel in the input image.
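
The ranges listed above can be checked locally before a prediction is submitted. The following is a minimal sketch, not part of Replicate's client library: validate_inputs and RANGES are hypothetical names, and treating input_image as required is based on the asterisk next to its type in the schema.

```python
# Ranges copied from the input schema above.
RANGES = {
    "num_outputs": (1, 8),
    "num_inference_steps": (1, 500),
    "guidance_scale": (1, 20),
}


def validate_inputs(inputs: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if "input_image" not in inputs:
        problems.append("input_image is required")
    for name, (low, high) in RANGES.items():
        # Only check parameters the caller actually supplied.
        if name in inputs and not low <= inputs[name] <= high:
            problems.append(f"{name} must be between {low} and {high}")
    return problems


print(validate_inputs({"input_image": "wanderer.jpeg", "guidance_scale": 7.5}))
print(validate_inputs({"num_outputs": 9}))
```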

Run this model in Node.js with one line of code:

npx create-replicate --model=jagilley/stable-diffusion-depth2img

Or set up a project from scratch:

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";
import fs from "node:fs";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run jagilley/stable-diffusion-depth2img using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "jagilley/stable-diffusion-depth2img:68f699d395bc7c17008283a7cef6d92edc832d8dc59eb41a6cafec7fc70b85bc",
  {
    input: {
      seed: -1,
      prompt: "wanderer above a cyberpunk city, 4k digital art rendering by caspar david friedrich",
      scheduler: "DPMSolverMultistep",
      input_image: "https://replicate.delivery/pbxt/ICo443xcFQGIK4lawWN3ytMNDZsmZS6fZfjGYwFP6Dc5Vfnq/wanderer.jpeg",
      num_outputs: 1,
      guidance_scale: 7.5,
      prompt_strength: 0.8,
      num_inference_steps: 50
    }
  }
);

// To access the file URL:
console.log(output[0].url()); //=> "http://example.com"

// To write the file to disk:
await fs.promises.writeFile("my-image.png", output[0]);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run jagilley/stable-diffusion-depth2img using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "jagilley/stable-diffusion-depth2img:68f699d395bc7c17008283a7cef6d92edc832d8dc59eb41a6cafec7fc70b85bc",
    input={
        "seed": -1,
        "prompt": "wanderer above a cyberpunk city, 4k digital art rendering by caspar david friedrich",
        "scheduler": "DPMSolverMultistep",
        "input_image": "https://replicate.delivery/pbxt/ICo443xcFQGIK4lawWN3ytMNDZsmZS6fZfjGYwFP6Dc5Vfnq/wanderer.jpeg",
        "num_outputs": 1,
        "guidance_scale": 7.5,
        "prompt_strength": 0.8,
        "num_inference_steps": 50
    }
)

# To access the file URL:
print(output[0].url())
#=> "http://example.com"

# To write the file to disk:
with open("my-image.png", "wb") as file:
    file.write(output[0].read())
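
When num_outputs is greater than 1, the output list holds more than one item. A minimal sketch for saving all of them follows; save_outputs is a hypothetical helper, and it assumes each item exposes a read() method that returns bytes, as output[0] does in the snippet above.

```python
def save_outputs(outputs, prefix="out"):
    """Write each output to <prefix>-<index>.png and return the file names.

    Assumes every item has a read() method returning bytes.
    """
    paths = []
    for index, item in enumerate(outputs):
        path = f"{prefix}-{index}.png"
        with open(path, "wb") as file:
            file.write(item.read())
        paths.append(path)
    return paths
```

Calling save_outputs(output) after replicate.run(...) would then write out-0.png, out-1.png, and so on.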

To learn more, take a look at the guide on getting started with Python.

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run jagilley/stable-diffusion-depth2img using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "jagilley/stable-diffusion-depth2img:68f699d395bc7c17008283a7cef6d92edc832d8dc59eb41a6cafec7fc70b85bc",
    "input": {
      "seed": -1,
      "prompt": "wanderer above a cyberpunk city, 4k digital art rendering by caspar david friedrich",
      "scheduler": "DPMSolverMultistep",
      "input_image": "https://replicate.delivery/pbxt/ICo443xcFQGIK4lawWN3ytMNDZsmZS6fZfjGYwFP6Dc5Vfnq/wanderer.jpeg",
      "num_outputs": 1,
      "guidance_scale": 7.5,
      "prompt_strength": 0.8,
      "num_inference_steps": 50
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.

Output

{
  "completed_at": "2023-01-26T20:01:57.999236Z",
  "created_at": "2023-01-26T20:01:36.420191Z",
  "data_removed": false,
  "error": null,
  "id": "xadvta5lnreybcutg6njr6lulq",
  "input": {
    "seed": -1,
    "prompt": "wanderer above a cyberpunk city, 4k digital art rendering by caspar david friedrich",
    "scheduler": "DPMSolverMultistep",
    "input_image": "https://replicate.delivery/pbxt/ICo443xcFQGIK4lawWN3ytMNDZsmZS6fZfjGYwFP6Dc5Vfnq/wanderer.jpeg",
    "num_outputs": 1,
    "guidance_scale": 7.5,
    "prompt_strength": 0.8,
    "num_inference_steps": 50
  },
  "logs": "Using seed: -1\n  0%|          | 0/40 [00:00<?, ?it/s]\n  2%|▎         | 1/40 [00:04<03:14,  4.99s/it]\n  8%|▊         | 3/40 [00:05<00:49,  1.35s/it]\n 12%|█▎        | 5/40 [00:05<00:24,  1.46it/s]\n 18%|█▊        | 7/40 [00:05<00:13,  2.37it/s]\n 22%|██▎       | 9/40 [00:05<00:08,  3.48it/s]\n 28%|██▊       | 11/40 [00:05<00:06,  4.78it/s]\n 32%|███▎      | 13/40 [00:05<00:04,  6.23it/s]\n 38%|███▊      | 15/40 [00:05<00:03,  7.75it/s]\n 42%|████▎     | 17/40 [00:06<00:02,  9.23it/s]\n 48%|████▊     | 19/40 [00:06<00:01, 10.56it/s]\n 52%|█████▎    | 21/40 [00:06<00:01, 11.78it/s]\n 57%|█████▊    | 23/40 [00:06<00:01, 12.76it/s]\n 62%|██████▎   | 25/40 [00:06<00:01, 13.49it/s]\n 68%|██████▊   | 27/40 [00:06<00:00, 14.10it/s]\n 72%|███████▎  | 29/40 [00:06<00:00, 14.60it/s]\n 78%|███████▊  | 31/40 [00:06<00:00, 14.96it/s]\n 82%|████████▎ | 33/40 [00:07<00:00, 15.19it/s]\n 88%|████████▊ | 35/40 [00:07<00:00, 15.31it/s]\n 92%|█████████▎| 37/40 [00:07<00:00, 15.46it/s]\n 98%|█████████▊| 39/40 [00:07<00:00, 15.50it/s]\n100%|██████████| 40/40 [00:07<00:00,  5.35it/s]",
  "metrics": {
    "predict_time": 11.80156,
    "total_time": 21.579045
  },
  "output": [
    "https://replicate.delivery/pbxt/7MKZB7KJhVbxL1DKSiDYQ7eInHCKYgBi9zPeKkJbjri14IYQA/out-0.png"
  ],
  "started_at": "2023-01-26T20:01:46.197676Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/xadvta5lnreybcutg6njr6lulq",
    "cancel": "https://api.replicate.com/v1/predictions/xadvta5lnreybcutg6njr6lulq/cancel"
  },
  "version": "d488378e301ffa295b360e6a34e0a504b772449c75f570a9f4e5089d05f660df"
}
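
The timestamps in this response can be used to separate queueing time from run time. The sketch below uses only the Python standard library; timing_summary and parse_timestamp are illustrative names, and the dictionary holds just the three timestamp fields copied from the response above. For this prediction, the derived running and total durations agree with predict_time and total_time under metrics.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def timing_summary(prediction: dict) -> dict:
    """Split a prediction's lifetime into queued and running seconds."""
    created = parse_timestamp(prediction["created_at"])
    started = parse_timestamp(prediction["started_at"])
    completed = parse_timestamp(prediction["completed_at"])
    return {
        "queued_seconds": (started - created).total_seconds(),
        "running_seconds": (completed - started).total_seconds(),
        "total_seconds": (completed - created).total_seconds(),
    }


prediction = {
    "created_at": "2023-01-26T20:01:36.420191Z",
    "started_at": "2023-01-26T20:01:46.197676Z",
    "completed_at": "2023-01-26T20:01:57.999236Z",
}
print(timing_summary(prediction))
```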

Generated in 11.8 seconds

This output was created using a different version of the model, jagilley/stable-diffusion-depth2img:d488378e.

Run time and cost

This model costs approximately $0.092 to run on Replicate, or 10 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 66 seconds. The predict time for this model varies significantly based on the inputs.
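
The approximate per-run cost quoted above translates into a simple budget estimate. This is a back-of-the-envelope sketch that assumes a flat cost per run, which the page itself notes is not exact because cost varies with inputs; runs_for_budget is a hypothetical helper name.

```python
import math

COST_PER_RUN_USD = 0.092  # approximate figure quoted above


def runs_for_budget(budget_usd: float, cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in a budget at a flat per-run cost."""
    return math.floor(budget_usd / cost_per_run)


print(runs_for_budget(1.0))   # matches the "10 runs per $1" figure
print(runs_for_budget(25.0))
```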

Readme

Create variations of an image while preserving shape and depth.

This stable-diffusion-2-depth model is resumed from stable-diffusion-2-base (512-base-ema.ckpt) and finetuned for 200k steps. An extra input channel was added to process the (relative) depth prediction produced by MiDaS (dpt_hybrid), which is used as additional conditioning.
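
To illustrate what conditioning on a relative depth prediction can look like in practice, here is a minimal sketch that rescales a depth map to the range [-1, 1] before it would be stacked alongside the image latents as one more channel. The exact preprocessing used by this model is not described here, so the normalization, the function name normalize_depth, and the count of four latent channels are assumptions drawn from common Stable Diffusion depth-to-image implementations.

```python
def normalize_depth(depth_map):
    """Rescale a 2D depth map (a list of rows) to the range [-1, 1].

    Only relative depth matters, so the minimum maps to -1 and the
    maximum maps to 1.
    """
    flat = [value for row in depth_map for value in row]
    low, high = min(flat), max(flat)
    if high == low:
        raise ValueError("depth map is constant; cannot normalize")
    return [[2.0 * (v - low) / (high - low) - 1.0 for v in row] for row in depth_map]


LATENT_CHANNELS = 4  # assumption: usual Stable Diffusion latent channel count
DEPTH_CHANNELS = 1   # the extra input channel described above

# Total channels the denoiser would receive in this sketch.
print(LATENT_CHANNELS + DEPTH_CHANNELS)
print(normalize_depth([[0.0, 5.0], [10.0, 2.5]]))
```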

Developed by: Robin Rombach, Patrick Esser
Model type: Diffusion-based text-to-image generation model
Language(s): English
License: CreativeML Open RAIL++-M License
Model Description: This is a model that can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses a fixed, pretrained text encoder (OpenCLIP-ViT/H).
Resources for more information: GitHub Repository.