jhorovitz / omini-schnell

Place items in a scene without needing to train on them

Run time and cost

This model costs approximately $0.11 to run on Replicate, or about 9 runs per $1, though the exact cost varies with your inputs. It is also open source, and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 80 seconds, though predict time varies significantly with the inputs.
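
For programmatic use, the model can be called through Replicate's Python client. The sketch below is illustrative only: the input names image and prompt are assumptions, so check the model's API schema on Replicate for the actual parameters. It also assumes REPLICATE_API_TOKEN is set in your environment.

import replicate

# Input names here are hypothetical -- consult the model's API page
# for the real schema before running this.
output = replicate.run(
    "jhorovitz/omini-schnell",
    input={
        "image": open("subject.png", "rb"),  # reference image of the subject
        "prompt": "the subject on a beach at sunset",
    },
)
print(output)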

Readme

OminiControl - Subject Control for Diffusion Models

A minimal implementation for incorporating subject-specific control into pretrained Diffusion Transformer (DiT) models, focusing on preserving subject identity while generating new views and contexts.

Key Features

  • Lightweight control mechanism requiring only 0.1% additional parameters
  • Preserves subject identity and characteristics while allowing flexible pose/scene changes
  • Built for DiT-based models (tested on FLUX.1)
  • Simple integration using multi-modal attention rather than complex control modules (a minimal sketch follows this list)
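
To illustrate the last point: instead of bolting on a ControlNet-style side network, OminiControl encodes the reference image into tokens, appends them to the denoiser's latent token sequence, and lets the DiT's existing attention layers attend across both sets. The following PyTorch sketch shows the idea only; names and shapes are illustrative, not the actual implementation.

import torch
import torch.nn.functional as F

def joint_attention(x_tokens, cond_tokens, to_q, to_k, to_v, num_heads):
    # x_tokens:    (B, N_x, D) noisy latent tokens
    # cond_tokens: (B, N_c, D) tokens from the encoded reference image
    # to_q/to_k/to_v: the block's existing projection layers; in
    # OminiControl the only new weights are small LoRA updates to
    # layers like these, which is where the ~0.1% parameter overhead
    # comes from.
    seq = torch.cat([x_tokens, cond_tokens], dim=1)  # (B, N_x + N_c, D)
    B, N, D = seq.shape
    q = to_q(seq).view(B, N, num_heads, -1).transpose(1, 2)
    k = to_k(seq).view(B, N, num_heads, -1).transpose(1, 2)
    v = to_v(seq).view(B, N, num_heads, -1).transpose(1, 2)
    out = F.scaled_dot_product_attention(q, k, v)  # every token attends to both sets
    out = out.transpose(1, 2).reshape(B, N, D)
    return out[:, : x_tokens.shape[1]]  # carry forward only the latent tokens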

Training Data

The model is trained on Subjects200K, a dataset of 200,000+ paired images showing the same subject in different contexts. Each pair maintains consistent subject identity while varying:

  • Pose/angle
  • Lighting conditions
  • Background/environment
  • Context/scene
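
To inspect the pairs yourself, the dataset can be loaded with the Hugging Face datasets library. The hub identifier below is an assumption (verify the exact repo name), and field names vary by release, so inspect a record rather than hard-coding keys.

from datasets import load_dataset

# Hub id is assumed -- confirm the actual dataset repo before use.
ds = load_dataset("Yuanshi/Subjects200K", split="train", streaming=True)

# Stream one record to see how an image pair is stored.
example = next(iter(ds))
print(example.keys())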

Limitations

  • Works best with clearly defined subjects/objects
  • Requires high-quality reference images
  • Performance may vary based on subject complexity

Citation

@article{tan2024ominicontrol,
  title={OminiControl: Minimal and Universal Control for Diffusion Transformer},
  author={Tan, Zhenxiong and Liu, Songhua and Yang, Xingyi and Xue, Qiaochu and Wang, Xinchao},
  journal={arXiv preprint arXiv:2411.15098},
  year={2024}
}

For more details on the full OminiControl framework and other control capabilities, please refer to the original paper.