chenxwh / ominicontrol-spatial

Minimal and Universal Control for Diffusion Transformer - demo for Spatially aligned control

  • Public
  • 101 runs
  • L40S
  • GitHub
  • Weights
  • Paper
  • License

Input

  • string — Choose a task. Default: "fill"
  • string — Input prompt. Default: "The Mona Lisa is wearing a white VR headset with 'Omini' written on it."
  • file (required) — Input image
  • integer (minimum: 1, maximum: 500) — Number of denoising steps. Default: 50
  • number (minimum: 1, maximum: 20) — Scale for classifier-free guidance. Default: 7.5
  • integer — Random seed. Leave blank to randomize the seed.
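The schema above can be exercised through Replicate's Python client. A minimal sketch that validates values against the listed ranges before calling the model; the parameter names (`task`, `prompt`, `num_inference_steps`, `guidance_scale`, `seed`) are assumptions here — check the model's API tab for the actual names:

```python
# Sketch: build and validate an input payload for this model.
# Parameter names are assumptions; confirm them in the model's API tab.

def build_inputs(task="fill", prompt=None, steps=50, guidance=7.5, seed=None):
    """Validate values against the schema shown above and return the payload."""
    if not (1 <= steps <= 500):
        raise ValueError("number of denoising steps must be in [1, 500]")
    if not (1 <= guidance <= 20):
        raise ValueError("guidance scale must be in [1, 20]")
    payload = {
        "task": task,
        "prompt": prompt or ("The Mona Lisa is wearing a white VR headset "
                             "with 'Omini' written on it."),
        "num_inference_steps": steps,
        "guidance_scale": guidance,
    }
    if seed is not None:  # leave out to randomize the seed
        payload["seed"] = seed
    return payload

# Usage (requires REPLICATE_API_TOKEN and the replicate package):
#   import replicate
#   payload = build_inputs()
#   payload["image"] = open("input.png", "rb")  # required image input
#   output = replicate.run("chenxwh/ominicontrol-spatial", input=payload)
```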

Output


Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

OminiControl: Minimal and Universal Control for Diffusion Transformer

This is the demo for Spatially aligned control. See https://replicate.com/chenxwh/ominicontrol-subject for Subject-driven generation.

Features

OminiControl is a minimal yet powerful universal control framework for Diffusion Transformer models like FLUX.

  • Universal Control 🌐: A unified control framework that supports both subject-driven control and spatial control (such as edge-guided generation and inpainting).

  • Minimal Design 🚀: Injects control signals while preserving original model structure. Only introduces 0.1% additional parameters to the base model.
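The 0.1% figure can be sanity-checked with rough arithmetic. The sketch below uses assumed numbers (a ~12B-parameter base model, rank-8 low-rank adapters on four projections per block); all values are illustrative, not taken from the paper:

```python
# Back-of-envelope check of the "0.1% additional parameters" claim.
# All numbers below are illustrative assumptions, not values from the paper.

base_params = 12e9          # assumed FLUX base model size (~12B parameters)
hidden = 3072               # assumed transformer hidden width
rank = 8                    # assumed low-rank adapter rank
layers = 57                 # assumed number of transformer blocks
adapters_per_layer = 4      # assumed q/k/v/out projections adapted

# Each rank-r adapter on a (hidden x hidden) projection adds 2 * hidden * r params.
extra = layers * adapters_per_layer * 2 * hidden * rank
fraction = extra / base_params
print(f"extra params: {extra/1e6:.1f}M ({fraction:.2%} of base)")
```

Under these assumptions the overhead lands at roughly 0.09% of the base model, the same order as the quoted 0.1%.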

Citation

@article{tan2024omini,
  title={OminiControl: Minimal and Universal Control for Diffusion Transformer},
  author={Zhenxiong Tan and Songhua Liu and Xingyi Yang and Qiaochu Xue and Xinchao Wang},
  journal={arXiv preprint arXiv:2411.15098},
  year={2024}
}