OminiControl - Subject Control for Diffusion Models
A minimal implementation for incorporating subject-specific control into pretrained Diffusion Transformer (DiT) models, focusing on preserving subject identity while generating new views and contexts.
Key Features
- Lightweight control mechanism that adds only about 0.1% more parameters to the base model
- Preserves subject identity and characteristics while allowing flexible pose/scene changes
- Built for DiT-based models (tested on FLUX.1)
- Simple integration using multi-modal attention rather than complex control modules
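The attention-based integration can be pictured with a toy example. The sketch below is illustrative only and is not the repository's code: it assumes the subject (reference) image tokens are simply appended to the text and noisy-image tokens, so that a single attention pass lets every image token look at the subject. The names `joint_attention` and `random_tokens` are hypothetical, and the learned projections, multiple heads, and position encodings of a real DiT block are omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(text_tokens, image_tokens, subject_tokens):
    """Single-head self-attention over one concatenated token sequence.

    Assumption for brevity: queries, keys, and values are the tokens
    themselves (identity projections). A real model learns these.
    """
    tokens = text_tokens + image_tokens + subject_tokens
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for query in tokens:
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        # Weighted sum of all tokens, including the subject tokens.
        outputs.append(
            [sum(w * value[i] for w, value in zip(weights, tokens)) for i in range(dim)]
        )
    return outputs


def random_tokens(rng, count, dim):
    """Hypothetical stand-in for encoder outputs."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
text = random_tokens(rng, 3, 8)      # prompt tokens
image = random_tokens(rng, 4, 8)     # noisy image tokens being denoised
subject = random_tokens(rng, 4, 8)   # tokens of the reference image

out = joint_attention(text, image, subject)
# Only the image slice would be carried forward as the denoised latent.
image_out = out[len(text):len(text) + len(image)]
```

Because the subject tokens take part in the same attention pass, changing the reference image changes `image_out` without any separate control branch.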
Training Data
The model is trained on Subjects200K, a dataset of 200,000+ paired images in which each pair shows the same subject in different contexts. Subject identity stays consistent within a pair, while the following vary:
- Pose/angle
- Lighting conditions
- Background/environment
- Context/scene
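One way to turn such pairs into training examples is sketched below. This is a hedged illustration, not the actual Subjects200K schema or loader: the `SubjectPair` fields, the file names, and the random choice of which image serves as the reference are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectPair:
    """Hypothetical record: two images of the same subject."""
    subject: str
    image_a: str
    image_b: str


def training_examples(pairs, rng):
    """Build (reference, target) examples from paired images.

    Assumption: either image of a pair may act as the reference, so the
    direction is chosen at random for each pair.
    """
    examples = []
    for pair in pairs:
        if rng.random() < 0.5:
            reference, target = pair.image_a, pair.image_b
        else:
            reference, target = pair.image_b, pair.image_a
        examples.append(
            {"subject": pair.subject, "reference": reference, "target": target}
        )
    return examples


# Placeholder file names, not real dataset entries.
pairs = [
    SubjectPair("ceramic mug", "mug_studio.png", "mug_kitchen.png"),
    SubjectPair("toy robot", "robot_desk.png", "robot_garden.png"),
]
examples = training_examples(pairs, random.Random(0))
```

Each example keeps the subject fixed and asks the model to produce the target image given the reference, which matches the pairing described above.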
Limitations
- Works best with clearly defined subjects/objects
- Requires high-quality reference images
- Performance may vary based on subject complexity
Citation
@article{tan2024ominicontrol,
  title={OminiControl: Minimal and Universal Control for Diffusion Transformer},
  author={Tan, Zhenxiong and Liu, Songhua and Yang, Xingyi and Xue, Qiaochu and Wang, Xinchao},
  journal={arXiv preprint arXiv:2411.15098},
  year={2024}
}
For more details on the full OminiControl framework and other control capabilities, please refer to the original paper.