Prompt-free Diffusion

Run time and cost

This model costs approximately $0.069 per run on Replicate, or about 14 runs per $1, though the cost varies with your inputs. It is also open source, and you can run it on your own computer with Docker.

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 6 minutes, though prediction time varies significantly with the inputs.
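
If you run predictions programmatically, the Replicate Python client is the usual entry point. The sketch below is a minimal example under assumptions: the input name `image` is hypothetical (check the model's API schema on Replicate for the real parameters), and a pinned version hash may need to be appended to the model identifier.

```python
# pip install replicate; expects REPLICATE_API_TOKEN in the environment.
import replicate

# NOTE: "image" is an assumed input name; consult the model's API schema
# on Replicate for the actual parameters. A pinned version may be
# required, e.g. "cjwbw/prompt-free-diffusion:<version-hash>".
output = replicate.run(
    "cjwbw/prompt-free-diffusion",
    input={"image": open("reference.png", "rb")},  # visual input instead of a text prompt
)
print(output)  # typically a URL (or list of URLs) for the generated image(s)
```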

Readme

Prompt-Free Diffusion

Introduction

Prompt-Free Diffusion is a diffusion model that relies only on visual inputs to generate new images. It substitutes a Semantic Context Encoder (SeeCoder) for the commonly used CLIP-based text encoder. SeeCoder is reusable with most public T2I models as well as adaptive layers like ControlNet, LoRA, and T2I-Adapter. Just drop it in and play!
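
The drop-in compatibility comes from the interface: SeeCoder emits a sequence of context embeddings with the same shape a CLIP text encoder would feed into a T2I denoiser's cross-attention layers. The PyTorch sketch below illustrates only that interface; the module internals are illustrative stand-ins, not the actual SeeCoder architecture.

```python
# Interface-level sketch: an image encoder producing cross-attention
# context of the same shape a CLIP text encoder would, i.e. (B, <=77, 768).
import torch
import torch.nn as nn

class SeeCoderStub(nn.Module):
    """Stand-in for SeeCoder: maps a reference image to a sequence of
    context embeddings shaped like CLIP text-encoder output."""
    def __init__(self, ctx_len: int = 77, ctx_dim: int = 768):
        super().__init__()
        # A single patchify convolution stands in for the real backbone.
        self.backbone = nn.Conv2d(3, ctx_dim, kernel_size=32, stride=32)
        self.ctx_len = ctx_len

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(image)              # (B, ctx_dim, H/32, W/32)
        feats = feats.flatten(2).transpose(1, 2)  # (B, N, ctx_dim)
        return feats[:, : self.ctx_len]           # (B, <=77, ctx_dim)

# Any denoiser that cross-attends over a (B, seq, 768) context -- e.g. a
# Stable Diffusion UNet, optionally with ControlNet/LoRA/T2I-Adapter on
# top -- can consume these embeddings unchanged.
see_coder = SeeCoderStub()
reference = torch.randn(1, 3, 512, 512)           # the visual "prompt"
context = see_coder(reference)
print(context.shape)                              # torch.Size([1, 77, 768])
```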

Performance

(figure omitted; see the GitHub repository)

Network

(architecture diagram omitted; see the GitHub repository)

Citation

@article{xu2023prompt,
  title={Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models},
  author={Xu, Xingqian and Guo, Jiayi and Wang, Zhangyang and Huang, Gao and Essa, Irfan and Shi, Humphrey},
  journal={arXiv preprint arXiv:2305.16223},
  year={2023}
}

Acknowledgement

Part of the code reorganizes or reimplements code from the following repositories: the official Versatile Diffusion GitHub and the ControlNet sd-webui GitHub, which are in turn greatly influenced by the official LDM GitHub and the official DDPM GitHub.