tencentarc / photomaker-style

Create photos, paintings and avatars for anyone in any style within seconds. (Stylization version)

Run time and cost

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 11 seconds.

Readme

PhotoMaker

PhotoMaker is an image-to-image model for generating images in various styles from human photos. For more information about the model, visit the official project website.

Usage

  • This is the stylization version. For photorealistic generation, use the other (non-stylization) PhotoMaker model.

1️⃣ Upload images of the person you want to customize. One image is enough, but more is better. No face detection is performed, so the face should occupy most of each uploaded image.

2️⃣ Enter a text prompt in which the class word you want to customize is immediately followed by the trigger word img, for example: man img, woman img, or girl img.

3️⃣ Choose your preferred style template.
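The prompt rule in step 2 (a class word followed directly by the trigger word img) can be checked before submitting a prediction. The following is a minimal sketch, not part of PhotoMaker itself; the helper names `build_prompt` and `follows_class_word` are hypothetical and exist only for illustration.

```python
import re

# Trigger word named in the usage steps above.
TRIGGER_WORD = "img"


def build_prompt(class_word: str, before: str = "", after: str = "") -> str:
    """Join optional surrounding text with '<class word> img'.

    Hypothetical helper: it only guarantees that the trigger word
    comes right after the class word.
    """
    parts = [before.strip(), f"{class_word.strip()} {TRIGGER_WORD}", after.strip()]
    return " ".join(p for p in parts if p)


def follows_class_word(prompt: str) -> bool:
    """Return True when 'img' appears as its own token directly after a word."""
    return re.search(rf"\b[A-Za-z]+\s+{TRIGGER_WORD}\b", prompt) is not None


if __name__ == "__main__":
    prompt = build_prompt("woman", before="a portrait of a", after="in a garden")
    print(prompt)
    print(follows_class_word(prompt))
```

A prompt such as "img of a man" fails this check because no class word precedes the trigger word.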

Usage Tips:

  • Upload more photos of the person to be customized to improve ID fidelity.
  • If the input shows an Asian face, consider adding ‘asian’ before the class word, e.g., asian woman img.
  • If the generated face looks too realistic when stylizing, set the Style strength to 30-50. The larger the number, the stronger the stylization but the lower the ID fidelity. You can also try other base models or LoRAs with good stylization effects.
  • For faster generation, reduce the number of generated images and the number of sampling steps. Note that reducing the sampling steps may compromise ID fidelity.
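The trade-offs in these tips can be written down as a small settings adjustment. This is only a sketch under stated assumptions: the dictionary keys (`style_strength`, `num_outputs`, `num_steps`), the step floor of 20, and the choice to halve the step count are all hypothetical and are not taken from the model's actual input schema.

```python
# Range suggested in the tips for faces that look too realistic.
STYLE_STRENGTH_RANGE = (30, 50)
# Hypothetical lower bound on sampling steps; fewer steps may hurt ID fidelity.
MIN_STEPS = 20


def tune(settings: dict, too_realistic: bool = False, faster: bool = False) -> dict:
    """Return an adjusted copy of `settings`; the input dict is left untouched.

    All keys are illustrative names, not the model's real parameters.
    """
    tuned = dict(settings)
    if too_realistic:
        low, high = STYLE_STRENGTH_RANGE
        # Move style strength into the suggested 30-50 range.
        tuned["style_strength"] = min(max(tuned["style_strength"], low), high)
    if faster:
        # Fewer images and fewer steps run faster, at some cost in ID fidelity.
        tuned["num_outputs"] = 1
        tuned["num_steps"] = max(MIN_STEPS, tuned["num_steps"] // 2)
    return tuned


if __name__ == "__main__":
    base = {"style_strength": 20, "num_outputs": 4, "num_steps": 50}
    print(tune(base, too_realistic=True, faster=True))
```

Treat the numbers as starting points to tune by eye, since higher style strength and fewer steps both reduce ID fidelity.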

BibTeX

If you find PhotoMaker useful for your research and applications, please cite using this BibTeX:

@article{li2023photomaker,
  title={PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding},
  author={Li, Zhen and Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Cheng, Ming-Ming and Shan, Ying},
  journal={arXiv preprint arXiv:2312.04461},
  year={2023}
}