tencentarc / photomaker

Create photos, paintings and avatars for anyone in any style within seconds.

  • Public
  • 2.4M runs



Run time and cost

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 26 seconds.



PhotoMaker is an image-to-image model for generating images in various styles from human photos. For more information about the model, visit the official project website.

Usage Tips:

  • The face in the uploaded image should occupy the majority of the image.
  • Upload more photos of the person to be customized to improve ID fidelity.
  • When you enter a text prompt, place the trigger word “img” immediately after the class word you want to customize, such as “man img”, “woman img”, or “girl img”. If the input photos show an Asian face, consider adding “asian” before the class word, e.g., “asian woman img”.
  • If the generated face looks too realistic when stylizing, adjust the Style strength to 30-50. The larger the number, the lower the ID fidelity, but the stronger the stylization. You could also try other base models or LoRAs with good stylization effects.
  • For faster generation, reduce the number of generated images and sampling steps. Note, however, that reducing the sampling steps may compromise ID fidelity.
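
The tips above can be turned into a small pre-flight check before a request is sent. The following is a minimal sketch in Python, not the model's official client code: the helper names (has_trigger_after_class_word, build_input) and the dictionary keys (prompt, input_images, style_strength, num_outputs, num_steps) are illustrative assumptions and may not match the model's actual input schema, so check the model's API page for the real parameter names and allowed ranges.

```python
import re

# Trigger word taken from the usage tips; it must follow a class word.
TRIGGER_PATTERN = re.compile(r"\b[A-Za-z]+\s+img\b")


def has_trigger_after_class_word(prompt: str) -> bool:
    """Return True if 'img' appears as its own word right after another word."""
    return TRIGGER_PATTERN.search(prompt) is not None


def build_input(prompt, image_urls, style_strength=20, num_outputs=1, num_steps=20):
    """Assemble a request payload after checking it against the usage tips.

    The key names below are hypothetical placeholders, not a confirmed schema.
    """
    if not has_trigger_after_class_word(prompt):
        raise ValueError('prompt needs a class word followed by "img", e.g. "man img"')
    if not image_urls:
        raise ValueError("provide at least one photo of the person")
    return {
        "prompt": prompt,
        "input_images": list(image_urls),  # more photos tend to improve ID fidelity
        "style_strength": style_strength,  # the tips suggest 30-50 when stylizing
        "num_outputs": num_outputs,        # fewer images means faster runs
        "num_steps": num_steps,            # fewer steps is faster but may hurt ID fidelity
    }


payload = build_input(
    "a sketch of an asian woman img, pencil drawing",
    ["https://example.com/face1.jpg", "https://example.com/face2.jpg"],
    style_strength=40,
)
```

Once the keys have been renamed to match the model's real schema, a payload like this could be passed as the input argument of the Replicate Python client's replicate.run call.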


If you find PhotoMaker useful for your research and applications, please cite using this BibTeX:

@article{li2023photomaker,
  title={PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding},
  author={Li, Zhen and Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Cheng, Ming-Ming and Shan, Ying},
  booktitle={arXiv preprint arxiv:2312.04461},
  year={2023}
}