microsoft / textdiffuser-2

Academic and Research-only: Unleashing the Power of Language Models for Text Rendering

TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering

TextDiffuser-2 exhibits enhanced capability powered by language models. In addition to generating text with remarkable accuracy, TextDiffuser-2 provides plausible text layouts and demonstrates a diverse range of text styles.

Highlights

  • We propose TextDiffuser-2, which uses two language models, one for layout planning and one for layout encoding, to make text rendering more flexible and diverse.

  • TextDiffuser-2 alleviates several drawbacks of previous methods: (1) limited flexibility and automation, (2) constrained capability of layout prediction, and (3) restricted style diversity.

  • TextDiffuser-2 handles text-to-image, text-to-image with template, and text inpainting tasks. It also introduces an additional feature: generated layouts can be edited in a conversational manner.

  • ✨ We release the demo at link. You are welcome to try it and provide feedback.
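The plan-then-encode flow named in the highlights, together with conversational layout edits, can be pictured with a toy sketch. This is not the TextDiffuser-2 implementation: the two language models are replaced by simple deterministic stand-ins, and every name here (`TextBox`, `plan_layout`, `encode_layout`, `move_box`), the 128-unit canvas, and the `text x0,y0,x1,y1` serialization are assumptions made only for illustration.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class TextBox:
    """One planned text line and its bounding box (hypothetical format)."""
    text: str
    x0: int
    y0: int
    x1: int
    y1: int


def plan_layout(keywords: List[str], canvas: int = 128) -> List[TextBox]:
    """Stand-in for a layout-planning model: stack keywords in equal rows."""
    if not keywords:
        return []
    row_h = canvas // len(keywords)
    margin = canvas // 8
    return [
        TextBox(k, margin, i * row_h + row_h // 4,
                canvas - margin, i * row_h + 3 * row_h // 4)
        for i, k in enumerate(keywords)
    ]


def encode_layout(layout: List[TextBox]) -> List[str]:
    """Stand-in for layout encoding: serialize each box as a text line."""
    return [f"{b.text} {b.x0},{b.y0},{b.x1},{b.y1}" for b in layout]


def move_box(layout: List[TextBox], text: str, dx: int, dy: int,
             canvas: int = 128) -> List[TextBox]:
    """Mimic a conversational edit such as 'move this line up a little'."""
    edited = []
    for b in layout:
        if b.text == text:
            # Clamp the shift so the box stays inside the canvas.
            sx = max(-b.x0, min(dx, canvas - b.x1))
            sy = max(-b.y0, min(dy, canvas - b.y1))
            b = replace(b, x0=b.x0 + sx, y0=b.y0 + sy,
                        x1=b.x1 + sx, y1=b.y1 + sy)
        edited.append(b)
    return edited


layout = plan_layout(["Hello", "World"])
layout = move_box(layout, "World", dx=0, dy=-10)
print(encode_layout(layout))
```

In this sketch an edit only changes the intermediate layout, which is re-encoded afterwards; that separation is what would let a user adjust text placement before any image is generated.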

Acknowledgement

We sincerely thank AK and hysts for helping set up the demo. We are also grateful for the publicly available code, APIs, and demos of SDXL, PixArt, Ideogram, DALLE-3, and GlyphControl.

Disclaimer

Please note that the code is intended for academic and research purposes ONLY. Any use of the code for generating inappropriate content is strictly prohibited. The responsibility for any misuse or inappropriate use of the code lies solely with the users who generated such content, and this code shall not be held liable for any such use.

Contact

For help or issues using TextDiffuser-2, please email Jingye Chen (qwerty.chen@connect.ust.hk) or Yupan Huang (huangyp28@mail2.sysu.edu.cn), or submit a GitHub issue.

For other communications related to TextDiffuser-2, please contact Lei Cui (lecu@microsoft.com) or Furu Wei (fuwei@microsoft.com).

Citation

If you find TextDiffuser-2 useful in your research, please consider citing:

@article{chen2023textdiffuser,
  title={TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering},
  author={Chen, Jingye and Huang, Yupan and Lv, Tengchao and Cui, Lei and Chen, Qifeng and Wei, Furu},
  journal={arXiv preprint arXiv:2311.16465},
  year={2023}
}