chenxwh / meissonic

Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis


Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

🚀 Introduction

Meissonic is a non-autoregressive text-to-image synthesis model based on masked image modeling (MIM). It generates high-resolution images and is designed to run on consumer graphics cards.

Key Features:
- ๐Ÿ–ผ๏ธ High-resolution image generation (up to 1024x1024)
- ๐Ÿ’ป Designed to run on consumer GPUs
- ๐ŸŽจ Versatile applications: text-to-image, image-to-image
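
To give a rough feel for what "non-autoregressive masked image modeling" means in practice, here is a toy sketch of iterative parallel decoding. It is not Meissonic's actual sampler: `toy_predict` is a hypothetical stand-in for the transformer that returns random tokens and confidences, and the cosine unmasking schedule is an assumption borrowed from common masked generative transformer setups. The point is only that many tokens are committed per step instead of one at a time.

```python
import math
import random

MASK = -1  # sentinel for a token position that has not been generated yet


def toy_predict(tokens, vocab_size, rng):
    """Hypothetical stand-in for the transformer.

    Proposes a (token, confidence) pair for every masked position.
    A real model would condition on the prompt and the unmasked tokens.
    """
    return {
        i: (rng.randrange(vocab_size), rng.random())
        for i, t in enumerate(tokens)
        if t == MASK
    }


def masked_decode(num_tokens=16, vocab_size=8, steps=4, seed=0):
    """Fill a fully masked token grid in a fixed number of parallel steps."""
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens
    history = []  # number of masked positions left after each step

    for step in range(1, steps + 1):
        proposals = toy_predict(tokens, vocab_size, rng)

        # Assumed cosine schedule: how many positions stay masked after this step.
        keep_masked = int(num_tokens * math.cos(math.pi / 2 * step / steps))
        if step == steps:
            keep_masked = 0  # the final step commits everything that is left

        # Commit the most confident proposals; the rest stay masked for later steps.
        n_commit = max(0, len(proposals) - keep_masked)
        ranked = sorted(proposals.items(), key=lambda kv: kv[1][1], reverse=True)
        for pos, (tok, _conf) in ranked[:n_commit]:
            tokens[pos] = tok

        history.append(tokens.count(MASK))

    return tokens, history


if __name__ == "__main__":
    tokens, history = masked_decode()
    print(history)  # masked positions remaining after each step
    print(tokens)
```

With the defaults (16 positions, 4 steps) the number of masked positions shrinks as 14, 11, 6, 0, so the whole grid is filled in four passes rather than sixteen sequential ones.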

📚 Citation

If you find this work helpful, please consider citing:

@article{bai2024meissonic,
  title={Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis},
  author={Bai, Jinbin and Ye, Tian and Chow, Wei and Song, Enxin and Chen, Qing-Guo and Li, Xiangtai and Dong, Zhen and Zhu, Lei and Yan, Shuicheng},
  journal={arXiv preprint arXiv:2410.08261},
  year={2024}
}

๐Ÿ™ Acknowledgements

We thank the community and contributors for their invaluable support in developing Meissonic. We thank apolinario@multimodal.art for making the Meissonic demo. We thank @NewGenAI and @飛鷹しずか@自称文系プログラマの勉強 for making YouTube tutorials. We thank @pprp for making the fp8 and int4 quantization. We thank @camenduru for making the Jupyter tutorial.