This is an implementation of PixArt-alpha/PixArt-LCM-XL-2-1024-MS, inspired by the Hugging Face space PixArt-alpha/PixArt-LCM.
About
PixArt-α consists of pure transformer blocks for latent diffusion: it can directly generate 1024px images from text prompts within a single sampling process.
Latent Consistency Models (LCMs) are a diffusion distillation method that predicts the PF-ODE's solution directly in latent space, enabling very fast inference in only a few steps.
Source code of PixArt-LCM is available at https://github.com/PixArt-alpha/PixArt-alpha.
Model Description
- Developed by: PixArt & LCM teams
- Model type: Diffusion-Transformer-based text-to-image generative model
- License: CreativeML Open RAIL++-M License
- Model Description: This is a model that can be used to generate and modify images based on text prompts. It is a Transformer Latent Diffusion Model that uses one fixed, pretrained text encoder (T5) and one latent feature encoder (VAE).
- Resources for more information: the PixArt-α and LCM GitHub repositories, and the PixArt-α and LCM reports on arXiv.