shefa / turbo-enigma

SDXL-based text-to-image model applying Distribution Matching Distillation, supporting zero-shot identity generation in 2-5s. https://ai-visionboard.com

  • Public
  • 3.8M runs

The heart of the AI VisionBoard app.


Implementation of One-step Diffusion with Distribution Matching Distillation.

This model rivals SDXL-Turbo and LCM in efficiency while delivering higher image quality than both. It is also more than 10x faster than PhotoMaker and InstantID when creating images from a single reference photo (e.g. a selfie).

Update 02/2024 - Added zero-shot faceswap with various hacks, achieving a more than 10x speed boost over PhotoMaker and InstantID while preserving quality. The model runs in under 2 seconds on Replicate, where PhotoMaker and InstantID take 20-40 seconds. See the examples for yourself!
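As a quick sanity check on the "more than 10x" figure, the timings quoted above can be turned into a speedup range. This is only arithmetic on the numbers stated on this page (2 seconds taken as the upper bound for this model, 20-40 seconds for the baselines), not a benchmark; the function name is made up for illustration.

```python
def speedup(baseline_seconds: float, model_seconds: float) -> float:
    """Return how many times faster the model is than the baseline."""
    if model_seconds <= 0:
        raise ValueError("model_seconds must be positive")
    return baseline_seconds / model_seconds


# Timings quoted on this page, not measured here.
MODEL_SECONDS = 2.0                  # "under 2 seconds", so an upper bound
BASELINE_SECONDS = (20.0, 40.0)      # PhotoMaker / InstantID range

low, high = (speedup(b, MODEL_SECONDS) for b in BASELINE_SECONDS)
print(f"speedup: at least {low:.0f}x to {high:.0f}x")
```

Because 2 seconds is an upper bound on this model's runtime, the computed values are lower bounds on the speedup.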

Update 01/2024 - Implemented the Fast Diffusion optimization; the model now runs in ~2 seconds on Replicate. Added an NSFW filter input to the API.
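A minimal sketch of calling the model through the Replicate Python client, including the NSFW filter input mentioned above. The input key names `prompt` and `nsfw_filter` are assumptions and may not match the real schema, so check the model's API tab on Replicate before relying on them. The helpers `build_input` and `generate` are hypothetical names used only for this example.

```python
MODEL_REF = "shefa/turbo-enigma"  # owner/name as shown on this page


def build_input(prompt: str, nsfw_filter: bool = True) -> dict:
    """Assemble the request payload.

    Both key names are assumptions; the real input schema may differ.
    """
    return {"prompt": prompt, "nsfw_filter": nsfw_filter}


def generate(prompt: str, nsfw_filter: bool = True):
    """Run one prediction on the latest published version of the model."""
    # Imported lazily so the payload helper works without the client installed.
    # Requires `pip install replicate` and the REPLICATE_API_TOKEN env variable.
    import replicate

    model = replicate.models.get(MODEL_REF)
    version = model.latest_version
    if version is None:
        raise RuntimeError(f"{MODEL_REF} has no published version")
    return replicate.run(
        f"{MODEL_REF}:{version.id}",
        input=build_input(prompt, nsfw_filter),
    )
```

With a valid API token set, `generate("a watercolor lighthouse at dawn")` would submit one request and return whatever output the model produces.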

Advantages:

  • Speed: One of the fastest models in its class, enabling rapid generation of complex images.
  • Quality: Despite its speed, the model maintains a high standard of image quality, producing vivid and detailed visuals out-of-the-box.
  • Versatility: Interprets and visualizes a wide range of textual inputs with accuracy and creativity. The model is built from community merges of the best SDXL checkpoints, made before distillation.

Use Cases:

This model suits scenarios that require quick turnaround without sacrificing image quality, which makes it a good fit for content creators, digital artists, and businesses seeking efficient visual representation of ideas. Video generation is a possible application, as the model can generate ~15 fps at 720p.
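To put the ~15 fps at 720p figure in perspective, the per-frame time budget and pixel throughput can be worked out directly. The sketch below assumes "720p" means 1280x720 and treats 15 fps as exact, so the results are rough estimates rather than measurements.

```python
FPS = 15                    # approximate rate quoted above
WIDTH, HEIGHT = 1280, 720   # assumption: "720p" means 1280x720


def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def pixels_per_second(fps: float, width: int, height: int) -> float:
    """Raw pixel throughput at the given frame rate and resolution."""
    return fps * width * height


print(f"{frame_budget_ms(FPS):.1f} ms per frame")                      # 66.7 ms per frame
print(f"{pixels_per_second(FPS, WIDTH, HEIGHT) / 1e6:.1f} megapixels/s")  # 13.8 megapixels/s
```

In other words, each frame would need to be generated in roughly 67 ms to sustain that rate.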

Future plans:

The model is still very much a work in progress. Different post-training improvements (merges) are being tested to enhance both speed and quality.

Comparison vs SDXL-Turbo / LCM / Midjourney

TODO(shefa): Coming soon.