shefa/turbo-enigma

An SDXL-based text-to-image model applying Distribution Matching Distillation, supporting zero-shot identity generation in 2-5 seconds. https://ai-visionboard.com


Run time and cost

This model runs on Nvidia A40 GPU hardware. Predictions typically complete within 6 seconds.
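For reference, here is a minimal sketch of invoking the model with the Replicate Python client. The input field names are assumptions for illustration, not the model's documented schema; check the model's API tab for the actual inputs.

```python
# Minimal sketch using the Replicate Python client (pip install replicate).
# Requires REPLICATE_API_TOKEN in the environment. The field names below
# ("prompt", "width", "height") are assumed for illustration only.
import replicate

output = replicate.run(
    "shefa/turbo-enigma",   # pin a specific version hash in production
    input={
        "prompt": "a watercolor vision board of a mountain cabin",
        "width": 1024,      # assumed input; SDXL-native resolution
        "height": 1024,
    },
)
print(output)  # typically a URL (or list of URLs) to the generated image(s)
```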

Readme

The heart of the AI VisionBoard app: https://ai-visionboard.com

A work-in-progress implementation of One-step Diffusion with Distribution Matching Distillation (DMD).
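Since this readme does not include training details, the snippet below is only a conceptual sketch of the DMD generator objective as described in the DMD paper. The helper names (`generator`, `real_score`, `fake_score`) and the normalization are illustrative assumptions, not this model's actual training code.

```python
# Conceptual sketch of the Distribution Matching Distillation (DMD) gradient.
# Assumes `real_score` (frozen teacher diffusion model) and `fake_score`
# (a diffusion model continually fine-tuned on generator outputs) both
# predict the denoised image x0 at noise level t.
import torch

def dmd_generator_loss(generator, real_score, fake_score, z, t, alpha_t, sigma_t):
    x = generator(z)                      # one-step generation from noise z
    noise = torch.randn_like(x)
    x_t = alpha_t * x + sigma_t * noise   # diffuse the fake sample to level t

    with torch.no_grad():
        mu_real = real_score(x_t, t)      # teacher's denoised estimate
        mu_fake = fake_score(x_t, t)      # fake-distribution denoised estimate
        # KL gradient direction: minimizing the surrogate below pushes fakes
        # toward regions the teacher scores as more likely than the fake
        # distribution does.
        grad = mu_fake - mu_real
        grad = grad / grad.abs().mean().clamp(min=1e-8)   # crude normalization

    # Surrogate loss whose gradient w.r.t. the generator matches `grad`.
    return 0.5 * torch.nn.functional.mse_loss(x, (x - grad).detach())
```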

This model rivals the efficiency of SDXL-Turbo and LCM while delivering higher image quality than either.

Update 02/2024 - Added zero-shot faceswap with various hacks, achieving a more than 10x speed boost over PhotoMaker or InstantID while preserving quality.

Update 01/2024 - Implemented Fast Diffusion optimizations; the model now runs in ~2 seconds on Replicate. Added an NSFW filter input to the API.
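The exact input names for the faceswap reference image and the NSFW filter are not listed in this readme, so the sketch below uses hypothetical field names (`face_image`, `nsfw_filter`); the model's API schema on Replicate lists the real ones.

```python
# Hypothetical sketch combining the 02/2024 zero-shot faceswap and the
# 01/2024 NSFW filter input. "face_image" and "nsfw_filter" are assumed
# field names, used here only to show the shape of such a request.
import replicate

output = replicate.run(
    "shefa/turbo-enigma",
    input={
        "prompt": "portrait of a person as an astronaut, studio lighting",
        "face_image": open("reference_face.jpg", "rb"),  # identity reference (assumed name)
        "nsfw_filter": True,                             # NSFW filter toggle (assumed name)
    },
)
print(output)
```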

Advantages:

  • Speed: One of the fastest models in its class, enabling rapid generation of complex images.
  • Quality: Despite its speed, the model maintains a high standard of image quality, producing vivid and detailed visuals out-of-the-box.
  • Versatility: Capable of interpreting and visualizing a wide range of textual inputs with accuracy and creativity, thanks to community merges of the best SDXL checkpoints prior to distillation.

Use Cases:

This model is perfect for scenarios requiring quick turnaround without sacrificing image quality, making it ideal for content creators, digital artists, and businesses seeking efficient visual representation of ideas. It could also be applied to video generation, as the model can generate roughly 15 fps at 720p.
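As a rough, assumption-heavy illustration of that throughput estimate, the sketch below times repeated one-step generations against a hypothetical local pipeline handle (`pipe`); the loading code and call signature are not part of this readme.

```python
# Rough throughput sketch for the ~15 fps @ 720p estimate. Assumes a local
# one-step SDXL pipeline handle `pipe` (e.g. a diffusers-style pipeline
# called with num_inference_steps=1); loading it is out of scope here.
import time

def estimate_fps(pipe, prompt, n_frames=30):
    start = time.perf_counter()
    for _ in range(n_frames):
        # One-step sampling; width/height chosen to approximate 720p.
        pipe(prompt, num_inference_steps=1, width=1280, height=720)
    elapsed = time.perf_counter() - start
    return n_frames / elapsed
```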

Future plans:

The model is still very much a work in progress; different post-training improvements (merges) are being tested to enhance both speed and quality.

Comparison vs SDXL-Turbo / LCM / Midjourney

TODO(shefa): Coming soon.