Generate images from a text prompt

Run time and cost

Predictions run on NVIDIA A100 GPU hardware and typically complete within 68 seconds, although the predict time for this model varies significantly with the inputs.


Our logo was generated with DALL·E mini using the prompt "logo of an armchair in the shape of an avocado".

Original DALL·E from "Zero-Shot Text-to-Image Generation" with image quantization from "Learning Transferable Visual Models From Natural Language Supervision".

Image encoder from "Taming Transformers for High-Resolution Image Synthesis".

Sequence-to-sequence model based on "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension", with implementations of a few variants.

Main optimizer (Distributed Shampoo) from "Scalable Second Order Optimization for Deep Learning".