Run time and cost

This model runs on NVIDIA T4 GPU hardware. Predictions typically complete within 7 minutes, but prediction time varies significantly with the inputs.


Model description

This model was created by a team of engineers at ML6. It is based on minDALL-E, which combines a VQGAN and a GPT-like transformer of 1.3 billion parameters. The version implemented here generates three batches of forty images (120 candidates in total), which are then sorted using OpenAI CLIP.

For more details on the model, check out this blog post.
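
The generate-then-sort step described above can be illustrated with a minimal sketch. It assumes the sorting is done by cosine similarity between a prompt embedding and each candidate image embedding, which is how CLIP is commonly used for reranking; the source does not confirm the exact scoring used here. The function names and the toy 3-dimensional vectors are hypothetical stand-ins for real CLIP outputs, not part of this model's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(text_embedding, image_embeddings):
    """Return candidate indices ordered from best to worst match."""
    scores = [cosine_similarity(text_embedding, e) for e in image_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy embeddings (assumption): a real pipeline would obtain these from
# CLIP's text and image encoders for the prompt and the 120 candidates.
text_embedding = [1.0, 0.0, 0.0]
image_embeddings = [
    [0.0, 1.0, 0.0],  # unrelated to the prompt
    [0.9, 0.1, 0.0],  # close match
    [0.5, 0.5, 0.0],  # partial match
]

ranking = rank_candidates(text_embedding, image_embeddings)
print(ranking)
```

In a full pipeline the same ranking function would be applied to the pooled candidates from all three batches, and the top-ranked images would be returned first.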

Intended use

Ethical considerations

Caveats and recommendations

