Laionide (version 3)
Direct comparison to OpenAI’s model using COCO captions
Shout out to stability.ai for donating the compute to LAION that made this possible.
Files:
- laionide-v3-base.pt
Inference:
- replicate
- colab
- locally (see the sketch below)
Results:
- comparison to OpenAI (W&B report)
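To run inference locally, a minimal sketch follows. It assumes OpenAI's glide-text2im package is installed and loads laionide-v3-base.pt as a drop-in replacement for the base model's state dict; the prompt, sampling options, and checkpoint path are illustrative assumptions, not part of this release.

```python
# Minimal local-inference sketch (assumes glide-text2im is installed and
# laionide-v3-base.pt has been downloaded to the working directory).
import torch as th
from glide_text2im.model_creation import (
    create_model_and_diffusion,
    model_and_diffusion_defaults,
)

device = th.device("cuda" if th.cuda.is_available() else "cpu")

# Build the 64x64 base model with default GLIDE options and load the
# laionide-v3 weights in place of the original OpenAI checkpoint.
options = model_and_diffusion_defaults()
options["use_fp16"] = device.type == "cuda"
options["timestep_respacing"] = "100"  # 100 diffusion steps for faster sampling
model, diffusion = create_model_and_diffusion(**options)
model.load_state_dict(th.load("laionide-v3-base.pt", map_location="cpu"))
model.eval().to(device)
if options["use_fp16"]:
    model.convert_to_fp16()

# Encode the prompt plus an empty (unconditional) prompt for classifier-free guidance.
prompt, batch_size, guidance_scale = "a red sports car, royalty free", 1, 3.0
tokens = model.tokenizer.encode(prompt)
tokens, mask = model.tokenizer.padded_tokens_and_mask(tokens, options["text_ctx"])
uncond, uncond_mask = model.tokenizer.padded_tokens_and_mask([], options["text_ctx"])
model_kwargs = dict(
    tokens=th.tensor([tokens] * batch_size + [uncond] * batch_size, device=device),
    mask=th.tensor(
        [mask] * batch_size + [uncond_mask] * batch_size, dtype=th.bool, device=device
    ),
)

def model_fn(x_t, ts, **kwargs):
    # Classifier-free guidance: run conditional and unconditional halves together,
    # then mix the predicted noise using the guidance scale.
    half = x_t[: len(x_t) // 2]
    combined = th.cat([half, half], dim=0)
    out = model(combined, ts, **kwargs)
    eps, rest = out[:, :3], out[:, 3:]
    cond_eps, uncond_eps = th.split(eps, len(eps) // 2, dim=0)
    half_eps = uncond_eps + guidance_scale * (cond_eps - uncond_eps)
    return th.cat([th.cat([half_eps, half_eps], dim=0), rest], dim=1)

samples = diffusion.p_sample_loop(
    model_fn,
    (batch_size * 2, 3, options["image_size"], options["image_size"]),
    device=device,
    clip_denoised=True,
    progress=True,
    model_kwargs=model_kwargs,
    cond_fn=None,
)[:batch_size]  # 64x64 tensors in [-1, 1]
```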
Notes:
- You can use laionide-v2-sr.pt to upscale the outputs from laionide-v3-base.pt (a sketch follows this list).
- There are watermarks in some outputs. You can try to prompt-engineer this away, but it isn't always possible; adding "royalty free" to the prompt seems to work well.
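As noted in the first bullet, laionide-v2-sr.pt can upscale the 64x64 base outputs to 256x256. The sketch below continues from the local-inference sketch above (it reuses `samples`, `prompt`, `batch_size`, and `device` from there) and again assumes glide-text2im's upsampler helpers; the `fast27` schedule, noise temperature, and checkpoint path are assumptions.

```python
# Upsampling sketch: 64x64 base samples -> 256x256, assuming glide-text2im
# and a local copy of laionide-v2-sr.pt. `samples`, `prompt`, `batch_size`,
# and `device` carry over from the base-model sketch.
import torch as th
from glide_text2im.model_creation import (
    create_model_and_diffusion,
    model_and_diffusion_defaults_upsampler,
)

options_up = model_and_diffusion_defaults_upsampler()
options_up["use_fp16"] = device.type == "cuda"
options_up["timestep_respacing"] = "fast27"  # short 27-step schedule for upsampling
model_up, diffusion_up = create_model_and_diffusion(**options_up)
model_up.load_state_dict(th.load("laionide-v2-sr.pt", map_location="cpu"))
model_up.eval().to(device)
if options_up["use_fp16"]:
    model_up.convert_to_fp16()

# Condition the upsampler on both the text prompt and the low-resolution samples.
tokens_up = model_up.tokenizer.encode(prompt)
tokens_up, mask_up = model_up.tokenizer.padded_tokens_and_mask(tokens_up, options_up["text_ctx"])
model_kwargs_up = dict(
    low_res=((samples + 1) * 127.5).round() / 127.5 - 1,  # re-quantize like training data
    tokens=th.tensor([tokens_up] * batch_size, device=device),
    mask=th.tensor([mask_up] * batch_size, dtype=th.bool, device=device),
)

up_shape = (batch_size, 3, options_up["image_size"], options_up["image_size"])
up_samples = diffusion_up.ddim_sample_loop(
    model_up,
    up_shape,
    noise=th.randn(up_shape, device=device) * 0.997,  # slightly reduced noise temperature
    device=device,
    clip_denoised=True,
    progress=True,
    model_kwargs=model_kwargs_up,
    cond_fn=None,
)[:batch_size]  # 256x256 tensors in [-1, 1]
```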
Training details:
- Finetuned laionide-v2-base.pt for 9 epochs on a subset of CC12M (~1.5 million pairs), COCO (~100K pairs), Visual Genome (~100K pairs), and Open Images localized annotations (~800K pairs).
- Replaced 20% of captions with the unconditional/empty token sequence, per the GLIDE paper (to support classifier-free guidance).
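For illustration, a minimal sketch of that 20% caption dropout as it might appear in a training dataloader; the helper is hypothetical and the tokenizer interface (glide-text2im style encode / padded_tokens_and_mask) is an assumption, with only the 0.2 probability coming from the note above.

```python
import random
import torch as th

UNCOND_PROB = 0.2  # fraction of captions dropped to the empty/unconditional sequence

def tokenize_with_uncond_dropout(tokenizer, caption, text_ctx):
    # Hypothetical dataloader helper: with probability 0.2 the caption is replaced
    # by the empty token sequence, which is what makes classifier-free guidance
    # possible at sampling time. `tokenizer` is assumed to expose glide-text2im's
    # encode() / padded_tokens_and_mask() interface.
    tokens = [] if random.random() < UNCOND_PROB else tokenizer.encode(caption)
    tokens, mask = tokenizer.padded_tokens_and_mask(tokens, text_ctx)
    return th.tensor(tokens), th.tensor(mask, dtype=th.bool)
```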