A suite of models for evaluating the quality of images generated by text-to-image models with respect to their input prompts.
GitHub repo: https://github.com/thu-nics/FlashEval
Models
CLIP
- Measures text-image alignment using CLIP, i.e. how well the image matches the given text prompt.
- Link
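As a rough illustration of what an alignment score of this kind computes, the sketch below takes the cosine similarity between an image embedding and a text embedding. The vectors are made-up placeholders and `cosine_similarity` is a hypothetical helper. In practice the embeddings would come from an image encoder and a text encoder such as CLIP's, and this is not FlashEval's actual code.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Placeholder embeddings (assumption: real ones come from image/text encoders).
image_embedding = [0.2, 0.7, 0.1]
text_embedding = [0.25, 0.65, 0.05]
unrelated_text_embedding = [0.9, -0.3, 0.4]

matched = cosine_similarity(image_embedding, text_embedding)
mismatched = cosine_similarity(image_embedding, unrelated_text_embedding)
print(f"matched={matched:.3f} mismatched={mismatched:.3f}")
```

A higher value means the two embeddings point in a more similar direction, which is read as better agreement between image and prompt.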
BLIP
- Measures text-image alignment using BLIP, i.e. how well the image matches the text description.
- Link
Aesthetic
- Assesses the aesthetic quality of an image, predicting how visually appealing it is to humans.
- Link
ImageReward
- A human preference reward model, built on a BLIP backbone and trained on human comparisons of generated images, that predicts which images humans would prefer for the given prompt.
- Link
PickScore
- A human preference model, based on CLIP and fine-tuned on the Pick-a-Pic dataset of user preferences, that predicts which images humans would prefer for the given prompt.
- Link
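Preference models such as ImageReward and PickScore output a scalar score per prompt-image pair. One common way to turn such scores into a preference among candidate images is a softmax over the scores. The sketch below assumes that approach; the scores are invented placeholders, `preference_probabilities` is a hypothetical helper, and none of this is the models' actual inference code.

```python
import math
from typing import List, Sequence


def preference_probabilities(scores: Sequence[float]) -> List[float]:
    """Softmax over per-image scores, made stable by subtracting the max."""
    if not scores:
        raise ValueError("need at least one score")
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Placeholder scores for two candidate images of one prompt (assumption).
scores = [1.3, 0.4]
probs = preference_probabilities(scores)

# Index of the image the scores favour.
preferred = max(range(len(probs)), key=probs.__getitem__)
print(probs, preferred)
```

The probabilities sum to one, so the same helper works for comparing more than two candidate images for a prompt.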