andreasjansson / flash-eval

A suite of models to evaluate the image quality of text-to-image models with respect to their input prompts.

  • Public
  • 1.8K runs
  • GitHub
  • Paper

Run time and cost

This model costs approximately $0.37 to run on Replicate, or 2 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 5 minutes. The predict time for this model varies significantly based on the inputs.

Readme

A suite of models to evaluate the image quality of text-to-image models with respect to their input prompts.

GitHub repo: https://github.com/thu-nics/FlashEval

Models

CLIP

  • Measures text-image alignment using CLIP, scoring how well the image matches the given text prompt.
  • Link
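At its core, a CLIP-based alignment score is the cosine similarity between the L2-normalized text and image embeddings. A minimal sketch with toy vectors standing in for real CLIP outputs (the embeddings here are made up for illustration):

```python
import numpy as np

def clip_score(text_emb: np.ndarray, image_emb: np.ndarray) -> float:
    """Cosine similarity between L2-normalized text and image embeddings,
    the core quantity behind CLIP-style alignment scores."""
    t = text_emb / np.linalg.norm(text_emb)
    i = image_emb / np.linalg.norm(image_emb)
    return float(np.dot(t, i))

# Toy 3-d embeddings; real CLIP embeddings are 512- or 768-dimensional.
text_emb = np.array([0.2, 0.9, 0.1])
image_emb = np.array([0.25, 0.85, 0.05])
score = clip_score(text_emb, image_emb)  # close to 1.0: vectors nearly aligned
```

Higher values mean the image and prompt embeddings point in similar directions, i.e. better text-image alignment.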

BLIP

  • Evaluates text-to-image alignment using BLIP, assessing how well the image matches the text description.
  • Link

Aesthetic

  • Assesses the aesthetic quality of an image, predicting how visually appealing it is to humans.
  • Link
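Aesthetic predictors in this family (e.g. the LAION aesthetic predictor) are typically a small linear or MLP head on top of a CLIP image embedding, regressing a scalar rating. A sketch with hypothetical weights (the embedding, weights, and bias below are placeholders, not the real trained parameters):

```python
import numpy as np

def aesthetic_score(image_emb: np.ndarray, w: np.ndarray, b: float) -> float:
    """Linear head over an L2-normalized image embedding, producing a
    scalar aesthetic rating (commonly on a roughly 1-10 scale)."""
    emb = image_emb / np.linalg.norm(image_emb)
    return float(emb @ w + b)

rng = np.random.default_rng(0)
emb = rng.normal(size=8)        # stands in for a CLIP image embedding
w = rng.normal(size=8) * 0.1    # hypothetical learned weights
score = aesthetic_score(emb, w, b=5.0)
```

The head is cheap to evaluate; the cost of the real predictor is dominated by the CLIP image encoder that produces the embedding.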

ImageReward

  • A human preference model trained on human ratings and rankings of generated images; it predicts which images humans would prefer for a given prompt.
  • Link

PickScore

  • A human preference model trained on the Pick-a-Pic dataset of user preferences over generated images; it predicts which images humans would prefer for a given prompt.
  • Link
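Preference models such as ImageReward and PickScore output a scalar score per (prompt, image) pair. A common way to turn two such scores into the probability that image A is preferred over image B is a softmax over the pair (Bradley-Terry style); a minimal sketch:

```python
import math

def preference_probability(score_a: float, score_b: float) -> float:
    """Softmax over two scalar preference scores: the probability that
    image A is preferred over image B for the same prompt."""
    ea, eb = math.exp(score_a), math.exp(score_b)
    return ea / (ea + eb)

p = preference_probability(1.2, 0.4)  # A scores higher, so p > 0.5
```

Equal scores give 0.5 by symmetry, and the two orderings always sum to 1.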