afiaka87 / sd-aesthetic-guidance

Use Stable Diffusion and aesthetic CLIP embeddings to guide boring outputs toward more aesthetically pleasing results.

Run time and cost

This model runs on Nvidia A100 (40GB) GPU hardware. Predictions typically complete within 12 seconds, though predict time varies significantly with the inputs.

Readme

This method works by averaging pre-computed “aesthetically pleasing” CLIP embeddings with the prompt embedding used for classifier-free guidance.
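
One way to picture the averaging step: the prompt embedding that conditions the denoiser is blended with an aesthetic embedding before guidance is applied. A minimal sketch, assuming a simple weighted average (the function and argument names are illustrative, not the model's actual code):

import numpy as np

def blend_conditioning(text_embedding, aesthetic_embedding, aesthetic_weight):
    # aesthetic_weight=0.0 leaves the prompt embedding untouched;
    # higher values pull the conditioning toward the aesthetic embedding.
    return (1.0 - aesthetic_weight) * text_embedding + aesthetic_weight * aesthetic_embedding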

These embeddings were also used to create the LAION-Aesthetics dataset used to train Stable Diffusion. You can find saved .npy files and more information here:

https://github.com/LAION-AI/aesthetic-predictor/tree/main/vit_l_14_embeddings
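
The saved files are plain NumPy arrays, so they are easy to inspect. A minimal loading sketch, assuming a file named rating8.npy (the filename is a guess from the repository layout; check the link above for the actual names):

import numpy as np

# "rating8.npy" is an assumed filename -- see the repository above for the real layout.
aesthetic_embedding = np.load("rating8.npy")
print(aesthetic_embedding.shape)  # should be a CLIP ViT-L/14 embedding (768 dimensions)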

Example API Usage:

python3 -m pip install replicate

import replicate

# The replicate client reads your API token from the REPLICATE_API_TOKEN environment variable.
sd_aesthetic_model = replicate.models.get("afiaka87/sd-aesthetic-guidance")

Test various ratings and weights for aesthetic guidance:

seed = 42  # fix the seed so only the aesthetic parameters change between runs
for aesthetic_rating in range(5, 9):  # ratings 5 through 8
    for aesthetic_weight in [0.0, 0.1, 0.2, 0.3, 0.4]:  # 4 ratings x 5 weights = 20 runs
        predictions = sd_aesthetic_model.predict(
            prompt="an oil painting of Mark Hamill, digital art",
            aesthetic_rating=aesthetic_rating,
            aesthetic_weight=aesthetic_weight,
            seed=seed,
        )
        print(predictions)  # should be a list; you may want to download/display any image URLs
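
To save the returned images locally, something like the following works with a predictions list from the loop above (requests is an extra dependency; the filename scheme is hypothetical):

import requests

for i, url in enumerate(predictions):
    image_bytes = requests.get(url).content  # fetch the generated image
    with open(f"prediction_{i}.png", "wb") as f:
        f.write(image_bytes)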