dhanushreddy291 / amused-text-to-image

aMUSEd is a lightweight text-to-image model based on the MUSE architecture. It is particularly useful in applications that need a small, fast model, such as generating many images at once.

  • Public
  • 196 runs
  • L40S

Input

  • Input prompt (string). Default: "a cute minimalistic simple capybara side profile, in the style of Jon Klassen, desaturated light and airy pastel color palette, nursery art, white background"
  • Input negative prompt (string). Default: "3d, cgi, render, bad quality, normal quality"
  • Number of images to output (integer, minimum: 1, maximum: 4). Default: 1
  • Guidance scale (number). Default: 10
  • Number of inference steps (integer, minimum: 10, maximum: 50). Default: 30
  • Random seed (integer). Leave blank to randomize the seed.
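
The inputs above can be collected into a single payload before a prediction is requested. The following is a minimal sketch, not the model's actual schema: the page does not show the field names, so the keys used here (prompt, negative_prompt, num_outputs, guidance_scale, num_inference_steps, seed) and the helper build_inputs are assumptions made for illustration. Only the defaults and the minimum/maximum limits are taken from the list above.

```python
# Hypothetical helper: the key names below are assumed, not taken from the page.
DEFAULT_PROMPT = (
    "a cute minimalistic simple capybara side profile, in the style of Jon Klassen, "
    "desaturated light and airy pastel color palette, nursery art, white background"
)
DEFAULT_NEGATIVE_PROMPT = "3d, cgi, render, bad quality, normal quality"


def build_inputs(
    prompt=DEFAULT_PROMPT,
    negative_prompt=DEFAULT_NEGATIVE_PROMPT,
    num_outputs=1,
    guidance_scale=10.0,
    num_inference_steps=30,
    seed=None,
):
    """Validate values against the documented limits and return a payload dict."""
    if not 1 <= num_outputs <= 4:
        raise ValueError("number of images must be between 1 and 4")
    if not 10 <= num_inference_steps <= 50:
        raise ValueError("number of inference steps must be between 10 and 50")

    inputs = {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "num_outputs": num_outputs,
        "guidance_scale": float(guidance_scale),
        "num_inference_steps": num_inference_steps,
    }
    # Leaving the seed out mirrors "leave blank to randomize the seed".
    if seed is not None:
        inputs["seed"] = int(seed)
    return inputs
```

Calling build_inputs() with no arguments reproduces the documented defaults, and an out-of-range value such as num_inference_steps=5 is rejected locally before any request is made.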

Run time and cost

This model costs approximately $0.0030 to run on Replicate, or 333 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 4 seconds. The predict time for this model varies significantly based on the inputs.
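
The figures above can be turned into a rough budget estimate. This is a back-of-the-envelope sketch that assumes every run costs the quoted approximate price and takes the quoted typical 4 seconds, with runs executed one after another; actual values vary with the inputs. The function names are illustrative.

```python
from decimal import Decimal

# Approximate figures quoted on the page; real cost and time depend on inputs.
COST_PER_RUN_USD = Decimal("0.0030")
TYPICAL_SECONDS_PER_RUN = 4


def runs_per_dollar(cost_per_run=COST_PER_RUN_USD):
    """Whole number of runs that fit in one dollar at the given price."""
    return int(Decimal(1) / cost_per_run)


def estimate_batch(num_runs):
    """Return (approximate cost in USD, approximate sequential seconds)."""
    cost = COST_PER_RUN_USD * num_runs
    seconds = TYPICAL_SECONDS_PER_RUN * num_runs
    return cost, seconds
```

For example, runs_per_dollar() gives 333, matching the figure quoted above, and estimate_batch(1000) comes to about $3 and roughly 4000 seconds if the runs are sequential.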

Readme

aMUSEd is based on masked image modeling (MIM). It gives the community a compelling way to explore, in the context of image generation, components that are known to work in language modeling.
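
To give a feel for what masked image modeling looks like at sampling time, here is a toy sketch of iterative parallel decoding over a grid of image tokens. It is not aMUSEd's implementation: the cosine unmasking schedule and confidence-based reveal are assumptions borrowed from how MaskGIT-style models are commonly described, and toy_predict is a random stand-in for the real transformer.

```python
import math
import random

MASK = -1  # sentinel marking a token position that is still masked


def toy_predict(tokens, vocab_size, rng):
    """Random stand-in for the model: a (token, confidence) guess per position."""
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def iterative_decode(num_tokens=16, vocab_size=8, steps=4, seed=0):
    """Start fully masked and reveal the most confident tokens step by step."""
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens
    for step in range(1, steps + 1):
        # Assumed cosine schedule: how many tokens remain masked after this step.
        ratio = math.cos(math.pi / 2 * step / steps)
        keep_masked = max(0, math.floor(num_tokens * ratio))

        preds = toy_predict(tokens, vocab_size, rng)
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Most confident predictions are committed first.
        masked.sort(key=lambda i: preds[i][1], reverse=True)

        n_reveal = max(0, len(masked) - keep_masked)
        for i in masked[:n_reveal]:
            tokens[i] = preds[i][0]
    return tokens
```

With the default settings the loop reveals a few tokens early and more later, and no position is left masked after the final step. In a real model the committed tokens would then be decoded back to pixels by an image tokenizer.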

Learn more at: https://huggingface.co/blog/amused