adirik / leditsplusplus

LEdits++ for image editing

  • Public
  • 497 runs
  • GitHub
  • Paper
  • License

Input

Output

Run time and cost

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 28 seconds. The predict time for this model varies significantly based on the inputs.

Readme

LEdits++

LEdits++ is a textual image editing method for Stable Diffusion XL and variants. See the paper, Hugging Face demo and docs for details.

How to use the API

To edit an image with LEdits++, upload an image and specify the objects you would like to remove or add as a comma separated string. In order to edit an object in place (e.g. changing an apple to an orange), both nouns need to included in the editing_prompts as removal and addition targets respectively. The full list of API arguments are as follows:

  • image: Input image to edit.
  • num_inversion_steps: Number of image inversion steps to retrieve the image latent code.
  • negative_prompt: Negative prompt for the first text encoder to guide the image generation. optional, defaults to None.
  • source_prompt: Prompt describing the input image that will be used for guidance during inversion. Guidance is disabled if the source_prompt is "".
  • source_guidance_scale: Strength of guidance during inversion.
  • skip: Portion of initial steps that will be ignored for inversion and subsequent generation. Lower values will lead to stronger changes to the input image.
  • negative_prompt2: Negative prompt for the second text encoder to guide the image generation. optional, defaults to None if negative_prompt is also left empty, alternatively defaults to negative_prompt otherwise.
  • editing_prompts: Comma separated objects to add, remove or edit. Defaults to None, which inverts and reconstructs the input image.
  • reverse_editing_directions: Comma separated True or False boolean values indicating whether the corresponding prompt in editing_prompts should be increased or decreased to add, remove or edit. optional, defaults to False.
  • edit_guidance_scale: Comma separated float values for each change specified in editing prompts list. optional, defaults to 5 if left empty.
  • edit_warmup_steps: Number of diffusion steps (for each prompt) for which guidance is not applied.
  • edit_threshold: Comma separated edit threshold float values for each editing prompt, threshold values should be proportional to the image region that is modified. optional, defaults to 0.9 if left empty.

LEdits++ only supports DPMSolverMultistepScheduler (default) and DDIMScheduler. Images that are larger than 1024x1024 are downsampled while preserving the aspect ratio.

References

@article{Brack2023LEDITSLI,
  title={LEDITS++: Limitless Image Editing using Text-to-Image Models},
  author={Manuel Brack and Felix Friedrich and Katharina Kornmeier and Linoy Tsaban and Patrick Schramowski and Kristian Kersting and Apolin'ario Passos},
  journal={ArXiv},
  year={2023},
  volume={abs/2311.16711},
  url={https://api.semanticscholar.org/CorpusID:265466786}
}