cuuupid / idm-vton

Best-in-class clothing virtual try on in the wild (non-commercial use only)

  • Public
  • 660.8K runs
  • A100 (80GB)

Input

  • garm_img (file, required): Garment image. It should match the category and can be a product image or even a photo of someone.
  • string: Description of the garment, e.g. Short Sleeve Round Neck T-shirt.
  • human_img (file, required): Image of the model. If it is not 3:4, check crop.
  • string: Category of garment. Default: "upper_body"
  • integer (minimum: 1, maximum: 40). Default: 30
  • Further inputs: mask_img and 4 more.
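The 3:4 note on human_img can be checked before uploading. The following is a minimal sketch, not the model's own crop logic (which this page does not describe): it assumes 3:4 means width:height and that a centered crop is acceptable. The function names are hypothetical.

```python
from typing import Tuple

# Assumption: "3:4" is read as width:height, as for a portrait photo.
TARGET_W, TARGET_H = 3, 4


def is_three_by_four(width: int, height: int) -> bool:
    """Return True when width:height is exactly 3:4 (integer cross-multiplication)."""
    return width * TARGET_H == height * TARGET_W


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered 3:4 box."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Largest integer k such that 3k fits the width and 4k fits the height.
    k = min(width // TARGET_W, height // TARGET_H)
    if k == 0:
        raise ValueError("image is too small for a 3:4 crop")
    crop_w, crop_h = TARGET_W * k, TARGET_H * k
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h
```

The returned tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so it can be passed to that method directly if Pillow is used for preprocessing.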

Output

output

This example was created by a different version, cuuupid/idm-vton:e3893af4.

Run time and cost

This model costs approximately $0.024 to run on Replicate, or 41 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
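The two figures above are consistent with each other, as a short calculation shows. This sketch takes the quoted $0.024 as a fixed per-run price, which is an approximation since the page says the cost varies with inputs. The function names are hypothetical.

```python
from decimal import Decimal

# Approximate cost per run in USD, as quoted on this page.
COST_PER_RUN = Decimal("0.024")


def runs_per_dollar(cost_per_run: Decimal = COST_PER_RUN) -> int:
    """Whole number of runs that fit in one dollar."""
    return int(Decimal(1) // cost_per_run)


def estimated_cost(runs: int, cost_per_run: Decimal = COST_PER_RUN) -> Decimal:
    """Estimated total in USD for a given number of runs."""
    return cost_per_run * runs
```

For example, `runs_per_dollar()` gives 41, and `estimated_cost(1000)` comes to about $24.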

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 18 seconds.
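A prediction can also be started from Python. The sketch below assumes the third-party `replicate` client package is installed and that `REPLICATE_API_TOKEN` is set in the environment. It passes only the two required inputs named on this page, because the names of the other inputs are not shown here. The full version identifier is left as a caller-supplied argument, and `build_input` and `run_tryon` are hypothetical helper names.

```python
from typing import Any, BinaryIO, Dict, Union

# An image may be given as a URL string or as an open binary file handle.
ImageInput = Union[str, BinaryIO]


def build_input(garm_img: ImageInput, human_img: ImageInput) -> Dict[str, Any]:
    """Build the payload from the two required inputs named on this page."""
    return {"garm_img": garm_img, "human_img": human_img}


def run_tryon(model_ref: str, garm_img: ImageInput, human_img: ImageInput) -> Any:
    """Run one prediction.

    model_ref is an "owner/name:version" string copied from the model page;
    it is not hard-coded here because this page shows only a shortened version id.
    """
    # Imported lazily so build_input can be used without the client installed.
    import replicate

    return replicate.run(model_ref, input=build_input(garm_img, human_img))
```

Keeping payload construction separate from the network call makes it possible to check the inputs without spending a run.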

Readme

Non-Commercial use only!

This is the current best-in-class virtual try-on model, created by the Korea Advanced Institute of Science & Technology (KAIST). It handles virtual try-on “in the wild”, which has been notoriously difficult for generative models to tackle.

IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild

This is an official implementation of the paper ‘Improving Diffusion Models for Authentic Virtual Try-on in the Wild’.


TODO LIST

  • [x] demo model
  • [x] inference code
  • [ ] training code

Acknowledgements

For the demo, the automatic mask generation code is based on OOTDiffusion and DCI-VTON.
Parts of the code are based on IP-Adapter.

Citation

@article{choi2024improving,
  title={Improving Diffusion Models for Virtual Try-on},
  author={Choi, Yisol and Kwak, Sangkyung and Lee, Kyungmin and Choi, Hyungwon and Shin, Jinwoo},
  journal={arXiv preprint arXiv:2403.05139},
  year={2024}
}

License

The code and checkpoints in this repository are released under the CC BY-NC-SA 4.0 license.