Readme
Dataset
The dataset used for training is composed of 80 images with a resolution of 512x512 pixels.
Caption method
Initially, the captions were created using taggui: a first pass of auto-generated captions was produced with florence-large-ft, and each caption was then manually reviewed.
However, the best results were obtained from a training run without captions. After the first version, all subsequent training runs were done without captions.
Training method
The LoRA was trained using ai-toolkit on Replicate. Training ran for 6000 steps with a learning rate of 0.0004 and a network dimension of 32.
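To put the step count in perspective, the short sketch below estimates how many times each image is seen during training, given the 80-image dataset and 6000 steps stated above. The batch size is not given in this README, so the value of 1 used here is an assumption, and `passes_over_dataset` is a hypothetical helper written only for this illustration.

```python
def passes_over_dataset(steps: int, num_images: int, batch_size: int = 1) -> float:
    """Approximate number of full passes over the dataset.

    batch_size defaults to 1, which is an assumption: the README
    does not state the batch size used for training.
    """
    return steps * batch_size / num_images


# 6000 steps over 80 images, assuming one image per step
passes = passes_over_dataset(steps=6000, num_images=80)
print(passes)
```

Under that assumption each image would be seen about 75 times. A larger batch size would raise the figure proportionally.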
Recommended settings
Experimenting with the model showed that the following settings give good results:
- prompt: including “SLOW3D” gives stronger results; follow it with any description you prefer (the model was trained on images of women, so the best results are achieved when generating women)
- lora_scale: 0.8
- num_inference_steps: 28
- model: dev
- guidance_scale: 3.5 or 2.5
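
As a minimal sketch, the recommended settings can be collected into a single input dictionary. The key names mirror the list above. `build_input` and the example description are hypothetical and exist only for illustration, and placing the trigger word at the start of the prompt is one reading of "follow with any description".

```python
TRIGGER_WORD = "SLOW3D"


def build_input(description: str, guidance_scale: float = 3.5) -> dict:
    """Assemble the recommended settings into one dictionary.

    The trigger word is placed first, followed by the description.
    guidance_scale defaults to 3.5; 2.5 is the other recommended value.
    """
    return {
        "prompt": f"{TRIGGER_WORD} {description}",
        "lora_scale": 0.8,
        "num_inference_steps": 28,
        "model": "dev",
        "guidance_scale": guidance_scale,
    }


# Example description is a placeholder, not taken from the training data
settings = build_input("portrait of a woman in a garden")
print(settings)
```

With the Replicate Python client, a dictionary like this would typically be passed as the `input` argument of `replicate.run`, together with the model's own identifier, which is not given here.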