edenartlab/sdxl-lora-trainer:4cd07dd8
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
Field | Type | Default value | Description |
---|---|---|---|
name | string | unnamed | Name of the new LoRA concept |
lora_training_urls | string | | Training images for the new LoRA concept (image URLs or a .zip file of images) |
concept_mode | string | object | One of 'face', 'style', or 'object' (default) |
seed | integer | | Random seed for reproducible training. Leave empty to use a random seed |
resolution | integer | 960 | Square pixel resolution to which your images will be resized for training; recommended range is 768-1024 |
train_batch_size | integer | 4 | Batch size (per device) for training |
num_train_epochs | integer | 10000 | Number of epochs to loop through your training dataset |
max_train_steps | integer | 600 | Number of individual training steps. Takes precedence over num_train_epochs |
checkpointing_steps | integer | 10000 | Number of steps between saving checkpoints. Set to a very high number to disable checkpointing if you do not need checkpoints |
is_lora | boolean | True | Whether to use LoRA training. If set to False, full fine-tuning is used |
prodigy_d_coef | number | 0.8 | Multiplier for the internal learning rate of the Prodigy optimizer |
ti_lr | number | 0.001 | Learning rate for training textual inversion embeddings. Don't alter unless you know what you're doing |
ti_weight_decay | number | 0.0003 | Weight decay for textual inversion embeddings. Don't alter unless you know what you're doing |
lora_weight_decay | number | 0.002 | Weight decay for LoRA parameters. Don't alter unless you know what you're doing |
l1_penalty | number | 0.1 | Sparsity penalty for the LoRA matrices; increases merge-ability and possibly generalization |
lora_param_scaler | number | 0.5 | Multiplier for the starting weights of the LoRA matrices |
snr_gamma | number | 5 | See https://arxiv.org/pdf/2303.09556.pdf. Set to None to disable SNR training |
lora_rank | integer | 12 | Rank of LoRA embeddings. For faces 5 is good; for complex concepts or styles you can try 8 or 12 |
caption_prefix | string | | Prefix text prepended to automatic captioning. Must contain 'TOK', for example 'a photo of TOK, '. If empty, ChatGPT will take care of this automatically |
left_right_flip_augmentation | boolean | True | Add a left-right flipped version of each image to the training data; recommended for most cases. If you are learning a face, you probably want to disable this |
augment_imgs_up_to_n | integer | 20 | Apply data augmentation (no left-right flipping) until there are n training samples (0 disables augmentation completely) |
n_tokens | integer | 2 | How many new tokens to inject per concept |
mask_target_prompts | string | | Prompt that describes the most important part of the image; used for CLIP segmentation. For example, if you are learning a person, 'face' would be a good segmentation prompt |
crop_based_on_salience | boolean | True | If you want to crop the image to `target_size` based on the important parts of the image, set this to True. If you want to crop the image based on face detection, set this to False |
use_face_detection_instead | boolean | False | Whether to use face detection instead of CLIPSeg for masking. For face applications, we recommend using this option |
clipseg_temperature | number | 0.6 | How blurry you want the CLIPSeg mask to be. We recommend a value between `0.5` and `1.0`. For a sharper (but more error-prone) mask, decrease this value |
verbose | boolean | True | Verbose output |
run_name | string | 1717022065 | Subdirectory where all files will be saved |
debug | boolean | False | For local debugging only (don't activate this on Replicate) |
hard_pivot | boolean | False | Use hard freeze for ti_lr. If set to False, a soft transition of learning rates is used |
off_ratio_power | number | 0.1 | How strongly to correct the embedding std vs the avg-std (0=off, 0.05=weak, 0.1=standard) |
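As a rough illustration of how these fields combine into a request, the sketch below builds an input payload that overrides a few defaults and checks two constraints stated in the table: the allowed `concept_mode` values, and that a non-empty `caption_prefix` contains `TOK`. The helper names `build_input` and `start_training` and the example URL are hypothetical. The submission step assumes the official `replicate` Python client with a `REPLICATE_API_TOKEN` set in the environment, and the version identifier is copied as displayed on this page, so the full version hash may be required in practice.

```python
from typing import Any, Dict

# Model reference as displayed on this page (the hash may be abbreviated).
MODEL_REF = "edenartlab/sdxl-lora-trainer:4cd07dd8"

ALLOWED_CONCEPT_MODES = {"face", "style", "object"}


def build_input(name: str, lora_training_urls: str, concept_mode: str = "object",
                caption_prefix: str = "", **overrides: Any) -> Dict[str, Any]:
    """Assemble an input payload; fields left out fall back to the model defaults."""
    if concept_mode not in ALLOWED_CONCEPT_MODES:
        raise ValueError(f"concept_mode must be one of {sorted(ALLOWED_CONCEPT_MODES)}")
    if caption_prefix and "TOK" not in caption_prefix:
        raise ValueError("caption_prefix must contain 'TOK'")
    payload: Dict[str, Any] = {
        "name": name,
        "lora_training_urls": lora_training_urls,
        "concept_mode": concept_mode,
    }
    if caption_prefix:
        payload["caption_prefix"] = caption_prefix
    payload.update(overrides)  # any other field from the table, e.g. lora_rank
    return payload


def start_training(payload: Dict[str, Any]):
    """Submit the payload with the replicate client (needs network and an API token)."""
    import replicate  # imported lazily so build_input works without the client installed
    return replicate.run(MODEL_REF, input=payload)


example = build_input(
    name="my_face_concept",
    lora_training_urls="https://example.com/training_images.zip",  # placeholder URL
    concept_mode="face",
    caption_prefix="a photo of TOK, ",
    lora_rank=5,                         # the table suggests 5 for faces
    left_right_flip_augmentation=False,  # the table suggests disabling this for faces
    use_face_detection_instead=True,     # recommended in the table for face applications
)
```

Fields that are not passed, such as `seed` or `max_train_steps`, are simply absent from the payload, so the defaults listed in the table apply.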
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'items': {'properties': {'attributes': {'title': 'Attributes', 'type': 'object'},
                          'files': {'default': [],
                                    'items': {'format': 'uri', 'type': 'string'},
                                    'title': 'Files',
                                    'type': 'array'},
                          'isFinal': {'default': False, 'title': 'Isfinal', 'type': 'boolean'},
                          'name': {'title': 'Name', 'type': 'string'},
                          'progress': {'title': 'Progress', 'type': 'number'},
                          'thumbnails': {'default': [],
                                         'items': {'format': 'uri', 'type': 'string'},
                                         'title': 'Thumbnails',
                                         'type': 'array'}},
           'title': 'CogOutput',
           'type': 'object'},
 'title': 'Output',
 'type': 'array',
 'x-cog-array-type': 'iterator'}
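Because the output is an iterator of `CogOutput` objects, a client will typically see several progress updates before the final item. Below is a minimal sketch of picking the final result from such a stream, assuming each item arrives as a plain dict with the keys from the schema. The helper name `final_output` and the sample data are made up for illustration.

```python
from typing import Any, Dict, Iterable, Optional


def final_output(items: Iterable[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the last item flagged isFinal, or None if the stream had none."""
    result = None
    for item in items:
        # isFinal defaults to False in the schema, so a missing key counts as False.
        if item.get("isFinal", False):
            result = item
    return result


# Made-up stream shaped like the schema above.
sample_stream = [
    {"name": "my_concept", "progress": 0.5},
    {"name": "my_concept", "progress": 1.0, "isFinal": True,
     "files": ["https://example.com/my_concept.tar"]},
]

done = final_output(sample_stream)
```

The `files` and `thumbnails` lists default to empty, so reading them with `item.get("files", [])` avoids a `KeyError` on intermediate updates.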