mgonline2021/test-tile:ff5ca118

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
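
The fallback rule can be pictured as a dictionary merge. The following is a minimal sketch, not the platform's implementation: `DEFAULTS` and `resolve_inputs` are hypothetical names, the example URL is made up, and only a handful of default values from the schema table are copied in.

```python
# Hypothetical sketch of "omitted fields fall back to their defaults".
# DEFAULTS copies a few default values from the input schema table.
DEFAULTS = {
    "resolution": 768,
    "train_batch_size": 4,
    "max_train_steps": 1000,
    "is_lora": True,
    "token_string": "TOK",
    "caption_prefix": "a photo of TOK, ",
}

# Fields that have no default in the table but are still valid inputs.
NO_DEFAULT_FIELDS = {"input_images", "seed"}


def resolve_inputs(user_inputs: dict) -> dict:
    """Return the defaults overlaid with the fields the caller supplied."""
    unknown = set(user_inputs) - set(DEFAULTS) - NO_DEFAULT_FIELDS
    if unknown:
        raise KeyError(f"fields not covered by this sketch: {sorted(unknown)}")
    return {**DEFAULTS, **user_inputs}


resolved = resolve_inputs({
    "input_images": "https://example.com/images.zip",  # made-up URL
    "max_train_steps": 500,
})
print(resolved["max_train_steps"])  # value supplied by the caller
print(resolved["resolution"])       # value taken from DEFAULTS
```

Values passed by the caller win over the defaults, and every field left out keeps the default listed in the table.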
| Field | Type | Default value | Description |
|---|---|---|---|
| input_images | string | | A .zip or .tar file containing the image files that will be used for fine-tuning |
| seed | integer | | Random seed for reproducible training. Leave empty to use a random seed |
| resolution | integer | 768 | Square pixel resolution to which your images will be resized for training |
| train_batch_size | integer | 4 | Batch size (per device) for training |
| num_train_epochs | integer | 4000 | Number of epochs to loop through your training dataset |
| max_train_steps | integer | 1000 | Number of individual training steps. Takes precedence over num_train_epochs |
| is_lora | boolean | True | Whether to use LoRA training. If set to False, full fine-tuning is used |
| unet_learning_rate | number | 0.000001 | Learning rate for the U-Net. We recommend a value between `1e-6` and `1e-5`. |
| ti_lr | number | 0.0003 | Scaling of learning rate for training textual inversion embeddings. Don't alter unless you know what you're doing. |
| lora_lr | number | 0.0001 | Scaling of learning rate for training LoRA embeddings. Don't alter unless you know what you're doing. |
| lora_rank | integer | 32 | Rank of LoRA embeddings. Don't alter unless you know what you're doing. |
| lr_scheduler | None | constant | Learning rate scheduler to use for training |
| lr_warmup_steps | integer | 100 | Number of warmup steps for learning rate schedulers that use warmup. |
| token_string | string | TOK | A unique string that will be trained to refer to the concept in the input images. Can be anything, but TOK works well |
| caption_prefix | string | `a photo of TOK, ` | Text that will be used as a prefix during automatic captioning. Must contain the `token_string`. For example, if the caption text is 'a photo of TOK', automatic captioning will expand it to 'a photo of TOK under a bridge', 'a photo of TOK holding a cup', etc. |
| mask_target_prompts | string | | Prompt that describes the part of the image that you consider important. For example, if you are fine-tuning on photos of your pet, `photo of a dog` will be a good prompt. Prompt-based masking is used to focus the fine-tuning process on the important/salient parts of the image |
| crop_based_on_salience | boolean | True | If you want to crop the image to `target_size` based on the important parts of the image, set this to True. If you want to crop the image based on face detection, set this to False |
| use_face_detection_instead | boolean | False | Whether to use face detection instead of CLIPSeg for masking. For face applications, we recommend using this option. |
| clipseg_temperature | number | 1 | How blurry you want the CLIPSeg mask to be. We recommend a value between `0.5` and `1.0`. If you want a sharper (but more error-prone) mask, decrease this value. |
| verbose | boolean | True | Verbose output |
| checkpointing_steps | integer | 999999 | Number of steps between saving checkpoints. Set this to a very high number to disable checkpointing, since you don't need checkpoints. |
| input_images_filetype | None | infer | Filetype of the input images. Can be either `zip` or `tar`. The default is `infer`, which infers the type from the extension of the input file. |
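
A few of the fields above interact: `caption_prefix` must contain `token_string`, `max_train_steps` overrides `num_train_epochs`, and `input_images_filetype` can be inferred from the file extension. The sketch below illustrates those three rules on the client side. All function names are hypothetical, the example URL is made up, and the steps-per-epoch formula assumes one optimizer step per batch on a single device with no gradient accumulation, which the table does not confirm.

```python
import math
from pathlib import PurePosixPath
from urllib.parse import urlparse


def check_caption_prefix(caption_prefix: str, token_string: str) -> None:
    """Raise if caption_prefix does not contain token_string (rule from the table)."""
    if token_string not in caption_prefix:
        raise ValueError(f"caption_prefix must contain {token_string!r}")


def infer_filetype(input_images: str) -> str:
    """Guess 'zip' or 'tar' from the extension (assumed meaning of `infer`)."""
    suffix = PurePosixPath(urlparse(input_images).path).suffix.lower()
    if suffix in (".zip", ".tar"):
        return suffix[1:]
    raise ValueError(f"cannot infer filetype from {input_images!r}")


def effective_steps(num_images, train_batch_size, num_train_epochs,
                    max_train_steps=None):
    """Return the number of training steps, honoring max_train_steps first."""
    if max_train_steps is not None:
        return max_train_steps
    # Assumption: one step per batch, single device, no gradient accumulation.
    steps_per_epoch = math.ceil(num_images / train_batch_size)
    return steps_per_epoch * num_train_epochs


check_caption_prefix("a photo of TOK, ", "TOK")  # passes silently
print(infer_filetype("https://example.com/data/images.zip"))  # made-up URL
print(effective_steps(20, 4, num_train_epochs=4000, max_train_steps=1000))
print(effective_steps(20, 4, num_train_epochs=10))
```

With the table's defaults, `max_train_steps` (1000) is set, so the epoch count never determines the length of the run.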
            
              
                
              
            
Output schema

The shape of the response you’ll get when you run this model with an API.

Schema

{'format': 'uri', 'title': 'Output', 'type': 'string'}
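
Since the schema describes the output as a single string in `uri` format, a caller may want a light sanity check before using the value. This is a rough sketch only: `looks_like_uri` is a hypothetical helper, the example URL is made up, and the check is far looser than full URI validation.

```python
from urllib.parse import urlparse


def looks_like_uri(value) -> bool:
    """Loose check that value is a string with a scheme and some location."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


# Made-up example values, not real model output.
print(looks_like_uri("https://example.com/trained_model.tar"))
print(looks_like_uri("not a uri"))
```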