Dataset Preparation Guide
Image Editing Training (AI Toolkit)
This trainer endpoint fine-tunes an image editing diffusion model (e.g. inpainting, instruction-based editing) using ai-toolkit in the background.
Image editing training requires paired data: an original image, an edited target image, and optionally a mask.
1. Dataset Format (Required)
Your dataset must be a single folder containing image pairs. Each edit example is identified by a shared base filename.
Folder Structure
dataset/
├── edit_001_input.png
├── edit_001_target.png
├── edit_001.txt
├── edit_002_input.jpg
├── edit_002_target.jpg
├── edit_002.txt
Optional (for inpainting / masked editing)
dataset/
├── edit_003_input.png
├── edit_003_target.png
├── edit_003_mask.png
├── edit_003.txt
Naming Rules
- _input → original image
- _target → edited result (what the model should produce)
- _mask → optional binary mask (white = editable area)
- .txt → edit instruction
All files must share the same base name (e.g. edit_003).
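The naming rules above can be checked locally before uploading. The following is a minimal sketch, not the trainer's own validation: the helper names `group_examples` and `find_incomplete` are hypothetical, and treating the .txt instruction as required for every example is an assumption based on the folder layouts shown above.

```python
from collections import defaultdict
from pathlib import PurePath

# Suffix roles and image extensions taken from this guide.
ROLES = ("input", "target", "mask")
IMAGE_EXTS = {".png", ".jpg", ".webp"}


def group_examples(filenames):
    """Group filenames by shared base name into {base: {role: filename}}."""
    examples = defaultdict(dict)
    for name in filenames:
        path = PurePath(name)
        stem, ext = path.stem, path.suffix.lower()
        if ext == ".txt":
            # The instruction file carries the bare base name, e.g. edit_001.txt
            examples[stem]["instruction"] = name
            continue
        if ext not in IMAGE_EXTS:
            continue
        # Split "edit_001_input" into base "edit_001" and role "input".
        base, _, role = stem.rpartition("_")
        if base and role in ROLES:
            examples[base][role] = name
    return dict(examples)


def find_incomplete(examples):
    """Return {base: [missing roles]} for examples lacking a required file."""
    # Assumption: input, target, and instruction are required; mask is optional.
    required = ("input", "target", "instruction")
    problems = {}
    for base, files in examples.items():
        missing = [role for role in required if role not in files]
        if missing:
            problems[base] = missing
    return problems
```

To run it against a real folder, pass the file names from the directory, for example `group_examples(p.name for p in pathlib.Path("dataset").iterdir())`, and inspect the result of `find_incomplete`.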
2. Instruction Files (.txt)
Each .txt file contains the editing instruction describing how to transform the input image into the target image.
Example
edit_001.txt
replace the background with a snowy mountain landscape
Instruction Guidelines
- Describe only the change
- Do not restate the full image description
- Write instructions as imperative edits
- One instruction per file (multi-line allowed)
3. Mask Files (Optional but Recommended)
For tasks such as inpainting, object replacement, and localized edits, you can include a mask image.
Mask Rules
- Same resolution as input image
- White (255) = editable region
- Black (0) = frozen region
- Format: .png recommended
If no mask is provided, the model assumes global editing.
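The binary-mask rule can be illustrated on raw grayscale values. This is a sketch under stated assumptions: the functions `is_binary_mask` and `binarize` are hypothetical helpers, the pixel values are assumed to have been loaded as 8-bit grayscale with an image library of your choice, and the threshold of 128 is an illustrative choice rather than something the trainer prescribes.

```python
def is_binary_mask(pixels):
    """Return True if every grayscale value is exactly 0 or 255."""
    return all(value in (0, 255) for value in pixels)


def binarize(pixels, threshold=128):
    """Map each grayscale value to 0 (frozen) or 255 (editable).

    The threshold is an assumption for illustration; pick one that
    matches how your masks were drawn.
    """
    return [255 if value >= threshold else 0 for value in pixels]
```

Masks with soft or anti-aliased edges would fail `is_binary_mask`; running them through `binarize` before saving is one way to bring them in line with the rules above.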
4. Image Requirements
ai-toolkit automatically handles resizing and bucketing.
Recommended
- Resolution: ≥ 512×512
- Formats: .jpg, .png, .webp
- Input and target images must:
  - have the same resolution
  - be pixel-aligned
  - differ only in the edited regions
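The same-resolution requirement can be checked without decoding the images. The sketch below covers .png files only and relies on the PNG file layout, where the width and height are stored as big-endian 32-bit integers in the IHDR chunk at byte offsets 16 to 24. The function names `png_size` and `same_resolution` are hypothetical, and .jpg or .webp files would need an image library instead.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(header):
    """Return (width, height) parsed from the first 24 bytes of a PNG file."""
    if (
        len(header) < 24
        or header[:8] != PNG_SIGNATURE
        or header[12:16] != b"IHDR"
    ):
        raise ValueError("not a PNG header")
    # Width and height follow the IHDR chunk type as two big-endian uint32s.
    return struct.unpack(">II", header[16:24])


def same_resolution(input_header, target_header):
    """Return True if two PNG headers describe the same width and height."""
    return png_size(input_header) == png_size(target_header)
```

To use it on files, read the first 24 bytes of each image, for example `open(path, "rb").read(24)` inside a `with` block, and pass the two headers to `same_resolution`. This only confirms matching dimensions; pixel alignment still needs a visual check.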
Avoid
- Misaligned input/target pairs
- Style changes unrelated to the instruction
- Multiple edits per example
5. Dataset Size Recommendations
| Use Case | Recommended Pairs |
|---|---|
| Simple edits (background, color) | 100–300 |
| Inpainting / object replacement | 200–500 |
| Instruction-following editing | 500+ |
Editing models require more data than concept fine-tuning.
6. What Not to Include
- Examples missing an input or target image
- Files with incorrect filename suffixes
- Non-binary masks
- Captions instead of edit instructions
- Copyrighted or unlicensed images
7. Uploading the Dataset
Once your dataset folder is ready:
- Upload the dataset folder to the trainer endpoint
- Select the editing mode (global / masked)
- Start training
The trainer validates all image pairs before launching training.
8. Minimal Example Dataset
my_edit_dataset/
├── room_01_input.png
├── room_01_target.png
├── room_01_mask.png
├── room_01.txt → replace the sofa with a wooden table
├── room_02_input.jpg
├── room_02_target.jpg
├── room_02.txt → change the lighting to warm sunset light
Need Help?
If you are unsure whether your dataset is valid or want feedback on edit instructions, reach out before starting training. Most training failures come from dataset issues.