Using synthetic training data to improve Flux finetunes

Posted September 20, 2024

I know, I know. We keep blogging about Flux. But there's a reason: It's really good! People are making so much cool stuff with it, and its capabilities continue to expand as the open-source community experiments with it.

In this post I'll cover some techniques you can use to generate synthetic training data to help improve the accuracy, diversity, and stylistic range of your fine-tuned Flux models.

Getting started

To use the techniques covered in this post, you should have an existing fine-tuned Flux model that needs a little improvement.

If you haven't created your own fine-tuned Flux model yet, check out our guides to fine-tuning Flux on the web or fine-tuning Flux with an API, then come back to this post if you need tips to make it better.

What is synthetic data?

Synthetic data is artificially generated data that mimics real-world data. In the case of image generation models, synthetic data refers to images created by the model, rather than real photographs or human-generated artwork. Using synthetic data can help create more varied and comprehensive training datasets than using real-world images alone.

Tip 1: Generate training data from a single image

The consistent-character model is an image generator from the prolific and inimitable @fofr. It takes a single image of a person as input and produces multiple images of them in a variety of poses, styles, and expressions. Using consistent-character is a great way to jumpstart your Flux fine-tuning, especially if you don't have many training images to start with.

The fofr/consistent-character model produces many images from a single input.

Here's a quick example of how to use consistent-character with the Replicate JavaScript client to generate a batch of training images from a single image input:

import Replicate from "replicate";

const replicate = new Replicate();

// Pin a specific version of fofr/consistent-character
const model = "fofr/consistent-character:9c77a3c2f884193fcee4d89645f02a0b9def9434f9e03cb98460456b831c8772";

const input = {
  prompt: "A closeup headshot photo of a young woman in a grey sweater",
  subject: "https://replicate.delivery/pbxt/L0gy7uyLE5UP0uz12cndDdSOIgw5R3rV5N6G2pbt7kEK9dCr/0_3.webp",
  output_format: "webp",
  output_quality: 80,
  randomise_poses: true,          // vary the pose in each output
  number_of_outputs: 5,
  number_of_images_per_pose: 1,
};

const output = await replicate.run(model, { input });
 
console.log(output);
/*
[
  "https://replicate.delivery/pbxt/tm3Hm7oJsQIWLleQ46JHKDA2XoNzjiJaaFifmK9GQb8jI45SA/ComfyUI_00001_.webp",
  "https://replicate.delivery/pbxt/HlWQ91wZRbrLMdPTndxQIz6pfdJSBXlENTTF8NMPvmKhE8cJA/ComfyUI_00002_.webp",
  "https://replicate.delivery/pbxt/BqsfHIp2r8X9Gy7wVPtmbgAaIcI6ke509VWCYDQKdXceSwzlA/ComfyUI_00003_.webp",
  "https://replicate.delivery/pbxt/S2bsYVLtLtpOJ9ZWdzd9bHbyay8f2opw4JpIocJQIBKCF8cJA/ComfyUI_00004_.webp",
  "https://replicate.delivery/pbxt/0PQLx9Zz5fQkb6ZQmldEB2ElI9e61eYCeqWiaZSbCzUIqgnLB/ComfyUI_00005_.webp"
]
*/
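The output is a list of image URLs. As a minimal sketch of the next step, here's one way you might download those images locally so you can add them to a training set. This assumes Node 18+ (for the built-in fetch) and the output array from the example above; the filenames are arbitrary:

import { writeFile } from "node:fs/promises";

// Save each generated image to disk
await Promise.all(
  output.map(async (url, i) => {
    const res = await fetch(url);
    const buffer = Buffer.from(await res.arrayBuffer());
    await writeFile(`training-image-${i}.webp`, buffer);
  })
);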

Tip 2: Use outputs from your fine-tuned model as training data

Sometimes when you fine-tune Flux, your trained model doesn't consistently produce the quality of images you want. Maybe one in ten of your outputs meets your expectations. Fortunately you can take those good outputs and use them as training data to train an improved version of your model.

The process works like this:

  1. Create a new fine-tune with just a handful of images.
  2. Run your model with an API to generate a large batch of images.
  3. Comb through the generated images and choose the good ones.
  4. Run a new training job using those outputs as training data.

To ease the process of generating lots of images from your model and downloading them to your local machine, you can use a tool like aimg. All you need to run aimg is a Replicate API token and a recent version of Node.js.

Here's a command that will generate 50 images using the exact same prompt each time:

# Create a new token at https://replicate.com/account/api-tokens
export REPLICATE_API_TOKEN=r8_...
 
# Generate 50 images
npx aimg "a photo of ZIKI the man" --model=zeke/ziki-flux --count 50

You can also get more variety in your outputs by using the --subject flag, which auto-generates a unique prompt for each image:

# Generate 50 images, each with a unique prompt
npx aimg --subject "ZIKI the man" --model=zeke/ziki-flux --count 50

Once you've gathered a selection of images that you like, zip them up:

zip training-data.zip *.webp

Then kick off a new training job on the web or via the API.
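If you go the API route, here's a rough sketch using the Replicate JavaScript client. It assumes the ostris/flux-dev-lora-trainer model used for Flux fine-tunes on Replicate; the trainer version id and the zip URL below are placeholders, so look up the latest trainer version on its model page and upload your zip somewhere accessible before running:

import Replicate from "replicate";

const replicate = new Replicate();

// Placeholder -- replace with the latest version id of
// ostris/flux-dev-lora-trainer from its Replicate model page
const trainerVersion = "<latest-trainer-version-id>";

const training = await replicate.trainings.create(
  "ostris",
  "flux-dev-lora-trainer",
  trainerVersion,
  {
    destination: "zeke/ziki-flux", // reuse your existing model as the destination
    input: {
      input_images: "https://example.com/training-data.zip", // placeholder URL to your zip
      trigger_word: "ZIKI",
      steps: 1000,
    },
  }
);

console.log(training.status);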

Note: All Replicate models (including fine-tunes) are versioned, so you can use your existing model as the destination model when starting your second training job, and the training process will automatically create a new version of the model. Your first version will remain intact, and you'll still be able to access it and use it to generate images.
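As a sketch of what that looks like in practice, you can list your model's versions with the API and pin a specific one when generating images. This reuses the client from the earlier examples and assumes your model has at least two versions:

// List the version ids of your model
const versions = await replicate.models.versions.list("zeke", "ziki-flux");
console.log(versions.results.map((v) => v.id));

// Pin an earlier version by id when generating images
const output = await replicate.run(
  `zeke/ziki-flux:${versions.results[1].id}`,
  { input: { prompt: "a photo of ZIKI the man" } }
);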

Tip 3: Combine LoRAs to diversify your training data

You may find that your fine-tuned Flux model only outputs images in a single style, like realistic photographs. Maybe you want images that look like paintings or illustrations, but the model keeps producing photorealistic output no matter how many painting-related keywords you put in your prompt.

A little-known feature of Flux fine-tunes on Replicate is that you can combine multiple LoRA styles in a single output image. LoRA stands for "Low-Rank Adaptation". I won't go into technical detail about how LoRAs work here, but the important thing to know is that it's become an industry term for "trained weights" in the context of fine-tuning image models. When you refer to "a LoRA", you're talking about a specific set of trained weights that get added to the base Flux model to constitute a "fine-tuned model".

"ZIKI the man, illustrated MSMRB style", created by combining the zeke/ziki-flux human face model with the jakedahn/flux-midsummer-blues illustration style model.

Combining LoRAs is a really fun way of generating unique images, but you can also use it as a technique to diversify your training data to create better versions of your own fine-tuned Flux models.

At a high level, the process works like this:

  1. Create a fine-tuned model with whatever training data you have available.
  2. Explore LoRA fine-tunes from the community and pick a few that you like.
  3. Generate images with your fine-tuned model, combining it with the other LoRAs you selected.
  4. Comb through the outputs and select the ones that meet your expectations.
  5. Run a new training job using those outputs as training data.

To find LoRAs to combine with your model, check out the Flux fine-tunes on Replicate and Replicate LoRA fine-tunes on Hugging Face.

To generate images with combined LoRAs using the Replicate API, set the extra_lora and extra_lora_scale input parameters, and be sure to use the trigger words from both models in your prompt.

Here's an example of how to generate images with combined LoRAs using the Replicate JavaScript client:

import Replicate from "replicate";

const replicate = new Replicate();

// Your fine-tuned model, pinned to a specific version
const model = "zeke/ziki-flux:dadc276a9062240e68f110ca06521752f334777a94f031feb0ae78ae3edca58e";

const input = {
  prompt: "ZIKI the man, illustrated MSMRB style", // trigger words from both models
  lora_scale: 1,
  extra_lora: "jakedahn/flux-midsummer-blues",
  extra_lora_scale: 1.22,
  num_outputs: 4,
  aspect_ratio: "1:1",
  guidance_scale: 3.5,
  output_quality: 80,
  prompt_strength: 0.8,
};
const output = await replicate.run(model, { input });
 
console.log(output);

The key things to keep in mind when combining LoRAs are:

  1. Be sure to use the trigger words from both models in your prompt to ensure that the LoRA styles are applied correctly.
  2. The extra_lora parameter should be set to the name of the LoRA you want to combine with your model. You can use the shorthand name of the model, like jakedahn/flux-midsummer-blues, or the full URL to a weights file.
  3. The extra_lora_scale parameter should be set to a value between -1 and 2. The higher the value, the more pronounced the extra LoRA style will be.
  4. Try balancing multiple LoRAs by experimenting with their scales between 0.9 and 1.1, as in the sketch below.
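Here's a minimal sketch of that kind of scale sweep, reusing the model and client from the example above; the scale values are just starting points:

// Sweep a few extra_lora_scale values to compare how strongly
// the second LoRA's style comes through
for (const scale of [0.9, 1.0, 1.1]) {
  const output = await replicate.run(model, {
    input: {
      prompt: "ZIKI the man, illustrated MSMRB style",
      extra_lora: "jakedahn/flux-midsummer-blues",
      extra_lora_scale: scale,
    },
  });
  console.log(`extra_lora_scale=${scale}:`, output);
}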

Have fun and iterate!

Hopefully these training tips will help you get the most out of your fine-tuned Flux model. The key to the fine-tuning process is experimentation and iteration. Try different techniques and see what works best for your use case.

Have fun and share your results with the community on X or Discord.