Last year, DreamBooth was released. It was a way to train Stable Diffusion on your own objects or styles.
A few short months later, Simo Ryu created a new image generation model that applies a technique called LoRA to Stable Diffusion. Similar to DreamBooth, LoRA lets you train Stable Diffusion using just a few images, and it generates new output images with those objects or styles. Unlike DreamBooth, LoRA is fast: while DreamBooth takes around twenty minutes to run and produces models that are several gigabytes, LoRA trains in as little as eight minutes and produces models that are around 5MB.
LoRA stands for Low-Rank Adaptation, a mathematical technique to reduce the number of parameters that are trained. You can think of it like creating a diff of the model, instead of saving the whole thing. LoRA was developed by researchers at Microsoft, and Simo has applied it to Stable Diffusion. Check out the README for Simo's inference model on GitHub and the paper on arXiv to learn more about how it works.
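To make the "diff of the model" intuition concrete, here's a tiny illustrative sketch of the low-rank idea in plain NumPy. This isn't Simo's implementation or the real Stable Diffusion code; the layer width and rank are just example numbers.

```python
import numpy as np

# Illustrative sketch of the low-rank idea behind LoRA (not Simo's code):
# instead of updating the full weight matrix W, train a small low-rank
# "diff" B @ A and add it to the frozen weights.
d, r = 768, 4                      # layer width and LoRA rank (example values)
W = np.zeros((d, d))               # frozen pretrained weights: 768 * 768 = 589,824 params
B = np.random.randn(d, r) * 0.01   # trained low-rank factor B
A = np.random.randn(r, d) * 0.01   # trained low-rank factor A
W_adapted = W + B @ A              # weights actually used at inference time

# Only B and A are trained and saved: 6,144 numbers instead of 589,824.
print(W.size, B.size + A.size)
```

That's why a trained LoRA concept is a few megabytes rather than a full copy of the model.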
We've been collaborating with Simo to get LoRA up and running on Replicate. You can now train LoRA models in the cloud with a single API call. Unlike DreamBooth, where you had to wait for a model to push and boot up, LoRA predictions run instantly with no cold boots.
LoRA has a few differences from DreamBooth that make it especially appealing as an alternative: it trains faster, produces much smaller files, runs predictions instantly with no cold boots, and lets you combine multiple trained concepts in a single image.
🐴 To get an idea of what's possible, check out the LoRA examples page, where you can play around with some of our pretrained concepts like Bob Ross, Pokemon, South Park, Caravaggio, and more.
To train your own reusable LoRA concept, you'll do the following:

1. Create a zip file of training images and host it at a public URL.
2. Run the training model to produce a .safetensors concept file.
3. Pass that file's URL to the prediction model to generate new images.
To train a new LoRA concept, create a zip file with a few images of the same face, object, or style. 5-10 images are enough, but for styles you may get better results if you have 20-100 examples. Many of the recommendations for training DreamBooth also apply to LoRA. The training images can be JPGs or PNGs.
💡 Give your zip file a meaningful name, as it will be included as part of the filename of the trained output. This will make it easier to identify and differentiate from other training outputs later.
LoRA's training model expects your images to be accessible over HTTP at a public URL. You can use a service like Google Drive, Amazon S3, or GitHub Pages to host your zip file.
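If you'd rather script that step, here's a minimal sketch that bundles a folder of images into a zip using Python's standard zipfile module. The folder name and the "bob-ross.zip" filename are hypothetical; use your own.

```python
from pathlib import Path
import zipfile

# Bundle JPG/PNG training images into a zip with a meaningful name.
# "training-images" and "bob-ross.zip" are hypothetical; use your own.
image_dir = Path("training-images")
with zipfile.ZipFile("bob-ross.zip", "w") as archive:
    for image in sorted(image_dir.glob("*")):
        if image.suffix.lower() in {".jpg", ".jpeg", ".png"}:
            archive.write(image, arcname=image.name)
```

You'd then upload the resulting zip to one of the hosting options above so the training model can download it.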
There are two LoRA training models on Replicate. Start by using the lora-training model to train your concept. Here's an example Python script that uses the training model to train a new concept:
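This is a minimal sketch using the Replicate Python client; the model reference replicate/lora-training and the input names instance_data and task are assumptions, so check the training model's page for its exact inputs and latest version.

```python
import replicate

# Kick off a LoRA training run from a zip of training images hosted at a
# public URL. The input names below are assumptions; check the model page.
output = replicate.run(
    "replicate/lora-training",  # you may need to pin a version, e.g. "replicate/lora-training:<version>"
    input={
        "instance_data": "https://example.com/bob-ross.zip",  # hypothetical URL of your hosted zip
        "task": "style",  # what you're training: e.g. a face, object, or style
    },
)

# The output is the HTTPS URL of the trained .safetensors concept file.
print(output)
```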
The output of each training run is a single .safetensors file at an HTTPS URL that we host indefinitely. For example: https://replicate.delivery/pbxt/S8wVSt0vXr5mEFDjP5XkmMPjLPCaDmv1Rw6AzRMDEhoFqqGE/tmp_fs4evyhbob-ross.safetensors
Copy the URL of that trained concept file from your prediction response so you can use it as an input to LoRA's prediction model.
Now that you've got a trained concept, it's time to generate some new images! You can generate an image based on a single trained concept, or use multiple trained concepts together.
The prediction model replicate/lora requires two inputs:

- `prompt`: A prompt that contains the string `<1>` where the trained concept should be, e.g. `an astronaut riding a horse in the style of <1>`. Use `<2>`, `<3>`, and so on if you're passing multiple URLs to the `lora_urls` input.
- `lora_urls`: The URL or URLs of the trained LoRA concept(s) you copied in the previous step. You can pass a single URL, or a list of URLs separated by a pipe character (`|`). Passing multiple URLs allows you to combine multiple concepts into a single image.

You can run LoRA's prediction model from your browser on Replicate.
You can also run LoRA's prediction model using the API. Here's an example Python script that generates a new image:
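This is a minimal sketch using the Replicate Python client. The lora_urls value reuses the example concept URL shown above; substitute your own, and you may need to pin a specific version from the replicate/lora model page.

```python
import replicate

# Generate an image using a trained LoRA concept.
output = replicate.run(
    "replicate/lora",  # you may need to pin a version, e.g. "replicate/lora:<version>"
    input={
        # <1> marks where the trained concept goes; use <2>, <3>, ... for extra concepts
        "prompt": "an astronaut riding a horse in the style of <1>",
        # URL of the trained .safetensors file from the training step;
        # separate multiple URLs with a pipe character (|)
        "lora_urls": "https://replicate.delivery/pbxt/S8wVSt0vXr5mEFDjP5XkmMPjLPCaDmv1Rw6AzRMDEhoFqqGE/tmp_fs4evyhbob-ross.safetensors",
    },
)

# The output contains the URL(s) of the generated image(s).
print(output)
```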
In the next couple of weeks we'll add support for training LoRA on Stable Diffusion 2.1, inpainting, and other cool things. Let us know your ideas!
If you want to share your LoRA models with the community or see what others come up with, join the #lora channel in our Discord.