Stable Diffusion 3 (SD3) just dropped. You can run it in the cloud on Replicate, or you can run it locally with ComfyUI on your own GPU-equipped machine.
Stable Diffusion 3 (SD3) is the latest version of Stability AI's tool that creates images from text. It's faster and makes better images than older versions. SD3 is really good at making complex scenes with lots of details and clear text from the prompt.
ComfyUI is a graphical user interface (GUI) for Stable Diffusion models like SD3. It lets you connect building blocks (called nodes), such as model loaders, text encoders, and samplers, to create custom images, just like connecting Lego blocks. You don't need to know any code to use it. ComfyUI makes it fun and simple to try out different node arrangements (called "workflows") and make unique pictures.
A special thanks to comfyanonymous for creating ComfyUI and to fofr for creating a Cog-compatible version of it.
To follow this guide, you'll need:
Start by installing Cog. It's a tool that sets up everything you need to package and run a machine learning model in a container environment. Open your terminal and run these commands:
sudo curl -o /usr/local/bin/cog -L "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
sudo chmod +x /usr/local/bin/cog
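If you'd like to see what the install command will fetch before running it with sudo, here is a small sketch that prints the URL it expands to on your machine. The function name cog_download_url is made up for this example; the URL pattern is copied from the install command above.

```shell
# Build the Cog release URL for a given OS and CPU architecture.
# With no arguments it falls back to `uname -s` and `uname -m`,
# the same values the install command substitutes.
# (cog_download_url is a hypothetical helper, not part of Cog.)
cog_download_url() {
    os="${1:-$(uname -s)}"
    arch="${2:-$(uname -m)}"
    printf 'https://github.com/replicate/cog/releases/latest/download/cog_%s_%s\n' "$os" "$arch"
}

# Print the URL for this machine without downloading anything.
cog_download_url
```

If the printed URL looks odd for your platform, it's worth checking the Cog releases page before installing.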
Next, clone the Cog-compatible ComfyUI repository from GitHub:
git clone --recurse-submodules git@github.com:fofr/cog-comfyui-sd3.git
cd cog-comfyui-sd3
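The clone command above uses an SSH URL, which only works if your machine has an SSH key registered with GitHub. If it doesn't, the HTTPS form of the same repository address is one alternative. The sketch below shows the conversion; ssh_to_https is a made-up helper name for illustration.

```shell
# Convert git@github.com:owner/repo.git into https://github.com/owner/repo.git
# (ssh_to_https is a hypothetical helper, not a git command.)
ssh_to_https() {
    printf '%s\n' "$1" | sed 's#^git@github\.com:#https://github.com/#'
}

# Show the HTTPS address for the repository used in this guide.
ssh_to_https "git@github.com:fofr/cog-comfyui-sd3.git"
# prints https://github.com/fofr/cog-comfyui-sd3.git
```

You could then pass the printed address to git clone --recurse-submodules in place of the SSH one.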
Start ComfyUI. Run this command to download weights and start the ComfyUI web server:
sudo cog run --use-cog-base-image -p 8188 /bin/bash -c "python scripts/get_weights.py workflow_api.json && cd ComfyUI/ && python main.py --listen 0.0.0.0"
This might take a while, so relax and wait for it to finish.
After ComfyUI starts, open your web browser and go to http://<YOUR-IP>:8188. If you don't know your machine's IP address, run this command in your terminal:
hostname -I
That's it! You should now see the ComfyUI interface ready to use.
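On machines with more than one network interface, hostname -I can print several addresses on one line. Here is a rough sketch that takes the first one and builds the address to open in your browser. The helper name comfyui_url and the sample addresses are invented for this example, and the first address is not guaranteed to be the one you want.

```shell
# Build the ComfyUI URL from the first address in a space-separated list,
# such as the output of `hostname -I`. The default port 8188 matches the
# -p flag in the cog run command above.
# (comfyui_url is a hypothetical helper for illustration.)
comfyui_url() {
    first_ip=$(printf '%s\n' "$1" | awk '{print $1}')
    printf 'http://%s:%s\n' "$first_ip" "${2:-8188}"
}

# Example with made-up addresses; on Linux you could pass "$(hostname -I)" instead.
comfyui_url "192.168.1.20 172.17.0.1"
# prints http://192.168.1.20:8188
```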
Don't worry if you don't have the SD3 workflow file. You can download it from GitHub using this command:
curl -O "https://raw.githubusercontent.com/fofr/cog-comfyui-sd3/main/workflow_api.json"
After downloading the workflow_api.json file, open the ComfyUI GUI, click "Load," and select the workflow_api.json file.
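If the workflow refuses to load, the download may have saved an empty file or an error message instead of JSON. The sketch below is a quick sanity check, not a full JSON validator: it only confirms the file is non-empty and starts with an opening brace. The function name looks_like_json_object is made up for this example.

```shell
# Rough check that a file is non-empty and its first non-whitespace
# character is "{". This does not prove the file is valid JSON.
# (looks_like_json_object is a hypothetical helper for illustration.)
looks_like_json_object() {
    [ -s "$1" ] || return 1
    first_char=$(tr -d ' \t\r\n' < "$1" | cut -c1)
    [ "$first_char" = "{" ]
}

if looks_like_json_object workflow_api.json; then
    echo "workflow_api.json looks like a JSON object"
else
    echo "workflow_api.json is missing, empty, or not JSON; try downloading it again"
fi
```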
Go to the "CLIP Text Encode (Prompt)" node, which will have no text, and type what you want to see. For example, "cat on a fridge".
The "CLIP Text Encode (Negative Prompt)" node will already be filled with a list of things you don't want in the image, but feel free to change it.
Finally, click the "Queue Prompt" button to make your first image. Now you can start creating amazing visuals!