Upscaling images with ControlNet tile

ControlNet tile is a ControlNet model (control_v11f1e_sd15_tile) introduced by Lvmin Zhang (lllyasviel) in May 2023.

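You can run it on Replicate with models such as batouresearch/high-resolution-controlnet-tile and fewjative/ultimate-sd-upscale, both of which are used in the examples below.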

Use ControlNet tile to:

  • upscale images 2x, 4x or 8x
  • add, change, or re-generate image details
  • fix mistakes from upscaling by other models (like blurring or incoherent details)

It is very good at:

  • upscaling medium to large images
  • introducing coherent new details
  • controlling your upscale with a prompt
  • applying an artistic style or direction in the upscale process

However, it:

  • does not handle small images well. Anything below 512x512 should be upscaled with something like Real-ESRGAN or SwinIR first. Ultimate SD Upscale can do this for you.
  • does not remove compression artifacts like JPEG artifacts. Again, use another model first.
  • often loses likenesses in faces. Careful selection of parameters is needed to balance between original image strength and hallucinated details.
  • can be tricky to use, and often requires some experimentation to get the best results
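The small-image limit above can be turned into a quick check before you send anything to the upscaler. This is only a sketch: `needsPreUpscale` and `preUpscaleFactor` are hypothetical helper names, and reading "below 512x512" as "either side shorter than 512 pixels" is our assumption.

```javascript
// Threshold taken from the limitation above: images below 512x512 upscale poorly.
const MIN_SIDE = 512;

// Hypothetical helper: true when the shorter side falls below the threshold.
// Treating "below 512x512" as "either side under 512" is an assumption.
function needsPreUpscale(width, height) {
  return Math.min(width, height) < MIN_SIDE;
}

// Hypothetical helper: smallest whole-number factor that lifts the shorter
// side to at least MIN_SIDE (1 means no pre-upscale is needed).
function preUpscaleFactor(width, height) {
  return Math.max(1, Math.ceil(MIN_SIDE / Math.min(width, height)));
}

console.log(needsPreUpscale(384, 384), preUpscaleFactor(384, 384));
console.log(needsPreUpscale(1024, 1024), preUpscaleFactor(1024, 1024));
```

If the check returns true, run the image through something like Real-ESRGAN or SwinIR at that factor first, then pass the result to ControlNet tile.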

It upscales by hallucinating new details. You can use a text prompt (and a negative prompt) to guide the generation of details towards your desired image.

As the name suggests, it is a tile-based approach. The original image is split into tiles, and each tile is upscaled separately before being recombined. Previous tiling approaches were limited by the way each tile interpreted a given prompt. Consider a ‘photo of a man outside’, split into 9 tiles, where the top left tile is just sky. Early techniques would try to diffuse an image of a man into that tile, and into every other tile, without a wider understanding of the whole image. ControlNet tile is different.

The clever part with ControlNet tile is that despite the tiles being upscaled separately, the prompt is always applied to the whole image.
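To make the tiling step concrete, here is a rough sketch of how an image could be cut into a grid like the 9-tile example above. It is not the ControlNet tile implementation: `splitIntoTiles` is a hypothetical helper, and it ignores the overlap between neighboring tiles that tiled upscalers often add to hide seams.

```javascript
// Hypothetical helper: cut a width x height image into a cols x rows grid
// of rectangles. Overlap between tiles is deliberately left out.
function splitIntoTiles(width, height, cols, rows) {
  const tiles = [];
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      // Integer boundaries, so the tiles cover every pixel exactly once
      const x0 = Math.floor((col * width) / cols);
      const x1 = Math.floor(((col + 1) * width) / cols);
      const y0 = Math.floor((row * height) / rows);
      const y1 = Math.floor(((row + 1) * height) / rows);
      tiles.push({ col, row, x: x0, y: y0, width: x1 - x0, height: y1 - y0 });
    }
  }
  return tiles;
}

// A 1024x1024 image split 3x3; tiles[0] is the top left tile
const tiles = splitIntoTiles(1024, 1024, 3, 3);
console.log(tiles.length, tiles[0]);
```

In the ‘photo of a man outside’ example, `tiles[0]` is the patch of sky. The rectangle alone says nothing about the rest of the picture, which is why older approaches that applied the prompt to each tile in isolation went wrong.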

ControlNet tile is best used with any Stable Diffusion 1.5 fine-tune. RealisticVision V5.1 and above are a good choice for photorealistic upscaling. There isn’t currently a ControlNet tile model for SDXL.

In this example of a woman in a bright outfit, the image is upscaled from 1024x1024 (a standard SDXL sized output) to 2560x2560 (2.5x) in 38 seconds.

You can clearly see the issues ControlNet tile can have with maintaining a likeness here. Not only has this woman’s likeness changed, but so has her ethnicity. You’ll also see that her face is no longer distorted, and that the pose and colors have remained consistent. Small details like the buttons and the shirt pattern are also fixed. Meanwhile, new details have been introduced: the woman is now wearing a necklace, she is wearing less makeup, and the shiny material of the outfit is more pared back.

In another example, we upscale a photo of a cat using the fewjative/ultimate-sd-upscale model. The original image is 1024x1024, and we upscale it 2x to 2048x2048 in 49 seconds. We use a prompt to describe the image, and a negative prompt to list things we don’t want.

See how the detail in the cat’s fur is improved, and AI generation errors like the zip and eyes are fixed.
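The two examples above scale by 2.5x and 2x. A small sketch of that arithmetic, assuming square images (`upscaleSummary` is a hypothetical helper), shows how quickly the pixel count grows with the scale factor:

```javascript
// Hypothetical helper: output side length and pixel-count multiplier
// for a square image upscaled by `factor`.
function upscaleSummary(side, factor) {
  const target = Math.round(side * factor);
  return { target, pixelRatio: (target * target) / (side * side) };
}

console.log(upscaleSummary(1024, 2.5)); // the 1024 to 2560 example
console.log(upscaleSummary(1024, 2));   // the 1024 to 2048 example
```

A 2.5x upscale produces 6.25 times as many pixels as the original, and a 2x upscale produces 4 times as many, so each step up in scale means the model has far more detail to invent.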

Using a ControlNet tile upscaler in the cloud is a great way to improve the quality of images in your website or app, whether they come from a generative AI model or a real-world source.

We’ll use the batouresearch/high-resolution-controlnet-tile model for this example, but the API is similar whichever model you choose.

Start with our official JavaScript client to run the model with Node.js:

npm install replicate
export REPLICATE_API_TOKEN=<your-api-token>

Run batouresearch/high-resolution-controlnet-tile:

// Import the client. Top-level await (used below) needs an ES module:
// save this file as .mjs or set "type": "module" in package.json.
import Replicate from "replicate";

// Set up the client with the API token exported above
const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

// Run the model. If "latest" is not accepted as a version, replace it
// with a version ID copied from the model's page on Replicate.
const output = await replicate.run(
  "batouresearch/high-resolution-controlnet-tile:latest",
  {
    input: {
      hdr: 0.2,
      image: "https://replicate.delivery/pbxt/K5vab12temDjc8jOnWFJ3p4RD7YWPQ3nuCyXpaRmN9yB8M1h/f84e7869-32ca-444b-a720-19e4325f4347.jpeg",
      steps: 20,
      prompt: "a woman wearing a colorful suit",
      scheduler: "DDIM",
      creativity: 0.6,
      guess_mode: false,
      resolution: 2560,
      resemblance: 0.4,
      guidance_scale: 5
    }
  }
);
console.log(output);

You can also run this model using other Replicate client libraries, such as those for Python, Go, Swift, and Elixir.
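Because ControlNet tile often needs some experimentation, especially to keep a likeness in faces, it can help to try a small grid of settings rather than a single run. The sketch below builds one input object per combination of `creativity` and `resemblance`. Those two parameter names come from the example above; `buildSweep` is a hypothetical helper, the image URL is a placeholder, and the values in the grid are guesses to tune, not recommendations.

```javascript
// Hypothetical helper: one input object per (creativity, resemblance) pair.
// The base input is copied, never mutated.
function buildSweep(baseInput, creativities, resemblances) {
  const inputs = [];
  for (const creativity of creativities) {
    for (const resemblance of resemblances) {
      inputs.push({ ...baseInput, creativity, resemblance });
    }
  }
  return inputs;
}

const base = {
  image: "https://example.com/portrait.jpeg", // placeholder URL
  prompt: "a woman wearing a colorful suit",
  resolution: 2560,
};

// 2 x 2 grid of settings; the values are guesses
const sweep = buildSweep(base, [0.3, 0.6], [0.4, 0.8]);
console.log(sweep.length);
console.log(sweep[0]);
```

Each entry can be passed as `input` to `replicate.run` in the same way as the example above, and the outputs compared side by side to find the balance between original image strength and hallucinated detail that suits your image.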