zsxkib / bsrgan

Upscale videos + images with BSRGAN

Input

  • file (required): Input image or video file
  • integer: Upscaling factor (2x or 4x). Default: 4
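
As a rough illustration of the two inputs above, the sketch below builds and validates a request payload before it is sent to the model. The parameter names are not shown on this page, so the keys `input_file` and `scale` are hypothetical placeholders, and `build_input` is an illustrative helper rather than part of any client library.

```python
from pathlib import Path

ALLOWED_SCALES = (2, 4)  # the two factors listed in the Input section


def build_input(file_path, scale=4):
    """Return a payload dict for one prediction.

    The key names ("input_file", "scale") are hypothetical placeholders;
    check the model's published schema for the real ones.
    """
    if scale not in ALLOWED_SCALES:
        raise ValueError(f"scale must be one of {ALLOWED_SCALES}, got {scale}")
    path = Path(file_path)
    if not path.name:
        raise ValueError("file_path must name a file")
    return {"input_file": str(path), "scale": scale}


payload = build_input("holiday.png")
print(payload)  # {'input_file': 'holiday.png', 'scale': 4}
```

Rejecting an unsupported factor locally avoids paying for a prediction that would fail on the server.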

Run time and cost

This model costs approximately $0.044 to run on Replicate, or 22 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
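
The arithmetic behind the "22 runs per $1" figure can be checked with a few lines. This is only an estimate based on the approximate per-run cost quoted above; actual cost depends on the inputs, and the function names are illustrative.

```python
COST_PER_RUN_USD = 0.044  # approximate figure quoted above; varies with inputs


def runs_per_dollar(cost_per_run=COST_PER_RUN_USD):
    """Whole number of runs that fit in one dollar."""
    return int(1 / cost_per_run)


def batch_cost(n_runs, cost_per_run=COST_PER_RUN_USD):
    """Estimated cost in USD for a batch of runs, rounded to cents."""
    return round(n_runs * cost_per_run, 2)


print(runs_per_dollar())  # 22
print(batch_cost(500))    # 22.0
```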

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 46 seconds. The predict time for this model varies significantly based on the inputs.

Readme

BSRGAN: Practical Image and Video Restoration 🖼️🎥

A deep learning model designed to improve low-quality images and videos by addressing real-world degradations like blur, noise, and compression artifacts.

What It Does

BSRGAN helps restore visual content by:

  • Enhancing resolution (2x or 4x upscaling)
  • Reducing blur and noise in old photos
  • Improving compressed/low-quality video frames
  • Handling unknown degradation types automatically

For videos, the model processes each frame individually and then reassembles the upscaled frames into a video.
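
The per-frame approach can be sketched in plain Python. This is a minimal illustration, not the code this model runs: frames are nested lists of pixel values, and `upscale_nearest` is a hypothetical stand-in (nearest-neighbor repetition) for the BSRGAN network.

```python
def upscale_nearest(frame, scale):
    """Placeholder for the model: repeat each pixel `scale` times
    horizontally and each row `scale` times vertically."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out


def upscale_video(frames, scale=4, upscale_frame=upscale_nearest):
    """Upscale every frame independently and return them in the original order."""
    if scale not in (2, 4):
        raise ValueError("scale must be 2 or 4")
    return [upscale_frame(f, scale) for f in frames]


# Three tiny 2x2 "frames"
video = [[[t, t + 1], [t + 2, t + 3]] for t in range(3)]
result = upscale_video(video, scale=2)
print(len(result), len(result[0]), len(result[0][0]))  # 3 4 4
```

Because no frame sees its neighbors, nothing in this loop enforces consistency between consecutive frames.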

How It Works

BSRGAN is trained with a realistic degradation model that simulates:

  • Blur (isotropic and anisotropic Gaussian kernels)
  • Mixed noise (Gaussian plus processed camera sensor noise)
  • Several downsampling methods (nearest, bilinear, bicubic)
  • JPEG compression artifacts
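
To make the idea of a synthetic degradation pipeline concrete, here is a heavily simplified sketch in plain Python. It is not the authors' implementation: a 3x3 mean filter stands in for the blur kernels, nearest-neighbor is the only downsampler, JPEG compression and camera noise are omitted, and all function names are illustrative. The step order is shuffled, as the BSRGAN paper also randomizes the order of its degradation steps.

```python
import random


def box_blur(img):
    """3x3 mean filter with edge clamping (a simple stand-in for blur kernels)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            row.append(total / 9.0)
        out.append(row)
    return out


def downsample_nearest(img, factor):
    """Keep every `factor`-th pixel in both directions."""
    return [row[::factor] for row in img[::factor]]


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clamp the result to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]


def degrade(img, factor=4, sigma=0.05, seed=None):
    """Apply blur, downsampling, and noise in a random order."""
    rng = random.Random(seed)
    steps = [
        box_blur,
        lambda im: downsample_nearest(im, factor),
        lambda im: add_gaussian_noise(im, sigma, rng),
    ]
    rng.shuffle(steps)
    for step in steps:
        img = step(img)
    return img


# 16x16 grayscale test pattern with values in [0, 1]
hr = [[((x + y) % 8) / 7.0 for x in range(16)] for y in range(16)]
lr = degrade(hr, factor=4, seed=0)
print(len(lr), len(lr[0]))  # 4 4
```

Pairs such as `(lr, hr)` are the kind of training data a blind super-resolution model learns from: the network sees only the degraded image and must recover the clean one.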

Unlike ESRGAN and other GAN-based methods that assume a simple, known degradation (typically bicubic downsampling), BSRGAN targets the unpredictable quality loss found in real-world images and footage.

Key Features

  • Blind Restoration: Works without prior knowledge of image degradation
  • Frame Consistency: Video frames are upscaled independently, so temporal coherence is not enforced; flicker may need temporal stabilization in post-processing
  • Flexible Input: Accepts images (JPG/PNG) and videos (MP4/MOV)
  • Quality Control: The only adjustable parameter listed under Input is the upscaling factor (2x or 4x); no separate noise-reduction or sharpness controls are listed

Common Use Cases

  1. Restoring historical photos/family albums
  2. Upscaling low-resolution surveillance footage
  3. Improving compressed social media content
  4. Pre-processing for archival digitization projects
  5. Enhancing video conference quality

Limitations

  • Maximum resolution: 1536x1536 for images
  • Recommended video length: <2 minutes
  • Processing time varies by hardware (~10 sec/image on GPUs)
  • May struggle with extreme degradation cases
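
A small pre-flight check can catch inputs that fall outside the limits listed above before a run is submitted. This sketch assumes the 1536x1536 maximum applies to each side of the input image and treats the 2-minute figure as a recommendation rather than a hard limit; the helper names are illustrative.

```python
MAX_IMAGE_SIDE = 1536            # assumed per-side limit, from "1536x1536"
RECOMMENDED_VIDEO_SECONDS = 120  # from "<2 minutes"


def check_image(width, height, scale=4):
    """Return the expected output size, or raise if the input is too large."""
    if width > MAX_IMAGE_SIDE or height > MAX_IMAGE_SIDE:
        raise ValueError(
            f"{width}x{height} exceeds the {MAX_IMAGE_SIDE}x{MAX_IMAGE_SIDE} maximum"
        )
    return width * scale, height * scale


def video_within_recommendation(duration_seconds):
    """True if the clip is shorter than the recommended length."""
    return duration_seconds < RECOMMENDED_VIDEO_SECONDS


print(check_image(640, 480))            # (2560, 1920)
print(video_within_recommendation(95))  # True
```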

Credits

Developed by researchers at ETH Zurich:

@inproceedings{zhang2021designing,
  title={Designing a Practical Degradation Model for Deep Blind Image Super-Resolution},
  author={Zhang, Kai and Liang, Jingyun and Van Gool, Luc and Timofte, Radu},
  booktitle={ICCV},
  year={2021}
}

Original implementation: cszn/BSRGAN on GitHub
Maintained by @zsxkib for Replicate integration


Note: Video processing works by applying the image model frame-by-frame. For optimal results, combine with temporal stabilization tools in post-processing.