zhyjiong/sora_watermark_remover

Remove the 'Sora' watermark from videos generated by Sora

Run time and cost

This model runs on CPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Automatically detect and remove Sora watermarks from videos using YOLOv11-based detection and fast cv2 inpainting.

What It Does

This model processes videos to:

  1. Detect Sora watermarks frame-by-frame using a trained YOLOv11s detector
  2. Remove detected watermarks using fast OpenCV inpainting (cv2)
  3. Output a clean video without watermarks

For speed, the system processes only the region of interest (ROI) around each detected watermark rather than the entire frame.

Inputs

| Input | Type | Description | Required | Default |
|---|---|---|---|---|
| video | string | URL of the video file to process (must be publicly accessible) | ✅ Yes | - |
| tos_access_key_id | secret | TOS (Object Storage) access key ID for uploading result. If provided, returns object key instead of file | ❌ No | None |
| tos_access_key_secret | secret | TOS (Object Storage) access key secret. Required if tos_access_key_id is provided | ❌ No | None |

Note: Maximum video duration is 15 seconds by default (configurable in the model).
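As a minimal sketch of how these inputs might be passed through the Replicate Python client: the field names come from the inputs listed above, while `MODEL_REF`, `build_input`, and `run_model` are hypothetical names chosen for illustration. Depending on the client version, the model reference may need a `:<version>` suffix, and the client is assumed to read `REPLICATE_API_TOKEN` from the environment.

```python
# Hypothetical helper names; only the input field names come from this README.
MODEL_REF = "zhyjiong/sora_watermark_remover"  # a ":<version>" suffix may be required


def build_input(video_url, tos_access_key_id=None, tos_access_key_secret=None):
    """Build the input dict, enforcing the rules stated in the Inputs section."""
    if not video_url:
        raise ValueError("video is required")
    if tos_access_key_id and not tos_access_key_secret:
        raise ValueError("tos_access_key_secret is required with tos_access_key_id")
    payload = {"video": video_url}
    if tos_access_key_id:
        payload["tos_access_key_id"] = tos_access_key_id
        payload["tos_access_key_secret"] = tos_access_key_secret
    return payload


def run_model(video_url, **tos_credentials):
    """Run the model and return whatever the client hands back."""
    # Imported lazily so build_input works without the client installed.
    import replicate  # pip install replicate

    return replicate.run(MODEL_REF, input=build_input(video_url, **tos_credentials))
```

Calling `run_model("https://example.com/clip.mp4")` (a placeholder URL) would submit a single video without TOS credentials. Validating the credential pair locally avoids spending a run on a request the model would reject.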

Output

The model returns one of the following:

  • Replicate file URL (if no TOS credentials are provided): URL of the MP4 video file hosted on Replicate

  • TOS object key (if TOS credentials are provided): String key of the uploaded video in your TOS bucket

Performance

  • Processing Speed: ~2-5 seconds per second of video (depending on watermark frequency and video resolution)
  • Memory Usage: Optimized for GPU/CPU with ROI-only processing
  • Video Format: Input/output MP4 (H.264)

Limitations

  1. Video Duration: Maximum 15 seconds per video

  2. Video Format: Currently optimized for MP4 format

  3. Watermark Type: Specifically trained for Sora watermarks (flower-like pattern)

  4. Quality: Fast cv2 inpainting may leave slight artifacts on complex backgrounds (for better quality, use a slower ML-based inpainting model such as LaMa)

How It Works

  1. Detection: Uses YOLOv11s to detect watermark bounding boxes in each frame
  2. Tracking: Implements temporal tracking to maintain detection consistency across frames
  3. ROI Extraction: Extracts regions of interest around detected watermarks
  4. Inpainting: Uses cv2.inpaint to fill watermark regions using surrounding pixels
  5. Blending: Applies Gaussian blending so the inpainted region merges smoothly with the surrounding frame
  6. Output: Assembles cleaned frames into output video

Troubleshooting

Video download fails:

  • Ensure the video URL is publicly accessible
  • Check that the URL returns a valid MP4 file
  • Verify network connectivity from Replicate servers

Processing fails:

  • Ensure video duration is ≤ 15 seconds
  • Check video format is MP4
  • Verify video has valid encoding
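Some of the download checks above can be run locally before submitting a video. The sketch below uses only the Python standard library; `url_problems` and `head_problems` are hypothetical helper names, and a passing result from your own machine does not guarantee that Replicate's servers can reach the same URL.

```python
import urllib.request
from urllib.parse import urlparse


def url_problems(url: str) -> list[str]:
    """Static checks on the URL string; an empty list means nothing looked wrong."""
    problems = []
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        problems.append("URL must use http or https")
    if not parts.netloc:
        problems.append("URL has no host")
    if not parts.path.lower().endswith(".mp4"):
        problems.append("path does not end in .mp4 (the file may still be valid)")
    return problems


def head_problems(url: str, timeout: float = 10.0) -> list[str]:
    """Send a HEAD request and check that the server reports a video type."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
    except (OSError, ValueError) as exc:  # network errors, HTTP errors, bad URLs
        return [f"request failed: {exc}"]
    if not content_type.startswith("video/"):
        return [f"unexpected Content-Type: {content_type!r}"]
    return []
```

Running `url_problems` first is cheap and catches malformed links; `head_problems` then confirms that the URL is reachable without authentication. Some hosts reject HEAD requests, so a failure there is a hint rather than proof.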

License

Apache License 2.0

Acknowledgments

This project is built upon the excellent work of the following projects:

  • SoraWatermarkCleaner: This Replicate model is based on the SoraWatermarkCleaner project. Special thanks to linkedlist771 for creating the original implementation and detector model.

  • IOPaint: We use components from IOPaint for the inpainting implementation. Special thanks to Sanster and the IOPaint contributors for their amazing work on image inpainting models and pipelines.

  • Ultralytics YOLO: Detection capabilities are powered by Ultralytics YOLO for object detection. We thank the Ultralytics team for their robust and efficient YOLOv11 implementation.