Automatically detect and remove Sora watermarks from videos using YOLOv11-based detection and fast cv2 inpainting.
What It Does
This model processes videos to:
1. Detect Sora watermarks frame-by-frame using a trained YOLOv11s detector
2. Remove detected watermarks using fast OpenCV inpainting (cv2)
3. Output a clean video without watermarks
For speed, the system uses ROI-only processing: it inpaints only the detected watermark regions rather than entire frames.
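As a rough illustration of what ROI-only processing means, the sketch below expands a detected bounding box by a small margin and clamps it to the frame. The helper name `padded_roi`, the `(x1, y1, x2, y2)` box layout, and the 8-pixel padding are assumptions made for this example, not values taken from the model.

```python
def padded_roi(box, frame_w, frame_h, pad=8):
    """Expand an (x1, y1, x2, y2) box by `pad` pixels, clamped to the frame.

    Hypothetical helper: the box layout and padding are illustrative only.
    """
    x1, y1, x2, y2 = box
    return (
        max(0, x1 - pad),
        max(0, y1 - pad),
        min(frame_w, x2 + pad),
        min(frame_h, y2 + pad),
    )


def roi_fraction(roi, frame_w, frame_h):
    """Share of the frame's pixels that fall inside the ROI."""
    x1, y1, x2, y2 = roi
    return ((x2 - x1) * (y2 - y1)) / (frame_w * frame_h)


roi = padded_roi((100, 50, 220, 90), frame_w=1280, frame_h=720)
print(roi, f"{roi_fraction(roi, 1280, 720):.2%} of the frame")
```

For a small watermark on a 1280x720 frame, the ROI covers only a small share of the pixels, which is where the speed advantage over full-frame inpainting would come from.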
Inputs
| Input | Type | Description | Required | Default |
|---|---|---|---|---|
| video | string | URL of the video file to process (must be publicly accessible) | ✅ Yes | - |
| tos_access_key_id | secret | TOS (Object Storage) access key ID for uploading the result. If provided, the model returns an object key instead of a file | ❌ No | None |
| tos_access_key_secret | secret | TOS (Object Storage) access key secret. Required if tos_access_key_id is provided | ❌ No | None |
Note: Maximum video duration is 15 seconds by default (configurable in the model).
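The input rules above can be sketched as a small payload builder. `build_input` is a hypothetical helper written for this example and is not part of the model or of any client library. It only encodes the stated rules: `video` is required, and `tos_access_key_secret` must accompany `tos_access_key_id`. Dropping a secret that is supplied without an ID is an assumption of this sketch.

```python
def build_input(video_url, tos_access_key_id=None, tos_access_key_secret=None):
    """Assemble the model's input dict (hypothetical helper, illustrative only)."""
    if not video_url:
        raise ValueError("video is required")

    payload = {"video": video_url}

    if tos_access_key_id is not None:
        # The secret is required whenever the key ID is provided.
        if tos_access_key_secret is None:
            raise ValueError(
                "tos_access_key_secret is required when tos_access_key_id is set"
            )
        payload["tos_access_key_id"] = tos_access_key_id
        payload["tos_access_key_secret"] = tos_access_key_secret

    # Assumption: a secret supplied without an ID is simply left out.
    return payload


print(build_input("https://example.com/clip.mp4"))
```

A dict built this way could then be passed as the `input` argument of the Replicate Python client's `replicate.run`. The model reference to use is not given in this document.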
Output
The model returns one of:
- Replicate file URL (if no TOS credentials are provided): URL of the MP4 video file hosted on Replicate
- TOS object key (if TOS credentials are provided): String key of the uploaded video in your TOS bucket
Performance
- Processing Speed: ~2-5 seconds of processing per second of video, depending on watermark frequency and video resolution (so a 15-second clip takes roughly 30-75 seconds)
- Memory Usage: ROI-only processing keeps memory use low on both GPU and CPU
- Video Format: Input/output MP4 (H.264)
Limitations
- Video Duration: Maximum 15 seconds per video
- Video Format: Currently optimized for MP4 format
- Watermark Type: Specifically trained for Sora watermarks (flower-like pattern)
- Quality: Fast cv2 inpainting may leave slight artifacts on complex backgrounds (for better quality, use slower ML models like LAMA)
How It Works
- Detection: Uses YOLOv11s to detect watermark bounding boxes in each frame
- Tracking: Implements temporal tracking to maintain detection consistency across frames
- ROI Extraction: Extracts regions of interest around detected watermarks
- Inpainting: Uses cv2.inpaint to fill watermark regions using surrounding pixels
- Blending: Applies Gaussian blending so the inpainted region merges smoothly with the surrounding frame
- Output: Assembles cleaned frames into output video
Troubleshooting
Video download fails:
- Ensure the video URL is publicly accessible
- Check that the URL returns a valid MP4 file
- Verify network connectivity from Replicate servers
Processing fails:
- Ensure video duration is ≤ 15 seconds
- Check video format is MP4
- Verify video has valid encoding
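A local pre-flight check can catch the duration problem before a video is submitted. The sketch below is one possible approach and is not part of the model. `probe`, `duration_seconds`, and `within_limit` are hypothetical helpers, and the 15-second constant mirrors the default limit stated in this document. The frame count reported by OpenCV can be approximate for some containers, so treat the result as an estimate.

```python
MAX_SECONDS = 15.0  # default limit stated in this document


def duration_seconds(frame_count, fps):
    """Duration implied by a frame count and frame rate."""
    if fps <= 0 or frame_count <= 0:
        raise ValueError("video has no readable frames or frame rate")
    return frame_count / fps


def within_limit(frame_count, fps, max_seconds=MAX_SECONDS):
    """True if the video is short enough to be processed."""
    return duration_seconds(frame_count, fps) <= max_seconds


def probe(path):
    """Read (frame_count, fps) from a local file using OpenCV."""
    import cv2  # imported lazily so the pure helpers work without OpenCV

    cap = cv2.VideoCapture(path)
    try:
        if not cap.isOpened():
            raise ValueError(f"could not open video: {path}")
        return cap.get(cv2.CAP_PROP_FRAME_COUNT), cap.get(cv2.CAP_PROP_FPS)
    finally:
        cap.release()


# Example with known values: 450 frames at 30 fps is exactly 15 seconds.
print(within_limit(450, 30))
```

For a real file, `within_limit(*probe("clip.mp4"))` would combine the two steps, where `clip.mp4` is a placeholder path.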
License
Apache License 2.0
Acknowledgments
This project is built upon the excellent work of the following projects:
- SoraWatermarkCleaner: This Replicate model is based on the SoraWatermarkCleaner project. Special thanks to linkedlist771 for creating the original implementation and detector model.
- IOPaint: We use components from IOPaint for the inpainting implementation. Special thanks to Sanster and the IOPaint contributors for their amazing work on image inpainting models and pipelines.
- Ultralytics YOLO: Detection capabilities are powered by Ultralytics YOLO for object detection. We thank the Ultralytics team for their robust and efficient YOLOv11 implementation.