zhyjiong/sora_watermark_remover

Automatically detect and remove Sora watermarks from videos using YOLOv11-based detection and fast cv2 inpainting.

What It Does

This model processes videos to:

1. Detect Sora watermarks frame-by-frame using a trained YOLOv11s detector
2. Remove detected watermarks using fast OpenCV inpainting (cv2)
3. Output a clean video without watermarks

For speed, the system uses ROI-only processing: it inpaints only the detected watermark regions rather than entire frames.

Inputs

Input	Type	Description	Required	Default
`video`	string	URL of the video file to process (must be publicly accessible)	✅ Yes	-
`tos_access_key_id`	secret	TOS (Object Storage) access key ID for uploading result. If provided, returns object key instead of file	❌ No	None
`tos_access_key_secret`	secret	TOS (Object Storage) access key secret. Required if `tos_access_key_id` is provided	❌ No	None

Note: Maximum video duration is 15 seconds by default (configurable in the model).
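As a rough illustration of how the inputs above fit together, here is a minimal sketch that assembles the input dictionary and submits it with the Replicate Python client. The helper names `build_input` and `run_model` are hypothetical and not part of this model. The sketch assumes the `replicate` package is installed and `REPLICATE_API_TOKEN` is set, and depending on your client version you may need to append a version ID to the model reference.

```python
def build_input(video_url, tos_access_key_id=None, tos_access_key_secret=None):
    """Assemble the input dict using the field names from the Inputs table."""
    if not video_url.startswith(("http://", "https://")):
        raise ValueError("video must be a publicly accessible URL")
    payload = {"video": video_url}
    if tos_access_key_id is not None:
        # The secret is required whenever the key ID is supplied.
        if not tos_access_key_secret:
            raise ValueError("tos_access_key_secret is required with tos_access_key_id")
        payload["tos_access_key_id"] = tos_access_key_id
        payload["tos_access_key_secret"] = tos_access_key_secret
    return payload


def run_model(payload, model_ref="zhyjiong/sora_watermark_remover"):
    """Submit the job (hypothetical helper; requires network access and an API token)."""
    import replicate  # imported lazily so build_input works without the client installed

    # Assumption: some client versions need "owner/name:<version-id>" here.
    return replicate.run(model_ref, input=payload)
```

Without TOS credentials the result should be a file URL; with both credentials it should be the object key string, as described under Output.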

Output

The model returns:

- Replicate file URL (if no TOS credentials are provided): URL of the MP4 video file hosted on Replicate
- TOS object key (if TOS credentials are provided): String key of the uploaded video in your TOS bucket

Performance

Processing Speed: ~2-5 seconds per second of video (depending on watermark frequency and video resolution)
Memory Usage: Optimized for GPU/CPU with ROI-only processing
Video Format: Input/output MP4 (H.264)
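To make the processing-speed figure concrete, the sketch below turns the quoted range of roughly 2 to 5 seconds per second of video into a time estimate. The function name and its parameters are illustrative assumptions; the rates and the 15-second default limit are taken from this page, and actual times will vary with resolution and watermark frequency.

```python
def estimate_processing_seconds(video_seconds, low_rate=2.0, high_rate=5.0, max_seconds=15.0):
    """Return a (best_case, worst_case) estimate in seconds for one video."""
    if video_seconds <= 0:
        raise ValueError("video_seconds must be positive")
    if video_seconds > max_seconds:
        raise ValueError("video exceeds the default maximum duration")
    # Rates are seconds of processing per second of video.
    return (video_seconds * low_rate, video_seconds * high_rate)


best, worst = estimate_processing_seconds(15)
print(f"Expect roughly {best:.0f} to {worst:.0f} seconds")
```

For a video at the 15-second limit this works out to roughly 30 to 75 seconds of processing.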

Limitations

Video Duration: Maximum 15 seconds per video by default
Video Format: Currently optimized for MP4 format
Watermark Type: Specifically trained for Sora watermarks (flower-like pattern)
Quality: Fast cv2 inpainting may leave slight artifacts on complex backgrounds (for better quality, use a slower ML inpainting model such as LaMa)

How It Works

Detection: Uses YOLOv11s to detect watermark bounding boxes in each frame
Tracking: Implements temporal tracking to maintain detection consistency across frames
ROI Extraction: Extracts regions of interest around detected watermarks
Inpainting: Uses cv2.inpaint to fill watermark regions using surrounding pixels
Blending: Applies Gaussian blending so the inpainted region merges smoothly with the surrounding frame
Output: Assembles cleaned frames into output video

Troubleshooting

Video download fails:

- Ensure the video URL is publicly accessible
- Check that the URL returns a valid MP4 file
- Verify network connectivity from Replicate servers

Processing fails:

- Ensure video duration is ≤ 15 seconds
- Check video format is MP4
- Verify video has valid encoding
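One way to run some of these checks before submitting a job is a quick look at the first bytes of the file. The sketch below relies on the fact that MP4 files typically begin with an `ftyp` box, whose type tag sits at bytes 4 to 7. Both helper names are hypothetical, the check is a heuristic and not a full validation, and `fetch_header` assumes the server honours HTTP Range requests.

```python
import urllib.request


def looks_like_mp4(header: bytes) -> bool:
    """Heuristic: MP4 files usually start with a box whose type is 'ftyp'."""
    return len(header) >= 8 and header[4:8] == b"ftyp"


def fetch_header(url, length=12, timeout=10):
    """Download only the first few bytes of a publicly accessible URL."""
    req = urllib.request.Request(url, headers={"Range": f"bytes=0-{length - 1}"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # read(length) caps the data even if the server ignores the Range header.
        return resp.read(length)
```

If `fetch_header` raises an error or `looks_like_mp4` returns `False` (for example because the URL serves an HTML login page), the model is unlikely to be able to download and decode the video either.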

License

Apache License 2.0

Acknowledgments

This project is built upon the excellent work of the following projects:

SoraWatermarkCleaner: This Replicate model is based on the SoraWatermarkCleaner project. Special thanks to linkedlist771 for creating the original implementation and detector model.
IOPaint: We use components from IOPaint for the inpainting implementation. Special thanks to Sanster and the IOPaint contributors for their amazing work on image inpainting models and pipelines.
Ultralytics YOLO: Detection capabilities are powered by Ultralytics YOLO for object detection. We thank the Ultralytics team for their robust and efficient YOLOv11 implementation.
