# SAMURAI Object Tracker

An easy-to-use model that tracks objects in video, built on SAM 2. Point it at the object you want to track in the first frame, and it follows that object through the rest of the video.
## How It Works

You provide (see the invocation sketch below):

- A video file or a folder of frames
- The starting position (x, y) and size (width, height) of the object you want to track

The model gives you:

- A video showing the tracked object with a red highlight
- Frame-by-frame tracking data in COCO RLE format (run-length encoding, a space-efficient way to store mask information)
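If you are running the model through Replicate’s Python client, an invocation might look like the sketch below. The model reference and input field names are illustrative assumptions, not something this README specifies, so check the model page for the exact input schema.

```python
import replicate

# Hypothetical model reference and input names -- adjust to the
# actual schema on the model page.
output = replicate.run(
    "owner/samurai-object-tracker:version-id",
    input={
        "video": open("clip.mp4", "rb"),  # video to track through
        "x": 120,       # starting x coordinate of the object
        "y": 80,        # starting y coordinate of the object
        "width": 64,    # object width in pixels
        "height": 96,   # object height in pixels
    },
)
print(output)  # highlighted video plus the COCO RLE tracking data
```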
## Output Format

The tracking data is a dictionary keyed by frame number:

```python
{
    frame_number: [{
        "size": [height, width],     # Size of the video frame
        "counts": "encoded_string",  # Mask data in COCO RLE format
        "object_id": 0               # ID of the tracked object
    }]
}
```
## Credits

This model is powered by:

- SAMURAI by Yang et al. from the University of Washington’s Information Processing Lab
- SAM 2 (Segment Anything Model 2) by Meta FAIR
- Original paper: “SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory”
## License
Apache-2.0
Follow me on Twitter/X