
SAM 3.1 on Replicate

Segment Anything Model 3.1 — Unified Promptable Segmentation

Paper: arXiv 2511.16719 | Authors: Meta FAIR (Carion, Gustafson, Hu, et al.) | Code: github.com/facebookresearch/sam3 | Model: huggingface.co/facebook/sam3.1 | License: Meta SAM License

SAM 3.1 segments any object in an image using text prompts, point clicks, or bounding boxes. It detects 270K+ visual concepts — 50x more than prior benchmarks. SAM 3.1 adds Object Multiplex for ~7x faster multi-object tracking.

Works on both indoor and outdoor scenes.

Default Example

import replicate

output = replicate.run("visionaix/sam3-1", input={
    "image": "https://cdn.sanity.io/images/k55su7ch/production2/d9e35a73891d43ccb0bc665bf2e0d5d9d6f1ea2b-4200x2363.jpg?w=1920&q=75&auto=format",
    "text_prompt": "couch",
})
# output.masked_image - highlighted segmentation
# output.masks_overlay - overlay with scores and labels
# output.calibration_json - metadata

Prompt Types

Text Prompt (open-vocabulary, 270K+ concepts)

output = replicate.run("visionaix/sam3-1", input={
    "image": open("photo.jpg", "rb"),  # local file: pass an open handle, or use an image URL
    "text_prompt": "yellow school bus",
    "confidence_threshold": 0.5,
})

Point Prompt (click foreground/background)

output = replicate.run("visionaix/sam3-1", input={
    "image": open("photo.jpg", "rb"),  # local file: pass an open handle, or use an image URL
    "point_coords": "[[520, 375]]",
    "point_labels": "[1]",
})
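
Because `point_coords` and `point_labels` are passed as JSON strings rather than Python lists, it can help to build both from one list of clicks. The helper below is a minimal sketch: `encode_points` is a hypothetical name, not part of the Replicate client or of this model, and it assumes the two arrays must have the same length and order.

```python
import json

def encode_points(points):
    """Turn [(x, y, is_foreground), ...] into the two JSON-string inputs."""
    if not points:
        raise ValueError("at least one point is required")
    # Pixel coordinates as [x, y] pairs, in click order.
    coords = [[int(x), int(y)] for x, y, _ in points]
    # 1 = foreground, 0 = background, matching the Inputs table.
    labels = [1 if is_foreground else 0 for _, _, is_foreground in points]
    return {
        "point_coords": json.dumps(coords),
        "point_labels": json.dumps(labels),
    }

# One foreground click on the object, one background click to exclude a region.
point_inputs = encode_points([(520, 375, True), (80, 40, False)])
print(point_inputs)
```

The resulting dict can be merged into the `input` argument next to `image`, for example `input={"image": ..., **point_inputs}`.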

Box Prompt (bounding box)

output = replicate.run("visionaix/sam3-1", input={
    "image": open("photo.jpg", "rb"),  # local file: pass an open handle, or use an image URL
    "box_prompt": "[100, 200, 400, 500]",
})
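
`box_prompt` takes corner coordinates `[x_min, y_min, x_max, y_max]`, while many annotation tools export boxes as `x, y, width, height`. The conversion can be sketched as follows. `xywh_to_box_prompt` is a hypothetical helper, and clamping the box to the image bounds is an assumption about what the model expects, not documented behavior.

```python
import json

def xywh_to_box_prompt(x, y, width, height, image_width, image_height):
    """Convert an (x, y, width, height) box to a JSON corner-format string."""
    if width <= 0 or height <= 0:
        raise ValueError("box width and height must be positive")
    # Clamp each corner so the box stays inside the image.
    x_min = max(0, x)
    y_min = max(0, y)
    x_max = min(image_width, x + width)
    y_max = min(image_height, y + height)
    if x_min >= x_max or y_min >= y_max:
        raise ValueError("box lies outside the image")
    return json.dumps([x_min, y_min, x_max, y_max])

box_prompt = xywh_to_box_prompt(100, 200, 300, 300, image_width=1920, image_height=1080)
print(box_prompt)  # [100, 200, 400, 500]
```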

Inputs

Parameter	Type	Default	Description
`image`	File	required	Input image
`text_prompt`	String	`""`	Text describing what to segment
`point_coords`	String	`""`	JSON array of [x,y] pixel coords
`point_labels`	String	`""`	JSON array: 1=foreground, 0=background
`box_prompt`	String	`""`	JSON [x_min, y_min, x_max, y_max]
`confidence_threshold`	Float	`0.5`	Min confidence for text detections
`multimask_output`	Boolean	`false`	Return 3 candidate masks (point/box)
`return_raw_masks`	Boolean	`false`	Return raw masks as .npy

Outputs

masked_image — Original image with segmented regions highlighted
masks_overlay — Overlay with contours, scores, and labels
calibration_json — Metadata (scores, boxes, timing)
raw_masks — Binary mask array as .npy (optional)
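
Once the metadata has been downloaded, detections can be filtered again on the client side without re-running the model. The sketch below assumes the JSON holds parallel `scores` and `boxes` lists. That layout is a guess based on the description above, not a documented schema, so check an actual `calibration_json` file before relying on it. `keep_confident` is a hypothetical helper and the sample values are made up.

```python
import json

def keep_confident(metadata, min_score):
    """Return score/box pairs whose score is at least min_score."""
    scores = metadata.get("scores", [])  # assumed key
    boxes = metadata.get("boxes", [])    # assumed key
    return [
        {"score": score, "box": box}
        for score, box in zip(scores, boxes)
        if score >= min_score
    ]

# Stand-in for the downloaded metadata file; values are illustrative only.
sample = json.loads(
    '{"scores": [0.91, 0.42, 0.77],'
    ' "boxes": [[10, 20, 110, 220], [5, 5, 50, 50], [300, 40, 520, 400]]}'
)
kept = keep_confident(sample, min_score=0.7)
print(len(kept))  # 2
```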

Citation

@misc{carion2025sam3,
  title={SAM 3: Segment Anything with Concepts},
  author={Carion, Nicolas and Gustafson, Laura and Hu, Yuan-Ting and others},
  year={2025},
  eprint={2511.16719},
  archivePrefix={arXiv},
}

License

Meta SAM License. See LICENSE.
