ayumuakagi / segment_anything_model

The API automatically detects objects in an input image and returns their positional and mask information.

Run time and cost

This model costs approximately $0.0016 to run on Replicate, or 625 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 8 seconds. The predict time for this model varies significantly based on the inputs.

Readme

SegmentAnythingModel API

Overview

This API uses the Segment Anything Model (SAM) to automatically detect objects in images and retrieve information such as image size, object coordinates, area, and mask information. For more details on SAM, please refer to the official repository.

Example Code

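import replicate

def call_sam_api(image_data):
    # encode_image_to_base64 is a helper you supply; it must return the image as a Base64 string.
    image = encode_image_to_base64(image_data)
    model_input = {
        "image": image,
        "iou_threshold": 0.95,
    }
    # The replicate client reads the API token from the REPLICATE_API_TOKEN environment variable.
    output = replicate.run(
        "ayumuakagi/segment_anything_model:e0d5c56062fb1dc6ed738e09997b421442e0e86983052de6861b82b0f05c6876",
        input=model_input,
    )
    return output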
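
The example above calls encode_image_to_base64 without defining it. A minimal sketch of such a helper follows, assuming that image_data holds the raw bytes of an image file and that the API accepts a plain Base64 string (neither is confirmed by this README, so adjust if the model expects a data URI or a file handle):

```python
import base64


def encode_image_to_base64(image_data: bytes) -> str:
    """Return the raw image bytes as an ASCII Base64 string.

    Assumption: the API accepts a bare Base64 string without a
    "data:image/...;base64," prefix.
    """
    return base64.b64encode(image_data).decode("ascii")
```

To encode a file on disk, read it in binary mode first, for example with open(path, "rb").read(), and pass the resulting bytes to the helper.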

Response Example

{
  "device": "cuda",
  "shapes": [
    702,
    936,
    3
  ],
  "status": "success",
  "num_masks": 30,
  "mask_details": [
    {
      "iou": 1.0258104801177979,
      "area": 15151,
      "bbox": [
        362,
        466,
        162,
        107
      ],
      "segmentation": "eJzt1UuOnEAAREHP/S/tjTeeLqD9fTlSxBpRVEpPfPsG・・・・・・・",
      "stability_score": 0.985707700252533
    }
  ]
}
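
A response like the one above can be post-processed as sketched below. The function name summarize_masks and the min_stability parameter are hypothetical, not part of the API. The sketch also assumes that shapes is ordered (height, width, channels) and that bbox uses the [x, y, width, height] layout of SAM's automatic mask generator; the README does not state either explicitly.

```python
from typing import Any


def summarize_masks(response: dict[str, Any], min_stability: float = 0.9) -> list[dict[str, Any]]:
    """Filter masks by stability score and convert each bbox to corner form."""
    if response.get("status") != "success":
        raise RuntimeError(f"API call failed with status: {response.get('status')!r}")

    # Assumption: shapes is (height, width, channels).
    height, width = response["shapes"][:2]

    summaries = []
    for detail in response["mask_details"]:
        if detail["stability_score"] < min_stability:
            continue
        # Assumption: bbox is [x, y, width, height].
        x, y, w, h = detail["bbox"]
        summaries.append({
            "xyxy": (x, y, x + w, y + h),
            "area_fraction": detail["area"] / (height * width),
            "iou": detail["iou"],
        })
    return summaries


# The response example from this README, with the segmentation field left out.
sample_response = {
    "device": "cuda",
    "shapes": [702, 936, 3],
    "status": "success",
    "num_masks": 30,
    "mask_details": [
        {
            "iou": 1.0258104801177979,
            "area": 15151,
            "bbox": [362, 466, 162, 107],
            "stability_score": 0.985707700252533,
        }
    ],
}

for summary in summarize_masks(sample_response):
    print(summary)
```

Note that the iou value is a score predicted by the model rather than a measured overlap, which is why it can slightly exceed 1.0 as in this example.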

Note: The segmentation field is zlib-compressed and then Base64 encoded. To recover the mask, Base64-decode the string, decompress the result, and convert it into a NumPy array matching the image shape.

Decoding Segmentation Example

import base64
import zlib
import numpy as np

def decode_segmentation(encoded_data, shape):
    # Base64 decode
    compressed_data = base64.b64decode(encoded_data)
    # zlib decompress
    decompressed_data = zlib.decompress(compressed_data)
    # Convert to numpy array and reshape to original shape
    segmentation = np.frombuffer(decompressed_data, dtype=bool).reshape(shape)
    return segmentation
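
The decoder can be checked without calling the API by round-tripping a synthetic mask, as in the sketch below. encode_segmentation is a hypothetical inverse written only for this test; it assumes the server stores one byte per pixel, which is what dtype=bool in the decoder implies. With a real response, the shape to pass is presumably (height, width), i.e. the first two entries of shapes, and the number of True pixels should then equal the reported area; both points are assumptions worth verifying against actual output.

```python
import base64
import zlib

import numpy as np


def encode_segmentation(mask: np.ndarray) -> str:
    """Hypothetical inverse of decode_segmentation, used only to build test data."""
    raw = np.ascontiguousarray(mask, dtype=bool).tobytes()
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_segmentation(encoded_data, shape):
    """Same steps as the README decoder: Base64 decode, decompress, reshape."""
    compressed_data = base64.b64decode(encoded_data)
    decompressed_data = zlib.decompress(compressed_data)
    return np.frombuffer(decompressed_data, dtype=bool).reshape(shape)


# Build a small 6x8 mask with a 3x4 rectangle of True pixels.
mask = np.zeros((6, 8), dtype=bool)
mask[2:5, 3:7] = True

encoded = encode_segmentation(mask)
decoded = decode_segmentation(encoded, mask.shape)

# The pixel count plays the role of the "area" field in the response.
pixel_count = int(decoded.sum())
print(decoded.shape, pixel_count)
```

Arrays returned by np.frombuffer over a bytes object are read-only, so call decoded.copy() before modifying a mask in place.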
