ayumuakagi / segment_anything_model

The API automatically detects objects in an input image and returns their positional and mask information.


SegmentAnythingModel API

Overview

This API uses the Segment Anything Model (SAM) to automatically detect objects in images and retrieve information such as image size, object coordinates, area, and mask information. For more details on SAM, please refer to the official repository.

Example Code

import replicate

def call_sam_api(image_data):
    # Encode the raw image bytes before sending them to the API.
    image = encode_image_to_base64(image_data)
    input_data = {
        "image": image,
        "iou_threshold": 0.95
    }
    output = replicate.run(
        "ayumuakagi/segment_anything_model:e0d5c56062fb1dc6ed738e09997b421442e0e86983052de6861b82b0f05c6876",
        input=input_data
    )
    return output
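The helper encode_image_to_base64 is not defined above; a minimal sketch follows. It assumes the API accepts a plain Base64 string of the raw image bytes — check the model page for the exact expected format (some Replicate models expect a data URI instead).

```python
import base64

def encode_image_to_base64(image_data):
    # image_data: raw image bytes (e.g. the contents of a PNG/JPEG file).
    # Returns the ASCII Base64 string of those bytes.
    return base64.b64encode(image_data).decode("utf-8")
```

For example, the bytes of a local file can be passed directly: open the file in binary mode ("rb") and hand f.read() to the helper.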

Response Example

{
  "device": "cuda",
  "shapes": [
    702,
    936,
    3
  ],
  "status": "success",
  "num_masks": 30,
  "mask_details": [
    {
      "iou": 1.0258104801177979,
      "area": 15151,
      "bbox": [
        362,
        466,
        162,
        107
      ],
      "segmentation": "eJzt1UuOnEAAREHP/S/tjTeeLqD9fTlSxBpRVEpPfPsG・・・・・・・",
      "stability_score": 0.985707700252533
    }
  ]
}
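Once a response arrives, the mask_details list can be filtered and ranked client-side. A minimal sketch using the field names from the response example above (the 0.9 stability threshold is an arbitrary illustration, not an API default):

```python
def top_masks(response, min_stability=0.9):
    # Keep only sufficiently stable masks, sorted largest-area first.
    masks = [m for m in response["mask_details"]
             if m["stability_score"] >= min_stability]
    return sorted(masks, key=lambda m: m["area"], reverse=True)

# Toy response with the same structure as the example above.
example = {
    "mask_details": [
        {"area": 100, "stability_score": 0.95},
        {"area": 500, "stability_score": 0.99},
        {"area": 900, "stability_score": 0.50},
    ]
}
print([m["area"] for m in top_masks(example)])  # → [500, 100]
```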

Note: The segmentation data is Base64-encoded and zlib-compressed. Decode it with Base64, decompress it with zlib, and convert the result into a boolean NumPy array matching the image dimensions.

Decoding Segmentation Example

import base64
import zlib
import numpy as np

def decode_segmentation(encoded_data, shape):
    # Base64 decode
    compressed_data = base64.b64decode(encoded_data)
    # zlib decompress
    decompressed_data = zlib.decompress(compressed_data)
    # Convert to numpy array and reshape to original shape
    segmentation = np.frombuffer(decompressed_data, dtype=bool).reshape(shape)
    return segmentation
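The decoder can be sanity-checked with a synthetic round trip: compress and encode a known mask the same way the API appears to, then decode it back. The decoder is repeated so the snippet is self-contained; the (4, 5) shape and the assumption that masks use the image's height × width (the first two entries of shapes) are illustrative.

```python
import base64
import zlib
import numpy as np

def decode_segmentation(encoded_data, shape):
    # Base64 decode, zlib decompress, then rebuild the boolean mask.
    compressed_data = base64.b64decode(encoded_data)
    decompressed_data = zlib.decompress(compressed_data)
    return np.frombuffer(decompressed_data, dtype=bool).reshape(shape)

# Build a synthetic boolean mask and encode it as the API seems to:
# raw bytes -> zlib compress -> Base64.
mask = np.zeros((4, 5), dtype=bool)
mask[1:3, 2:4] = True
encoded = base64.b64encode(zlib.compress(mask.tobytes())).decode("utf-8")

decoded = decode_segmentation(encoded, (4, 5))
print(np.array_equal(mask, decoded))  # → True
```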
