inahip/image-embedding

Generate 64-dimensional embeddings for images using a custom-trained Universal Encoder. Perfect for image similarity search, content-based recommendations, duplicate detection, and visual clustering. Supports any common image format, with optional L2 normalization.

Public
5 runs

Run time and cost

This model runs on CPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Image Embedding Model

Generate 64-dimensional embeddings for images using a custom-trained Universal Encoder model.

🎯 What is this?

This model converts images into 64-dimensional embedding vectors that capture semantic visual features. These embeddings can be used for:

  • Image Similarity Search: Find visually similar images
  • Content-Based Recommendations: Recommend items based on visual similarity
  • Duplicate Detection: Identify duplicate or near-duplicate images
  • Visual Clustering: Group similar images together
  • Reverse Image Search: Find images similar to a query image

🚀 Quick Start

Python

import replicate

output = replicate.run(
    "inahip/image-embedding",
    input={
        "image": open("photo.jpg", "rb"),
        "normalize": False,
        "return_format": "list"
    }
)

embedding = output["embedding"]  # 64-dimensional vector
print(f"Shape: {output['shape']}")
print(f"Embedding: {embedding[:5]}...")  # First 5 values

cURL

curl -X POST https://api.replicate.com/v1/predictions \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "version": "YOUR_VERSION_ID",
    "input": {
      "image": "https://example.com/image.jpg"
    }
  }'

📥 Inputs

Parameter      Type      Default   Description
image          File/URL  Required  Input image (JPEG, PNG, WebP, etc.)
normalize      Boolean   false     L2-normalize the embedding vector
return_format  String    "list"    Output format: "list" or "base64"

📤 Outputs

Returns a dictionary containing:

{
  "embedding": [0.123, -0.456, 0.789, ...],
  "shape": [64],
  "dtype": "float32",
  "normalized": false,
  "format": "list"
}
  • embedding: 64-dimensional vector (list of floats or base64 string)
  • shape: Vector dimensions [64]
  • dtype: Data type "float32"
  • normalized: Whether L2 normalization was applied
  • format: Output format ("list" or "base64")
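Downstream math is easier on a NumPy array. A small sketch converting the list-format output; the stand-in dictionary below mimics the fields shown above (a real `output` would come from `replicate.run`):

```python
import numpy as np

# Stand-in for the model's response; the real `output` comes from replicate.run
output = {
    "embedding": [0.1] * 64,
    "shape": [64],
    "dtype": "float32",
    "normalized": False,
    "format": "list",
}

# Convert using the advertised dtype, then sanity-check against the shape field
vec = np.asarray(output["embedding"], dtype=output["dtype"])
assert list(vec.shape) == output["shape"]
print(vec.dtype, vec.shape)  # float32 (64,)
```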

🔬 Model Details

  • Architecture: Universal Encoder (Custom CNN-based)
  • Input: RGB images (any size, automatically resized)
  • Output: 64-dimensional float32 vector
  • Normalization: Optional L2 normalization
  • Runtime: CPU (optimized for inference)

💡 Use Cases

1. Compute a Similarity Matrix

import replicate
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Generate embeddings for multiple images
images = ["image1.jpg", "image2.jpg", "image3.jpg"]
embeddings = []

for img_path in images:
    output = replicate.run(
        "inahip/image-embedding",
        input={"image": open(img_path, "rb"), "normalize": True}
    )
    embeddings.append(output["embedding"])

# Compute similarity matrix
embeddings_array = np.array(embeddings)
similarity = cosine_similarity(embeddings_array)
print(similarity)

2. Find Most Similar Image

def find_similar_images(query_embedding, database_embeddings, top_k=5):
    """Find top-k most similar images"""
    similarities = cosine_similarity([query_embedding], database_embeddings)[0]
    top_indices = np.argsort(similarities)[::-1][:top_k]
    return [(idx, similarities[idx]) for idx in top_indices]

# Usage
query_output = replicate.run(
    "inahip/image-embedding",
    input={"image": open("query.jpg", "rb"), "normalize": True}
)
query_emb = query_output["embedding"]

# database_embeddings: your precomputed list of embeddings
similar_images = find_similar_images(query_emb, database_embeddings)
for idx, score in similar_images:
    print(f"Image {idx}: Similarity = {score:.4f}")

3. Base64 Format (for Storage)

# Get embedding as base64 string
output = replicate.run(
    "inahip/image-embedding",
    input={
        "image": open("photo.jpg", "rb"),
        "return_format": "base64"
    }
)

embedding_base64 = output["embedding"]  # Base64 encoded bytes
print(f"Base64 length: {len(embedding_base64)} chars")

# Decode when needed
import base64
import numpy as np

embedding_bytes = base64.b64decode(embedding_base64)
embedding_array = np.frombuffer(embedding_bytes, dtype=np.float32)
print(f"Decoded shape: {embedding_array.shape}")

📊 Performance

  • Inference Time: ~1-2 seconds per image (CPU)
  • Vector Size: 64 dimensions (256 bytes as float32)
  • Batch Processing: Process multiple images sequentially
  • Memory Usage: ~100MB model size + input image
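The 256-byte figure follows directly from the dtype (64 dimensions × 4 bytes per float32). A quick check, plus the storage cost at scale:

```python
import numpy as np

dim = 64
bytes_per_vector = np.zeros(dim, dtype=np.float32).nbytes
print(bytes_per_vector)  # 256 bytes per embedding

# Even a million embeddings fit comfortably in memory
print(bytes_per_vector * 1_000_000 / 1e6, "MB")  # 256.0 MB
```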

🎨 Supported Image Formats

  • JPEG / JPG
  • PNG
  • WebP
  • GIF (first frame)
  • BMP
  • TIFF

Images are automatically:

  • Converted to RGB
  • Resized to optimal dimensions
  • Normalized to [0, 1] range
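The model applies these steps server-side, so no client preprocessing is required. For reference, a rough local equivalent of the resize-and-scale portion in NumPy (nearest-neighbor resampling and the 224×224 target are placeholder assumptions; the model's actual resampling method and input size are internal, and loading plus RGB conversion would typically use Pillow's `Image.open(...).convert("RGB")`):

```python
import numpy as np

def preprocess(arr, size=(224, 224)):
    """Mirror the documented pipeline on an RGB uint8 array:
    resize (nearest-neighbor here for simplicity), then scale to [0, 1]."""
    h, w = arr.shape[:2]
    rows = np.arange(size[0]) * h // size[0]   # source row for each target row
    cols = np.arange(size[1]) * w // size[1]   # source col for each target col
    resized = arr[rows][:, cols]
    return resized.astype(np.float32) / 255.0  # shape (H, W, 3), values in [0, 1]
```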

🔧 Advanced Options

L2 Normalization

Normalize embeddings to unit vectors (recommended for similarity search):

output = replicate.run(
    "inahip/image-embedding",
    input={
        "image": open("photo.jpg", "rb"),
        "normalize": True  # L2 normalize
    }
)

# Verify normalization
import numpy as np
embedding = np.array(output["embedding"])
norm = np.linalg.norm(embedding)
print(f"L2 norm: {norm:.6f}")  # Should be ~1.0

Batch Processing

import replicate

def process_image_batch(image_paths):
    """Process multiple images"""
    embeddings = []

    for img_path in image_paths:
        output = replicate.run(
            "inahip/image-embedding",
            input={"image": open(img_path, "rb")}
        )
        embeddings.append(output["embedding"])

    return embeddings

# Process multiple images
image_files = ["img1.jpg", "img2.jpg", "img3.jpg"]
embeddings = process_image_batch(image_files)
print(f"Processed {len(embeddings)} images")

📈 Scaling & Rate Limits

  • Free Tier: 6 predictions/minute with <$5 credit
  • Paid Tier: Higher rate limits with credit top-up
  • Cost: ~$0.00007 per second of compute time
  • Typical Cost: ~$0.0001-0.0002 per image
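To stay under the free tier's 6 predictions/minute, it helps to throttle client-side rather than hit the limit and fail. A minimal sketch (the `Throttle` helper is illustrative, not part of the Replicate client; the 6/minute figure is from the list above):

```python
import time

class Throttle:
    """Space out calls so no more than max_calls start in any period seconds."""
    def __init__(self, max_calls=6, period=60.0):
        self.min_interval = period / max_calls
        self.last = 0.0

    def wait(self):
        # Sleep just long enough to keep the average rate under the limit
        delay = self.min_interval - (time.monotonic() - self.last)
        if delay > 0:
            time.sleep(delay)
        self.last = time.monotonic()

# throttle = Throttle()
# for path in image_paths:
#     throttle.wait()
#     output = replicate.run("inahip/image-embedding", input={"image": open(path, "rb")})
```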

🐛 Troubleshooting

Error: Image too large

If you encounter size errors, resize your image before uploading:

from PIL import Image

img = Image.open("large_image.jpg")
img = img.resize((1024, 1024), Image.LANCZOS)
img.save("resized.jpg", quality=85)

# Now use the resized image
output = replicate.run(
    "inahip/image-embedding",
    input={"image": open("resized.jpg", "rb")}
)

Error: Rate limit exceeded

Wait for the rate limit to reset (shown in error message) or add credits to your account.
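A retry-with-backoff wrapper is a simple way to absorb transient rate-limit errors automatically. The helper below is illustrative (not part of the Replicate client); `replicate.run` is the same call used throughout this page:

```python
import time

def run_with_retry(fn, max_attempts=5, base_delay=2.0):
    """Call fn(); on failure, wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * (2 ** attempt))

# output = run_with_retry(lambda: replicate.run(
#     "inahip/image-embedding",
#     input={"image": open("photo.jpg", "rb")},
# ))
```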

📚 Examples

Building an Image Search Engine

import replicate
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

class ImageSearchEngine:
    def __init__(self):
        self.database = []  # List of (path, embedding) tuples

    def add_image(self, image_path):
        """Add image to database"""
        output = replicate.run(
            "inahip/image-embedding",
            input={"image": open(image_path, "rb"), "normalize": True}
        )
        self.database.append((image_path, output["embedding"]))

    def search(self, query_image_path, top_k=5):
        """Search for similar images"""
        # Get query embedding
        output = replicate.run(
            "inahip/image-embedding",
            input={"image": open(query_image_path, "rb"), "normalize": True}
        )
        query_emb = output["embedding"]

        # Compute similarities
        db_embeddings = [emb for _, emb in self.database]
        similarities = cosine_similarity([query_emb], db_embeddings)[0]

        # Get top-k results
        top_indices = np.argsort(similarities)[::-1][:top_k]
        results = [
            (self.database[idx][0], similarities[idx])
            for idx in top_indices
        ]

        return results

# Usage
search_engine = ImageSearchEngine()

# Add images to database
for img in ["db_img1.jpg", "db_img2.jpg", "db_img3.jpg"]:
    search_engine.add_image(img)

# Search
results = search_engine.search("query.jpg", top_k=3)
for img_path, similarity in results:
    print(f"{img_path}: {similarity:.4f}")

🌟 Tips for Best Results

  1. Use L2 Normalization for similarity comparisons
  2. Consistent Preprocessing: Process all images similarly
  3. Batch Operations: Process multiple images in parallel
  4. Cache Embeddings: Store embeddings in a database
  5. Quality Images: Higher quality = better embeddings
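Caching (tip 4) can be as simple as a JSON file keyed by content hash, so repeated runs over the same images skip the API call entirely. A minimal sketch; the cache path and helper names are illustrative:

```python
import hashlib
import json
import os

CACHE_PATH = "embedding_cache.json"  # illustrative location

def _file_key(path):
    """Key cache entries by content hash so renamed files still hit the cache."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def get_embedding(path, compute_fn):
    """Return a cached embedding, or compute one with compute_fn and store it."""
    cache = {}
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            cache = json.load(f)
    key = _file_key(path)
    if key not in cache:
        cache[key] = compute_fn(path)
        with open(CACHE_PATH, "w") as f:
            json.dump(cache, f)
    return cache[key]

# emb = get_embedding("photo.jpg", lambda p: replicate.run(
#     "inahip/image-embedding", input={"image": open(p, "rb"), "normalize": True}
# )["embedding"])
```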

📞 Support

For issues or questions:

  • Check the Replicate documentation
  • Join the Replicate Discord
  • Visit the model page: https://replicate.com/inahip/image-embedding

📄 License

Note: License information will be added soon.


Built with ❤️ using Cog and deployed on Replicate