kojott/content-moderation-vision

AI-powered content moderation for images using MiniCPM-V-2.6 - analyzes visual content and returns structured safety scores with detailed classifications

Content Moderation Vision AI

An advanced content moderation system powered by MiniCPM-V-2.6 that analyzes images for inappropriate content and returns structured safety assessments.

Features

  • Intelligent Content Analysis: Uses state-of-the-art vision-language model MiniCPM-V-2.6
  • Structured JSON Responses: Returns detailed safety scores (0-4) with classifications
  • Comprehensive Categories: Five-level scale (SAFE, ADULT_THEMES, NSFW, INAPPROPRIATE, HARMFUL) covering nudity, hate content, and violence
  • Custom Prompts: Supports both automated moderation and custom image analysis
  • GPU Optimized: Efficient CUDA acceleration with automatic memory management

Content Classifications

  • Score 0 (SAFE): Completely safe content - no concerns for any audience
  • Score 1 (ADULT_THEMES): Minor adult themes, such as revealing but clothed subjects
  • Score 2 (NSFW): Moderate concerns requiring age-appropriate context
  • Score 3 (INAPPROPRIATE): Explicit content, such as visible nudity or hate symbols
  • Score 4 (HARMFUL): Severely harmful content requiring immediate action
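
Downstream systems can act on these scores with a simple lookup table. Here is a minimal, hypothetical sketch in Python (the action names are illustrative, not part of the API):

# Hypothetical moderation actions keyed by the model's 0-4 safety score.
ACTIONS = {
    0: "approve",              # SAFE
    1: "approve_and_tag",      # ADULT_THEMES
    2: "age_gate",             # NSFW
    3: "remove",               # INAPPROPRIATE
    4: "remove_and_escalate",  # HARMFUL
}

def action_for(score: int) -> str:
    # Fail closed: unknown scores get the strictest action.
    return ACTIONS.get(score, "remove_and_escalate")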

API Usage

Content Moderation Mode (Default)

import replicate

output = replicate.run(
    "kojott/content-moderation-vision",
    input={"image": "https://example.com/image.jpg"}
)
print(output)  # Returns structured JSON

Custom Prompt Mode

output = replicate.run(
    "kojott/content-moderation-vision",
    input={
        "image": "https://example.com/image.jpg",
        "prompt": "Describe what you see in this image"
    }
)

Response Format

{
  "score": 0,
  "classification": "SAFE",
  "description": "A beautiful landscape photo showing mountains and trees",
  "concerns": [],
  "safe_for_children": true,
  "requires_restriction": false,
  "admin_notes": "Natural landscape content, completely appropriate"
}
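
A minimal sketch of consuming this response in Python. The field names come from the example above; the threshold policy and the assumption that the output may arrive as a JSON string are my own:

import json
import replicate

output = replicate.run(
    "kojott/content-moderation-vision",
    input={"image": "https://example.com/image.jpg"}
)

# Parse if the client returns the JSON as a string (assumption).
result = json.loads(output) if isinstance(output, str) else output

if result["requires_restriction"] or result["score"] >= 3:
    print("Blocked:", result["classification"], result["concerns"])
elif not result["safe_for_children"]:
    print("Age-gated:", result["description"])
else:
    print("Approved:", result["description"])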

Parameters

  • image (required): Image file to analyze
  • prompt (optional): Custom analysis prompt. If empty, uses content moderation mode
  • temperature (0.0-1.0): Controls randomness in generation (default: 0.1)
  • top_p (0.0-1.0): Nucleus sampling parameter (default: 0.9)
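
Putting the parameters together in one call (the values shown are illustrative; defaults are listed above):

output = replicate.run(
    "kojott/content-moderation-vision",
    input={
        "image": "https://example.com/image.jpg",
        "prompt": "",        # empty prompt -> content moderation mode
        "temperature": 0.1,  # low randomness for consistent scoring
        "top_p": 0.9
    }
)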

Technical Details

  • Base Model: MiniCPM-V-2.6 (OpenBMB)
  • Framework: PyTorch with CUDA acceleration
  • Memory: Optimized for GPU efficiency with automatic cleanup
  • Response Time: Typically 2-5 seconds per image
  • Supported Formats: JPEG, PNG, WebP, and other PIL-compatible formats
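
Because inputs are decoded with PIL, any PIL-readable file can be normalized before upload. A small sketch (file paths are placeholders):

from PIL import Image

# Convert an arbitrary PIL-readable image to an RGB JPEG.
img = Image.open("input.webp").convert("RGB")
img.save("normalized.jpg", "JPEG", quality=90)

# The local file can then be passed to replicate.run as a file handle:
# output = replicate.run(
#     "kojott/content-moderation-vision",
#     input={"image": open("normalized.jpg", "rb")}
# )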

Use Cases

  • Social Media Platforms: Automated content screening
  • E-commerce: Product image validation
  • Educational Platforms: Child-safe content verification
  • Community Forums: User-generated content moderation
  • Dating Apps: Profile photo screening

Model Performance

This model balances accuracy with practical deployment needs, avoiding over-censorship while effectively identifying genuinely harmful content. It is designed for real-world applications that call for nuanced, human-like judgment.

Built with reliability and production deployment in mind, featuring comprehensive error handling and fallback responses.