eiby777/manga_globes

Detect and classify speech bubbles in manga images

Run time and cost

This model runs on CPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This ONNX object detection model, exported from Darknet YOLO, identifies and classifies speech bubbles (globes) in manga images. It detects five types of speech bubbles:

normal: Standard speech bubbles
scream: Exclamation or shouting bubbles
touched: Bubbles with motion lines or emphasis
think: Thought bubbles
sentence: Narrative text boxes
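
As a rough illustration of how a detection's class scores could be turned into one of the five labels above, here is a minimal sketch. The index order in CLASS_NAMES and the decode_class helper are assumptions made for this example; the actual order depends on the training configuration, which this readme does not state.

```python
# Assumed label order (hypothetical): the real index-to-name mapping is
# defined by the training configuration and is not given in this readme.
CLASS_NAMES = ["normal", "scream", "touched", "think", "sentence"]


def decode_class(class_scores):
    """Return (label, score) for the highest-scoring class of one detection."""
    if len(class_scores) != len(CLASS_NAMES):
        raise ValueError("expected one score per class")
    # Index of the largest score; ties resolve to the lowest index.
    best = max(range(len(class_scores)), key=lambda i: class_scores[i])
    return CLASS_NAMES[best], class_scores[best]
```

For example, decode_class([0.1, 0.7, 0.05, 0.1, 0.05]) picks the second entry, which is "scream" under the assumed order.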

The model takes an input image and returns bounding boxes with confidence scores for each detected speech bubble, scaled to the original image dimensions. It uses ONNX Runtime for efficient CPU inference and includes post-processing with Non-Maximum Suppression (NMS) to eliminate overlapping detections.
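
The post-processing described above can be sketched in plain Python. This is not the model's actual code: the function names, the default thresholds (0.5 and 0.45), the corner-format boxes, the class-agnostic suppression, and the plain-resize scaling (no letterbox padding) are all assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, conf_threshold=0.5, iou_threshold=0.45):
    """Greedy NMS over a list of (box, score) pairs.

    Threshold defaults are illustrative assumptions, not the model's values.
    """
    # Drop low-confidence detections, then visit the rest best-first.
    candidates = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


def scale_box(box, net_size, image_size):
    """Map a box from network-input pixels to original-image pixels.

    Sizes are (width, height). Assumes a plain resize without letterboxing.
    """
    sx = image_size[0] / net_size[0]
    sy = image_size[1] / net_size[1]
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)
```

Raising conf_threshold trades recall for fewer false positives, while lowering iou_threshold suppresses more aggressively, which can merge bubbles that genuinely touch on a crowded page.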

Key Features:

Optimized for manga-style artwork
5-class classification of speech bubble types
Configurable confidence and IoU thresholds
CPU-based inference (no GPU required)
Fast processing intended for near-real-time applications (no measured performance figures are available yet)

Use Cases:

Manga translation and localization
Comic book analysis
Automated text extraction from Japanese comics
Content moderation for manga images