manga109_yolo

deepghs

YOLO-based manga element detection model with variants (nano to extra-large) for detecting body, face, frame and text in manga, achieving 88-92% F1 scores.

Property          Value
Model Type        YOLO (You Only Look Once)
Latest Version    v2023.12.07
Model Variants    Nano to Extra Large
Repository        Hugging Face

What is manga109_yolo?

manga109_yolo is a specialized computer vision model designed for manga image analysis. It utilizes the YOLO architecture to detect and classify four key elements in manga pages: body, face, frame, and text. The model comes in multiple variants optimized for different computational requirements, from lightweight nano versions to high-capacity extra-large versions.

Implementation Details

The model family includes five variants with different parameter counts and computational costs:

  • Extra Large: 258G FLOPs, 68.2M parameters
  • Large: 165G FLOPs, 43.6M parameters
  • Medium: 79.1G FLOPs, 25.9M parameters
  • Small: 28.7G FLOPs, 11.1M parameters
  • Nano: 8.2G FLOPs, 3.01M parameters

All variants reach F1 scores between 0.88 and 0.92.

  • Advanced object detection architecture using YOLO framework
  • Optimized for manga-specific element detection
  • Multiple model sizes for different deployment scenarios
  • High precision and recall across all variants
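As a rough illustration of choosing among the model sizes, the sketch below stores the FLOPs and parameter figures quoted above in a lookup table and returns the largest variant that fits a given budget. The function name `pick_variant` and the budget parameters are hypothetical helpers made up for this example; they are not part of the model or of any library.

```python
from typing import Optional

# (name, GFLOPs, parameters in millions), taken from the figures quoted above
# and ordered from largest to smallest.
VARIANTS = [
    ("Extra Large", 258.0, 68.2),
    ("Large", 165.0, 43.6),
    ("Medium", 79.1, 25.9),
    ("Small", 28.7, 11.1),
    ("Nano", 8.2, 3.01),
]


def pick_variant(max_params_m: float,
                 max_gflops: float = float("inf")) -> Optional[str]:
    """Return the largest variant within both budgets, or None if none fits.

    Hypothetical helper: the budgets are whatever the deployment allows.
    """
    for name, gflops, params_m in VARIANTS:
        if params_m <= max_params_m and gflops <= max_gflops:
            return name
    return None


if __name__ == "__main__":
    # A device that can hold about 30M parameters.
    print(pick_variant(30.0))
```

Because the table is ordered from largest to smallest, the first entry that satisfies both limits is also the highest-capacity one that fits.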

Core Capabilities

  • Body Detection: Identifies character bodies in manga panels
  • Face Detection: Locates and identifies character faces
  • Frame Detection: Recognizes panel boundaries and layout elements
  • Text Detection: Identifies text areas including speech bubbles and captions
  • High Performance: Achieves mAP50 scores of up to 0.95 on benchmark datasets
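The mAP50 metric mentioned above counts a predicted box as correct when its intersection-over-union (IoU) with a ground-truth box reaches 0.5. The sketch below shows that overlap test only, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates; the box format and function names are assumptions for illustration, and the full mAP computation (ranking by confidence, averaging precision per class) is not shown.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred: Box, truth: Box) -> bool:
    """True when the prediction overlaps the ground truth with IoU >= 0.5."""
    return iou(pred, truth) >= 0.5


if __name__ == "__main__":
    predicted_face = (0.0, 0.0, 10.0, 10.0)
    labelled_face = (0.0, 0.0, 10.0, 8.0)
    print(iou(predicted_face, labelled_face))
    print(is_match_at_50(predicted_face, labelled_face))
```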

Frequently Asked Questions

Q: What makes this model unique?

The model's specialization in manga content analysis, combined with its variety of deployment options from nano to extra-large variants, makes it uniquely versatile for manga processing applications. Its high F1 scores (0.88-0.92) across all variants demonstrate robust performance regardless of model size.
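For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The short sketch below computes it; the input values are illustrative only and are not measurements of any manga109_yolo variant.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative values, not taken from the model's reported results.
    print(f1_score(0.93, 0.87))
```

Because it is a harmonic mean, F1 is pulled toward the lower of the two inputs, so a high score requires both precision and recall to be high.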

Q: What are the recommended use cases?

This model is ideal for manga digitization projects, content analysis, automated translation preprocessing, and manga research applications. The different variants allow users to choose between computational efficiency and maximum accuracy based on their specific needs.
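As one possible shape for a translation-preprocessing step, the sketch below groups detections by class and keeps only those above a confidence threshold, so that the text regions can be handed to a later stage. The `Detection` record, its fields, and the threshold value are assumptions made for this example; they do not describe the model's actual output format.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Detection(NamedTuple):
    """Hypothetical detection record; not the model's real output type."""
    label: str                               # "body", "face", "frame" or "text"
    confidence: float                        # score in [0, 1]
    box: Tuple[float, float, float, float]   # assumed (x1, y1, x2, y2)


def group_by_label(detections: List[Detection],
                   min_confidence: float = 0.5) -> Dict[str, List[Detection]]:
    """Drop low-confidence detections and group the rest by class label."""
    groups: Dict[str, List[Detection]] = defaultdict(list)
    for det in detections:
        if det.confidence >= min_confidence:
            groups[det.label].append(det)
    return dict(groups)


if __name__ == "__main__":
    # Made-up detections for a single page.
    page = [
        Detection("text", 0.91, (40, 30, 120, 90)),
        Detection("text", 0.32, (200, 10, 230, 40)),
        Detection("face", 0.88, (60, 120, 110, 170)),
        Detection("frame", 0.97, (0, 0, 300, 200)),
    ]
    text_regions = group_by_label(page).get("text", [])
    print([det.box for det in text_regions])
```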
