camie-tagger
| Property | Value |
|---|---|
| Initial Model Size | 214M parameters |
| Refined Model Size | 424M parameters |
| Model Type | Anime Image Tagger |
| Architecture | Two-stage prediction with EfficientNet V2-L backbone |
| Model URL | https://huggingface.co/Camais03/camie-tagger |
What is camie-tagger?
camie-tagger is a deep learning model for automatically tagging anime and manga illustrations. It uses a two-stage prediction approach to identify and classify content across 70,527 possible tags spanning seven categories. The model reports an overall F1 score of 61%, which makes it suitable for automated content analysis of anime-style artwork.
Implementation Details
The model combines an EfficientNet V2-L backbone with a two-stage prediction system. The initial stage performs direct classification through a multi-layer classifier, while the refined stage uses cross-attention to improve on those predictions. It can also be trained on consumer hardware (a single RTX 3060) through optimized DeepSpeed configurations and a specialized UnifiedFocalLoss function.
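The two stages described above can be sketched roughly as follows. This is a toy, standard-library-only illustration and not the model's actual code: the function names, the top-k tag selection, the additive fusion, the reuse of one classifier for both stages, and the `mode` values are all assumptions made for the example.

```python
import math
import random


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def two_stage_predict(features, classifier, tag_embeddings, top_k=3, mode="full"):
    """Return (initial_logits, refined_logits); refined is None in initial-only mode."""
    # Stage 1: direct classification from the image features.
    initial_logits = linear(classifier, features)
    if mode == "initial_only":  # hypothetical mode name
        return initial_logits, None

    # Assumption: the refined stage attends over the top_k stage-1 tags.
    order = sorted(range(len(initial_logits)),
                   key=lambda i: initial_logits[i], reverse=True)
    top = order[:top_k]

    # Single-head dot-product cross-attention:
    # query = image features, keys and values = selected tag embeddings.
    dim = len(features)
    scores = [sum(q * k for q, k in zip(features, tag_embeddings[i])) / math.sqrt(dim)
              for i in top]
    attn = softmax(scores)
    context = [sum(a * tag_embeddings[i][d] for a, i in zip(attn, top))
               for d in range(dim)]

    # Stage 2: classify again using features enriched with the tag context.
    fused = [f + c for f, c in zip(features, context)]
    refined_logits = linear(classifier, fused)
    return initial_logits, refined_logits


# Toy sizes; the real model has 70,527 tags and learned weights.
random.seed(0)
NUM_TAGS, DIM = 8, 4
classifier = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_TAGS)]
tag_embeddings = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_TAGS)]
features = [random.uniform(-1, 1) for _ in range(DIM)]

initial, refined = two_stage_predict(features, classifier, tag_embeddings)
print(len(initial), len(refined))
```

Calling `two_stage_predict(..., mode="initial_only")` skips the attention step entirely, which mirrors the idea of a cheaper initial-only prediction path.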
- Dual-mode operation supporting both full and initial-only predictions
- Comprehensive tag coverage across seven categories: general, character, copyright, artist, meta, rating, and year
- Efficient training implementation requiring only 12GB VRAM
- Specialized loss function handling extreme class imbalance
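To illustrate how a focal-style loss copes with extreme class imbalance, here is a minimal sketch of the standard binary focal loss. The definition of the model's own UnifiedFocalLoss is not given here, so this is the generic formulation rather than the author's implementation, and the `gamma` and `alpha` defaults are common illustrative values, not the model's settings.

```python
import math


def binary_focal_loss(prob, target, gamma=2.0, alpha=0.25):
    """Standard focal loss for one tag.

    prob   -- predicted probability that the tag is present
    target -- 1 if the tag is present, 0 otherwise
    gamma  -- focusing parameter; larger values down-weight easy examples more
    alpha  -- weight on positives (negatives get 1 - alpha)
    """
    eps = 1e-12  # guards against log(0)
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))


def mean_focal_loss(probs, targets, gamma=2.0, alpha=0.25):
    """Average the per-tag focal loss over all tags of one image."""
    losses = [binary_focal_loss(p, t, gamma, alpha) for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)


# An easy negative (absent tag, confidently rejected) versus
# a hard positive (present tag, mostly missed).
easy_negative = binary_focal_loss(prob=0.01, target=0)
hard_positive = binary_focal_loss(prob=0.10, target=1)
print(easy_negative < hard_positive)
```

With tens of thousands of tags, almost every tag is absent from any given image. The `(1 - p_t) ** gamma` factor shrinks the contribution of those easy negatives so that the rare, hard positives dominate the gradient.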
Core Capabilities
- Character Recognition: 74.7% F1 score across 26,968 character tags
- Copyright Detection: 78.5% F1 score for identifying source materials
- General Tag Classification: 60.8% F1 score across 30,841 general tags
- Multiple threshold profiles for precision-recall tradeoff optimization
- Windows compatibility through initial-only mode
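The precision-recall tradeoff behind threshold profiles can be shown with a small sketch. The profile names, threshold values, tag names, and probabilities below are hypothetical and chosen only for illustration; they are not the model's shipped profiles or outputs.

```python
def apply_thresholds(probs, categories, thresholds, default=0.5):
    """Keep the tags whose probability meets the threshold of their category."""
    return {tag for tag, p in probs.items()
            if p >= thresholds.get(categories[tag], default)}


def f1_score(predicted, actual):
    """F1 between a predicted tag set and the ground-truth tag set."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-category threshold profiles.
PROFILES = {
    "high_precision": {"general": 0.6, "character": 0.7},
    "balanced": {"general": 0.4, "character": 0.5},
    "high_recall": {"general": 0.2, "character": 0.3},
}

# Made-up model outputs for a single image.
probs = {"1girl": 0.92, "smile": 0.45, "outdoors": 0.25,
         "character_a": 0.65, "character_b": 0.35}
categories = {"1girl": "general", "smile": "general", "outdoors": "general",
              "character_a": "character", "character_b": "character"}
actual = {"1girl", "smile", "character_a"}

for name, thresholds in PROFILES.items():
    predicted = apply_thresholds(probs, categories, thresholds)
    print(name, sorted(predicted), round(f1_score(predicted, actual), 2))
```

On this toy data the strict profile keeps only `1girl` (F1 of 0.5), the loose profile admits two wrong tags (F1 of 0.75), and the middle profile happens to match the ground truth exactly (F1 of 1.0).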
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its two-stage prediction architecture, which reaches its reported accuracy without extensive computational resources: training fits on a single RTX 3060 with 12GB VRAM. It is also notable for handling an extremely large tag space (70,527 tags) while maintaining good performance across categories.
Q: What are the recommended use cases?
The model is ideal for automatic tagging of anime/manga artwork collections, content organization systems, and image databases. It's particularly effective for identifying characters, art styles, and visual elements in anime-style illustrations.