WD SwinV2 Tagger v3

Property	Value
Parameter Count	98M
License	Apache-2.0
Tensor Type	F32
Frameworks / Formats	timm, ONNX, Safetensors

What is wd-swinv2-tagger-v3?

WD SwinV2 Tagger v3 is an image tagging model for anime and manga-style illustrations. It is built on the SwinV2 architecture and was trained on Danbooru images up to image ID 7220105.

Implementation Details

The model uses frequency-based loss scaling to counter the class imbalance between common and rare tags. It was trained on Danbooru images with IDs modulo 0000-0899 and validated on images with IDs modulo 0950-0999. It reaches an F1 score of 0.4541 at a threshold of 0.2653, an improvement over previous versions.
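The ID-based split described above can be sketched in a few lines. This is only an illustration: the function name `split_for_id` is hypothetical, and the divisor of 1000 is an assumption inferred from the quoted ranges (0000-0899 and 0950-0999), not something the text states.

```python
def split_for_id(image_id: int, modulus: int = 1000) -> str:
    """Return 'train', 'val', or 'unused' for a Danbooru image ID.

    Assumption: "IDs modulo 0000-0899" means image_id % 1000 lies in 0..899.
    """
    remainder = image_id % modulus
    if remainder <= 899:
        return "train"
    if 950 <= remainder <= 999:
        return "val"
    # Remainders 900-949 are not assigned to either split in the text.
    return "unused"


if __name__ == "__main__":
    for image_id in (7220105, 1950, 1920):
        print(image_id, split_for_id(image_id))
```

Splitting on the ID remainder keeps the assignment deterministic, so an image always lands in the same split no matter when the dataset is re-downloaded.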

Trained only on images with at least 10 general tags
Tags kept only if they appear on at least 600 images
Compatible with timm; the ONNX model requires ONNX Runtime >= 1.17.0
Flexible batch size for inference
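The two dataset filters in the list above (10+ general tags per image, 600+ images per tag) could be applied roughly as follows. This is a minimal sketch under stated assumptions: the data layout (a dict from image ID to a set of general tags), the function name `filter_dataset`, and the order in which the two filters run are all hypothetical and not taken from the actual training pipeline.

```python
from collections import Counter

MIN_GENERAL_TAGS = 10  # from the text: images need 10+ general tags
MIN_TAG_IMAGES = 600   # from the text: tags need 600+ images


def filter_dataset(images, min_general_tags=MIN_GENERAL_TAGS,
                   min_tag_images=MIN_TAG_IMAGES):
    """Filter images and tags by the two thresholds.

    images: dict mapping image ID -> set of general tags (hypothetical layout).
    Returns (filtered_images, vocabulary).
    """
    # Step 1 (assumed order): drop images with too few general tags.
    kept = {
        image_id: tags
        for image_id, tags in images.items()
        if len(tags) >= min_general_tags
    }
    # Step 2: count how many of the remaining images carry each tag.
    counts = Counter(tag for tags in kept.values() for tag in tags)
    vocabulary = {tag for tag, n in counts.items() if n >= min_tag_images}
    # Step 3: restrict each image's labels to the surviving vocabulary.
    filtered = {image_id: tags & vocabulary for image_id, tags in kept.items()}
    return filtered, vocabulary


if __name__ == "__main__":
    toy = {1: {"a", "b", "c"}, 2: {"a", "b"}, 3: {"a", "c", "d"}}
    # Tiny thresholds so the toy data exercises both filters.
    print(filter_dataset(toy, min_general_tags=3, min_tag_images=2))
```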

Core Capabilities

Ratings classification
Character recognition
General tag generation
Batch inference support
Cross-platform compatibility
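To show how the three output groups might be read from a batch of predictions, here is a small post-processing sketch. It assumes the model returns one probability per label and that each label carries a category of "rating", "character", or "general"; that layout, the helper names `decode` and `decode_batch`, picking the rating by highest score, and reusing the 0.2653 threshold for both character and general tags are all assumptions made for illustration.

```python
THRESHOLD = 0.2653  # threshold quoted in the text


def decode(probs, labels, threshold=THRESHOLD):
    """Group one image's probabilities into rating, character, and general tags.

    probs:  list of floats, one per label.
    labels: list of (tag_name, category) pairs (hypothetical layout).
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")

    # Rating: take the single highest-scoring rating label (assumed behaviour).
    rating_scores = {
        name: p for p, (name, cat) in zip(probs, labels) if cat == "rating"
    }
    result = {
        "rating": max(rating_scores, key=rating_scores.get) if rating_scores else None,
        "character": [],
        "general": [],
    }

    # Character and general tags: keep everything at or above the threshold.
    for p, (name, cat) in zip(probs, labels):
        if cat in ("character", "general") and p >= threshold:
            result[cat].append((name, p))

    # Most confident tags first.
    for cat in ("character", "general"):
        result[cat].sort(key=lambda item: item[1], reverse=True)
    return result


def decode_batch(batch_probs, labels, threshold=THRESHOLD):
    """Apply decode() to every row of a batch of predictions."""
    return [decode(row, labels, threshold) for row in batch_probs]


if __name__ == "__main__":
    labels = [
        ("general", "rating"), ("sensitive", "rating"),
        ("1girl", "general"), ("solo", "general"),
        ("hatsune_miku", "character"),
    ]
    print(decode_batch([[0.7, 0.2, 0.9, 0.1, 0.5]], labels))
```

In practice the threshold is a tuning knob: raising it trades recall for precision, so a moderation system and a search index may want different values.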

Frequently Asked Questions

Q: What makes this model unique?

The model combines frequency-based loss scaling, which addresses class imbalance, with training data covering Danbooru images up to ID 7220105, which makes it well suited to tagging anime and manga content. It can be run through timm or ONNX Runtime and supports batch inference.

Q: What are the recommended use cases?

The model suits automated tagging of anime and manga collections, content moderation systems, and large-scale media organization. It is most useful in systems that need character recognition alongside general content classification.