wd-swinv2-tagger-v3

Maintained by: SmilingWolf

WD SwinV2 Tagger v3

Property         Value
Parameter Count  98M
License          Apache-2.0
Tensor Type      F32
Framework        timm, ONNX, Safetensors

What is wd-swinv2-tagger-v3?

WD SwinV2 Tagger v3 is an image tagging model for anime and manga content. It is built on the SwinV2 architecture and was trained on Danbooru images up to image ID 7220105.

Implementation Details

The model uses frequency-based loss scaling to counter class imbalance among tags. It was trained on Danbooru images with IDs modulo 0000-0899 and validated on images with IDs modulo 0950-0999 (presumably the image ID modulo 1000). It reaches an F1 score of 0.4541 at a threshold of 0.2653, which is reported as an improvement over previous versions.
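The train/validation split described above can be sketched in a few lines. This is only an illustration: it assumes that "IDs modulo" means the image ID modulo 1000, and the function name `assign_split` and the label `"unused"` for the 900-949 range are hypothetical, since this section does not say what happens to those images.

```python
def assign_split(image_id: int) -> str:
    """Assign a Danbooru image ID to a split by its last three digits."""
    bucket = image_id % 1000  # assumption: "modulo" means ID mod 1000
    if bucket <= 899:
        return "train"   # buckets 0000-0899
    if bucket >= 950:
        return "val"     # buckets 0950-0999
    return "unused"      # buckets 0900-0949: not described in this section


for image_id in (7220105, 1234950, 1234920):
    print(image_id, assign_split(image_id))
```

Splitting on the ID remainder keeps the assignment deterministic, so an image lands in the same split every time the dataset is rebuilt.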

  • Trained only on images with 10 or more general tags
  • Only tags that appear on 600 or more images were kept
  • Compatible with timm and ONNX Runtime (>= 1.17.0)
  • Supports batch processing
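The two filtering rules in the list above (10+ general tags per image, 600+ images per tag) could be applied roughly as follows. This is a minimal sketch, not the author's pipeline: the input layout (a dict from image ID to a set of general tags), the function name `filter_dataset`, and the order of the two steps are all assumptions.

```python
from collections import Counter

MIN_GENERAL_TAGS = 10  # from the text: images need 10+ general tags
MIN_TAG_IMAGES = 600   # from the text: tags need 600+ images


def filter_dataset(image_tags, min_general_tags=MIN_GENERAL_TAGS,
                   min_tag_images=MIN_TAG_IMAGES):
    """image_tags maps image ID -> set of general tags (hypothetical layout)."""
    # Step 1: keep only images that carry enough general tags.
    kept_images = {
        image_id: tags
        for image_id, tags in image_tags.items()
        if len(tags) >= min_general_tags
    }
    # Step 2: count how many kept images each tag appears on.
    counts = Counter(tag for tags in kept_images.values() for tag in tags)
    # Step 3: keep only tags that appear on enough images.
    kept_tags = {tag for tag, n in counts.items() if n >= min_tag_images}
    return kept_images, kept_tags


# Toy data with lowered thresholds so the effect is visible.
toy = {1: {"a", "b", "c"}, 2: {"a", "b"}, 3: {"a"}}
images, tags = filter_dataset(toy, min_general_tags=2, min_tag_images=2)
print(sorted(images), sorted(tags))
```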

Core Capabilities

  • Ratings classification
  • Character recognition
  • General tag generation
  • Batch inference support
  • Cross-platform compatibility
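To illustrate how the three output types listed above might be separated after inference, the sketch below groups per-tag scores into a rating, character tags, and general tags. The label list, the category names, and the function `group_predictions` are hypothetical. Picking the highest-scoring rating and reusing the 0.2653 threshold for character tags are assumptions made for this example, not documented behavior of the model.

```python
GENERAL_THRESHOLD = 0.2653  # threshold quoted in this document


def group_predictions(labels, probs, threshold=GENERAL_THRESHOLD):
    """Group scores by category.

    labels: list of (tag_name, category) pairs (hypothetical layout).
    probs:  per-tag scores in the same order as labels.
    """
    if len(labels) != len(probs):
        raise ValueError("labels and probs must have the same length")
    ratings, characters, general = {}, {}, {}
    for (name, category), score in zip(labels, probs):
        if category == "rating":
            ratings[name] = score          # all ratings kept for the argmax below
        elif category == "character" and score >= threshold:
            characters[name] = score
        elif category == "general" and score >= threshold:
            general[name] = score
    # Assumption: ratings are treated as mutually exclusive, so take the best one.
    rating = max(ratings, key=ratings.get) if ratings else None
    return rating, characters, general


# Made-up labels and scores for one image.
labels = [
    ("general", "rating"),
    ("sensitive", "rating"),
    ("example_character", "character"),
    ("1girl", "general"),
    ("outdoors", "general"),
]
probs = [0.81, 0.15, 0.92, 0.97, 0.10]
rating, characters, general = group_predictions(labels, probs)
print(rating, characters, general)
```

For batch inference the same function would be called once per row of the score matrix.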

Frequently Asked Questions

Q: What makes this model unique?

The model combines frequency-based loss scaling, which targets class imbalance, with training data that covers Danbooru images up to ID 7220105. This makes it well suited to anime/manga content tagging. It can be used through timm or ONNX and supports batch processing.

Q: What are the recommended use cases?

The model suits automated content tagging in anime/manga collections, content moderation systems, and large-scale media organization tasks. It is most useful in systems that need character recognition and general content classification.
