wd-vit-large-tagger-v3

Maintained by: SmilingWolf

WD ViT-Large Tagger v3

Property         Value
Parameter Count  315M
Model Type       Vision Transformer
License          Apache 2.0
Tensor Type      F32
Framework        timm, ONNX, Safetensors

What is wd-vit-large-tagger-v3?

WD ViT-Large Tagger v3 is an image tagging model built on the Vision Transformer (ViT-Large) architecture. Trained on the Danbooru dataset, it predicts rating, character, and general tags for anime and manga-style images. Compared with earlier releases in the series, v3 improves compatibility with the timm library and adds batch inference for the ONNX export.

Implementation Details

The model was trained using the JAX-CV framework on TPUs provided by the TRC program. The training data consists of Danbooru images up to ID 7220105, split by image ID: images whose ID modulo falls in the range 0000-0899 form the training set, and images whose ID modulo falls in 0950-0999 form the validation set. The modulus itself is not stated; the ranges suggest the ID modulo 1000.
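The ID-based split described above can be sketched in a few lines of Python. This is an illustration, not the author's training code: the modulus of 1000 is an assumption inferred from the stated ranges, the function name `assign_split` is hypothetical, and the source does not say how IDs in the 0900-0949 gap are used, so the sketch simply labels them "unused".

```python
# Hypothetical sketch of the ID-modulo split; not taken from the training code.
MAX_IMAGE_ID = 7220105   # last Danbooru ID covered, per the model description
MODULUS = 1000           # ASSUMPTION: the source only gives the ranges 0000-0899 / 0950-0999


def assign_split(image_id: int, modulus: int = MODULUS, max_id: int = MAX_IMAGE_ID) -> str:
    """Return 'train', 'validation', 'unused', or 'out_of_range' for an image ID."""
    if image_id < 0 or image_id > max_id:
        return "out_of_range"
    bucket = image_id % modulus
    if bucket <= 899:
        return "train"
    if bucket >= 950:
        return "validation"
    # Buckets 900-949 are not mentioned in the source; treated as unused here.
    return "unused"


if __name__ == "__main__":
    for example_id in (7220105, 1234950, 900, 9999999):
        print(example_id, assign_split(example_id))
```

Splitting on the ID remainder keeps the assignment deterministic, so the same image always lands in the same set as the dataset grows.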

  • Achieves F1 score of 0.4674 at threshold 0.2606
  • Supports batch inference in ONNX format
  • Requires onnxruntime >= 1.17.0
  • Compatible with timm library for easy integration

Core Capabilities

  • Comprehensive tagging support for ratings, characters, and general tags
  • Filtered training on high-quality data (10+ general tags per image)
  • Tag coverage for items with 600+ image examples
  • Updated tag database through February 2024
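The two dataset filters listed above (at least 10 general tags per image, and at least 600 images per tag) could be expressed roughly as follows. This is a sketch under assumptions: the function name `filter_dataset`, the input layout (a dict of image ID to general tags), and the order in which the two filters are applied are not from the source.

```python
from collections import Counter

MIN_GENERAL_TAGS = 10     # images with fewer general tags are dropped
MIN_IMAGES_PER_TAG = 600  # tags seen on fewer images are left out of the vocabulary


def filter_dataset(images, min_general_tags=MIN_GENERAL_TAGS,
                   min_images_per_tag=MIN_IMAGES_PER_TAG):
    """Return (kept_images, tag_vocabulary) after applying both filters.

    `images` maps an image ID to its list of general tags (hypothetical layout).
    ASSUMPTION: images are filtered first, then tag counts are taken over
    the surviving images.
    """
    kept = {image_id: tags for image_id, tags in images.items()
            if len(set(tags)) >= min_general_tags}
    # Count each tag at most once per image.
    counts = Counter(tag for tags in kept.values() for tag in set(tags))
    vocabulary = {tag for tag, count in counts.items() if count >= min_images_per_tag}
    return kept, vocabulary


if __name__ == "__main__":
    # Tiny thresholds so the toy data exercises both filters.
    toy = {1: ["a", "b", "c"], 2: ["a", "b"], 3: ["a"]}
    kept, vocab = filter_dataset(toy, min_general_tags=2, min_images_per_tag=2)
    print(sorted(kept), sorted(vocab))
```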

Frequently Asked Questions

Q: What makes this model unique?

The model pairs a Vision Transformer backbone with training on anime and manga-style images from Danbooru. Compared to previous versions, it offers improved batch processing and broader framework compatibility (timm, ONNX, Safetensors). Its training data is filtered to images with at least 10 general tags, and its vocabulary is limited to tags with at least 600 examples.

Q: What are the recommended use cases?

The model is suited to automated tagging of anime and manga-style images, content organization, and database management. Its batch inference support makes it practical for large-scale image tagging workloads.
