WD EVA02-Large Tagger v3

Property	Value
Parameter Count	315M
License	Apache-2.0
Tensor Type	F32
Frameworks / Formats	timm, ONNX, Safetensors

What is wd-eva02-large-tagger-v3?

The WD EVA02-Large Tagger v3 is an image tagging model that predicts rating, character, and general tags for anime-style illustrations. It was developed by SmilingWolf and trained on TPUs provided by the TRC program.

Implementation Details

The model was trained on Danbooru images, with the train/validation split determined by image ID modulo: images whose ID modulo falls in 0000-0899 were used for training, and those in 0950-0999 for validation. It reaches a validation Macro-F1 score of 0.4772 at a decision threshold of 0.5296.
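The ID-based split described above can be sketched as follows. This is a minimal illustration, not the author's training code: it assumes the bucket is `image_id % 1000` (the text gives only the ranges, not the modulus), and `assign_split` is a hypothetical helper name.

```python
from typing import Optional

# Assumption: buckets are image_id % 1000. The source states only the
# ranges 0000-0899 (training) and 0950-0999 (validation).
NUM_BUCKETS = 1000


def assign_split(image_id: int) -> Optional[str]:
    """Return "train", "val", or None for an image ID (hypothetical helper)."""
    bucket = image_id % NUM_BUCKETS
    if 0 <= bucket <= 899:
        return "train"
    if 950 <= bucket <= 999:
        return "val"
    # Buckets 900-949 are in neither range given in the text.
    return None


# Arbitrary example IDs, chosen only to hit each branch.
for example_id in (1000105, 1234950, 42900):
    print(example_id, assign_split(example_id))
```

Splitting on a deterministic function of the ID keeps the assignment reproducible without storing a separate split file.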

Dataset coverage up to February 2024
Images with fewer than 10 general tags were filtered out
Tags with fewer than 600 images were filtered out
Compatible with the timm framework
ONNX model supports batch inference
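The two dataset filters listed above can be sketched as below. This is an illustration under stated assumptions, not the actual preprocessing pipeline: the input layout (image ID mapped to tag lists per category), the function name `filter_dataset`, and the order of the two steps (images first, then tag counts over the remaining images) are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

ImageTags = Dict[str, List[str]]  # e.g. {"general": [...], "character": [...]}


def filter_dataset(
    images: Dict[int, ImageTags],
    min_general_tags: int = 10,
    min_images_per_tag: int = 600,
) -> Tuple[Dict[int, ImageTags], Set[str]]:
    """Drop sparsely tagged images, then keep only sufficiently frequent tags."""
    # Step 1: keep images with at least `min_general_tags` distinct general tags.
    kept = {
        image_id: tags
        for image_id, tags in images.items()
        if len(set(tags.get("general", []))) >= min_general_tags
    }
    # Step 2: count, for each tag, how many kept images carry it.
    counts: Counter = Counter()
    for tags in kept.values():
        counts.update({name for names in tags.values() for name in names})
    vocab = {name for name, n in counts.items() if n >= min_images_per_tag}
    return kept, vocab


# Toy data with lowered thresholds so the effect is visible.
toy = {
    1: {"general": ["a", "b", "c"], "character": ["x"]},
    2: {"general": ["a"], "character": ["x"]},
    3: {"general": ["a", "b", "d"]},
}
kept_images, tag_vocab = filter_dataset(toy, min_general_tags=2, min_images_per_tag=2)
print(sorted(kept_images), sorted(tag_vocab))
```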

Core Capabilities

Multi-category tagging (ratings, characters, general tags)
Batch processing
ONNX inference with ONNX Runtime (≥ 1.17.0)
Usable from multiple frameworks (JAX, timm, ONNX)
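To illustrate multi-category tagging, the sketch below post-processes one image's output probabilities into a rating, character tags, and general tags. Several things here are assumptions rather than facts from this document: the category codes (0 general, 4 character, 9 rating, following the Danbooru convention), the choice of taking the single highest-scoring rating, the reuse of the 0.5296 validation threshold for both characters and general tags, and the tag names, which are purely illustrative.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Assumed category codes (Danbooru convention); verify against the model's tag list.
GENERAL, CHARACTER, RATING = 0, 4, 9


def split_predictions(
    probs: Sequence[float],
    labels: List[Tuple[str, int]],
    threshold: float = 0.5296,
) -> Dict[str, object]:
    """Group per-tag probabilities by category (hypothetical helper)."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    ratings: Dict[str, float] = {}
    characters: Dict[str, float] = {}
    general: Dict[str, float] = {}
    for prob, (name, category) in zip(probs, labels):
        if category == RATING:
            ratings[name] = prob  # ratings are compared against each other
        elif category == CHARACTER and prob >= threshold:
            characters[name] = prob
        elif category == GENERAL and prob >= threshold:
            general[name] = prob
    rating: Optional[str] = max(ratings, key=ratings.get) if ratings else None
    return {"rating": rating, "characters": characters, "general": general}


# Illustrative labels and scores, not real model output.
example_labels = [
    ("general", RATING),
    ("sensitive", RATING),
    ("hatsune_miku", CHARACTER),
    ("1girl", GENERAL),
    ("outdoors", GENERAL),
]
example_probs = [0.1, 0.8, 0.9, 0.95, 0.2]
result = split_predictions(example_probs, example_labels)
print(result)
```

In practice the threshold is a tunable trade-off: lowering it yields more tags at the cost of more false positives.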

Frequently Asked Questions

Q: What makes this model unique?

This model combines training on a large, filtered Danbooru dataset with support for several inference frameworks. Performance is reported as Macro-F1, which averages the per-tag F1 scores with equal weight, so rare tags count as much as frequent ones in the evaluation.
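The difference between Macro-F1 and a pooled (micro) F1 can be seen in a small sketch. This is a generic illustration of the metric, not the evaluation code used for the model. The helper names are hypothetical, and scoring a tag with no positives and no predictions as 0.0 is a convention chosen here.

```python
from typing import List, Sequence


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def per_tag_counts(y_true: Sequence[Sequence[int]],
                   y_prob: Sequence[Sequence[float]],
                   threshold: float) -> List[tuple]:
    """Return (tp, fp, fn) for each tag column."""
    counts = []
    for t in range(len(y_true[0])):
        tp = fp = fn = 0
        for truth_row, prob_row in zip(y_true, y_prob):
            predicted = prob_row[t] >= threshold
            actual = bool(truth_row[t])
            tp += predicted and actual
            fp += predicted and not actual
            fn += (not predicted) and actual
        counts.append((tp, fp, fn))
    return counts


def macro_f1(y_true, y_prob, threshold: float) -> float:
    """Average of per-tag F1 scores: every tag has equal weight."""
    counts = per_tag_counts(y_true, y_prob, threshold)
    return sum(f1(*c) for c in counts) / len(counts)


def micro_f1(y_true, y_prob, threshold: float) -> float:
    """F1 over pooled counts: frequent tags dominate."""
    counts = per_tag_counts(y_true, y_prob, threshold)
    return f1(*(sum(col) for col in zip(*counts)))


# Tag 0 is frequent and predicted perfectly; tag 1 is rare and missed.
truth = [[1, 0], [1, 0], [1, 0], [1, 1]]
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.9, 0.3]]
macro = macro_f1(truth, probs, threshold=0.5)
micro = micro_f1(truth, probs, threshold=0.5)
print(f"macro={macro:.3f} micro={micro:.3f}")
```

In this toy case the missed rare tag halves the macro score while barely moving the micro score, which is why Macro-F1 is the stricter gauge for a vocabulary with many infrequent tags.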

Q: What are the recommended use cases?

The model is ideal for automated image tagging systems, content classification, and large-scale image database organization. It's particularly well-suited for applications requiring detailed anime and illustration content analysis.