WD 1.4 MOAT Tagger V2

Property	Value
Framework	TF-Keras, ONNX
License	Apache-2.0
Paper	MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Validation F1 Score	0.6911

What is wd-v1-4-moat-tagger-v2?

The WD 1.4 MOAT Tagger V2 is an image tagging model built on the MOAT architecture, which alternates mobile convolution and attention blocks. Developed by SmilingWolf, it was trained on the Danbooru dataset to predict ratings, characters, and general tags for an image.

Implementation Details

The model was trained on a subset of the Danbooru dataset: images whose ID modulo 1000 falls in the range 0000-0899. Validation used images whose ID modulo 1000 falls in the range 0950-0999. Two filters were applied to the data:

Images with fewer than 10 general tags were excluded
Tags appearing in fewer than 600 images were filtered out

Training used TPUs provided by the TRC program. The reported validation F1 score of 0.6911 corresponds to a decision threshold of 0.3771.
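The split and filtering rules above can be sketched in Python as follows. This is only an illustration, not the author's training code: the `images` dictionary layout, the function names, and the order in which the two filters run are assumptions, and the modulus of 1000 is inferred from the 0000-0999 bucket range.

```python
from collections import Counter

MIN_GENERAL_TAGS = 10  # images with fewer general tags are excluded
MIN_TAG_IMAGES = 600   # tags appearing in fewer images are dropped


def assign_split(image_id):
    """Return 'train', 'val', or None from the image ID modulo 1000 (assumed modulus)."""
    bucket = image_id % 1000
    if bucket <= 899:
        return "train"
    if 950 <= bucket <= 999:
        return "val"
    return None  # buckets 900-949 are not described in the source


def filter_dataset(images, min_general_tags=MIN_GENERAL_TAGS,
                   min_tag_images=MIN_TAG_IMAGES):
    """Apply both filters to a hypothetical {image_id: set_of_general_tags} mapping.

    Assumption: images are filtered first, then tag frequencies are counted
    on the remaining images. The source does not state the order.
    """
    kept = {i: tags for i, tags in images.items()
            if len(tags) >= min_general_tags}
    counts = Counter(tag for tags in kept.values() for tag in tags)
    vocab = {tag for tag, n in counts.items() if n >= min_tag_images}
    # Restrict each image's tags to the surviving vocabulary.
    return {i: tags & vocab for i, tags in kept.items()}, vocab
```

Calling `assign_split` on each image ID and then `filter_dataset` on the training portion would give a dataset shaped like the one described, under the assumptions listed.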

Core Capabilities

Multi-category tagging support (ratings, characters, general tags)
Validation F1 score of 0.6911 at a threshold of 0.3771
Available in TF-Keras and ONNX formats
Coverage of a wide range of anime-style image content, reflecting its Danbooru training data
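To make the F1 figure concrete, here is a minimal Python sketch of how precision, recall, and F1 can be computed for multi-label predictions at a fixed threshold. It assumes micro-averaging over all image and tag pairs, which the source does not specify, and the function name and input layout are hypothetical.

```python
THRESHOLD = 0.3771  # decision threshold reported for this model


def f1_at_threshold(scores, labels, threshold=THRESHOLD):
    """Micro-averaged precision, recall, and F1 (averaging mode is an assumption).

    scores: list of per-image lists of predicted tag probabilities
    labels: same shape, with 1 where the tag is present and 0 otherwise
    """
    tp = fp = fn = 0
    for row_scores, row_labels in zip(scores, labels):
        for score, label in zip(row_scores, row_labels):
            predicted = score > threshold
            if predicted and label:
                tp += 1
            elif predicted and not label:
                fp += 1
            elif label:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Raising the threshold generally trades recall for precision, so a different threshold may suit applications that prefer fewer but more reliable tags.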

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its use of the MOAT architecture, which alternates mobile convolution and attention blocks. Trained on a filtered subset of the Danbooru dataset and reaching a validation F1 score of 0.6911, it is well suited to tagging anime-style artwork and illustrations.

Q: What are the recommended use cases?

The model is ideal for automated image tagging systems, particularly those dealing with anime-style artwork and illustrations. It's especially useful for content management systems, digital art platforms, and image organization tools that require accurate tag prediction.
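In a tagging pipeline of this kind, the model's raw per-tag scores still have to be turned into usable tags. The Python sketch below shows one possible post-processing step. The category labels (`"rating"`, `"character"`, `"general"`), the function name, the choice to keep only the highest-scoring rating, and the use of a single threshold for character and general tags are all assumptions made for illustration.

```python
def decode_predictions(scores, tag_names, tag_categories, threshold=0.3771):
    """Group per-tag scores into a rating, character tags, and general tags.

    scores, tag_names, and tag_categories are parallel sequences; the
    category strings used here are hypothetical labels.
    """
    best_rating, best_score = None, float("-inf")
    selected = {"character": [], "general": []}
    for score, name, category in zip(scores, tag_names, tag_categories):
        if category == "rating":
            # Assumption: report only the single most likely rating.
            if score > best_score:
                best_rating, best_score = name, score
        elif category in selected and score > threshold:
            selected[category].append((score, name))
    return {
        "rating": best_rating,
        # Highest-scoring tags first.
        "characters": [n for _, n in sorted(selected["character"], reverse=True)],
        "general": [n for _, n in sorted(selected["general"], reverse=True)],
    }
```

A content management system could store the returned dictionary alongside each image and index the `general` and `characters` lists for search.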
