DanTagGen-beta
| Property | Value |
|---|---|
| Architecture | LLaMA (400M parameters) |
| License | OpenRAIL |
| Training Data | 5.3M Danbooru entries |
| Language | English |
What is DanTagGen-beta?
DanTagGen-beta is a language model that generates detailed image tags in the Danbooru style. Built on a 400M-parameter LLaMA architecture, it improves on its alpha predecessor with more stable and more coherent tag generation, even when given minimal input.
Implementation Details
The model uses a custom LLaMA architecture (dubbed NanoLLaMA) and is compatible with various LLaMA inference interfaces. It was trained from scratch for 10 epochs on 5.3M data points, which amounts to roughly 6-12B tokens seen during training. It is available in FP16 GGUF format and in 8-bit and 6-bit quantized versions.
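As a rough sanity check on those figures, the sketch below derives the implied average token count per entry and a nominal weight size for each precision. The size estimate is an assumption: it multiplies parameter count by nominal bits per weight and ignores quantization block overhead and file metadata, so real GGUF files will be somewhat larger.

```python
# Figures quoted in the text above.
PARAMS = 400e6             # model parameters
ENTRIES = 5.3e6            # training entries
EPOCHS = 10                # training epochs
TOKENS_SEEN = (6e9, 12e9)  # lower and upper bound on tokens seen


def tokens_per_entry(total_tokens, entries=ENTRIES, epochs=EPOCHS):
    """Average tokens per entry per epoch implied by a total token count."""
    return total_tokens / (entries * epochs)


def weight_size_mb(bits_per_weight, params=PARAMS):
    """Nominal weight size in MB (assumption: no quantization overhead)."""
    return params * bits_per_weight / 8 / 1e6


low, high = (tokens_per_entry(t) for t in TOKENS_SEEN)
print(f"{low:.0f}-{high:.0f} tokens per entry per epoch")  # 113-226

for bits in (16, 8, 6):
    print(f"{bits}-bit: ~{weight_size_mb(bits):.0f} MB")   # 800, 400, 300
```

An average of roughly 113-226 tokens per entry is consistent with training on tag lists rather than long-form text.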
- Trained on a dataset filtered by favorite-count percentiles
- Uses a standardized input format with rating, artist, characters, and aspect ratio fields
- Supports both short and long-form tag generation
- Compatible with llama.cpp and llama-cpp-python for efficient inference
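A minimal sketch of how the points above might fit together with llama-cpp-python. The field labels, their order, the trailing `general:` line, the sampling settings, and the `n_ctx` value are all assumptions made for illustration; consult the published model card for the exact prompt template. `build_prompt` and `generate_tags` are hypothetical helper names, and the model path is a placeholder.

```python
def build_prompt(rating, artist, characters, aspect_ratio, general_tags):
    """Assemble an input prompt from the fields named in the list above.

    Assumption: the "label: value" layout and the final "general:" line
    are illustrative, not the confirmed template.
    """
    lines = [
        f"rating: {rating}",
        f"artist: {artist}",
        f"characters: {', '.join(characters)}",
        f"aspect ratio: {aspect_ratio:.1f}",
        f"general: {', '.join(general_tags)}",
    ]
    return "\n".join(lines)


def generate_tags(model_path, prompt, max_tokens=128):
    """Run the GGUF model and split its continuation into individual tags."""
    # Imported lazily so the prompt builder works without llama-cpp-python.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=384, verbose=False)
    out = llm(prompt, max_tokens=max_tokens, temperature=0.8)
    text = out["choices"][0]["text"]
    return [tag.strip() for tag in text.split(",") if tag.strip()]


prompt = build_prompt(
    rating="safe",
    artist="",
    characters=["hatsune miku"],
    aspect_ratio=1.0,
    general_tags=["1girl", "solo"],
)
print(prompt)
```

With a downloaded GGUF file, a call such as `generate_tags("path/to/model.gguf", prompt)` would return the generated tags as a list. Starting from only a couple of general tags reflects the minimal-input use case the model is described as handling.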
Core Capabilities
- Generates comprehensive image tags from minimal input prompts
- Handles multiple aspects including character features, compositions, and artistic elements
- Produces more coherent and detailed tags than the alpha version
- Supports various image contexts and art styles
Frequently Asked Questions
Q: What makes this model unique?
DanTagGen-beta stands out for its ability to generate detailed, coherent tag sets from minimal input. It was trained on a filtered dataset of 5.3M entries, and its compact NanoLLaMA architecture keeps inference efficient without sacrificing output quality.
Q: What are the recommended use cases?
The model is particularly suited for artistic content tagging, character description generation, and automated image annotation in the anime/manga art style. It's especially useful for content creators and developers working with image databases requiring detailed tagging systems.