One-Align
| Property  | Value                          |
|-----------|--------------------------------|
| License   | MIT                            |
| Paper     | arxiv.org/pdf/2312.17090       |
| Framework | PyTorch                        |
| Task      | Zero-Shot Image Classification |
What is One-Align?
One-Align is a unified visual scoring model that combines Image Quality Assessment (IQA), Image Aesthetic Assessment (IAA), and Video Quality Assessment (VQA) in a single framework. Introduced in the Q-Align paper linked above, it reports state-of-the-art performance across multiple IQA, IAA, and VQA benchmarks.
Implementation Details
The model is implemented in PyTorch on top of the Transformers library (version 4.36.1). It uses a causal language model architecture and can be loaded directly through the Transformers AutoModel interface. The released weights run in float16 precision and support automatic device mapping; a minimal loading and scoring sketch follows the list below.
- Supports both image and video quality assessment
- Produces scores in the range [1, 5] for quality assessment
- Uses the eager attention implementation rather than fused/flash attention kernels
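The snippet below is a minimal loading and scoring sketch. It assumes the checkpoint is published as `q-future/one-align` and that the model's custom remote code exposes a `score()` helper with `task_` and `input_` arguments, as described in the Q-Align repository; the image path is a placeholder.

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

# Load the checkpoint together with its custom remote code; the scoring logic
# ships with the model repository rather than with Transformers itself.
model = AutoModelForCausalLM.from_pretrained(
    "q-future/one-align",        # assumed checkpoint name
    trust_remote_code=True,
    torch_dtype=torch.float16,   # float16 precision, as noted above
    device_map="auto",           # automatic device mapping
)

# Score a local image for quality; the remote code's score() helper is expected
# to return values roughly in the [1, 5] range described above.
image = Image.open("example.jpg")  # placeholder path
scores = model.score([image], task_="quality", input_="image")
print(scores)
```

Per the repository's usage notes, `task_` also accepts "aesthetics" and `input_` also accepts "video", which is how the same checkpoint covers IQA, IAA, and VQA.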
Core Capabilities
- State-of-the-art performance on KonIQ, SPAQ, and KADID datasets
- Strong generalization to datasets unseen during training, such as LIVE-C, LIVE, CSIQ, and AGIQA
- Exceptional performance in video quality assessment on LSVQ, KoNViD-1k, and MaxWell datasets
- Strong aesthetic assessment capabilities demonstrated on the AVA test set
Frequently Asked Questions
Q: What makes this model unique?
One-Align is unique in its ability to handle multiple visual assessment tasks within a single unified framework, achieving performance on IQA, IAA, and VQA that surpasses specialized single-task models.
Q: What are the recommended use cases?
The model is ideal for automatic quality assessment of images and videos, aesthetic evaluation of visual content, and zero-shot image classification tasks. It's particularly valuable for content moderation, digital asset management, and automated visual content analysis.
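As a hedged illustration of these use cases, the sketch below reuses the loading call from the Implementation Details section and switches the `task_` and `input_` arguments between aesthetic and video scoring. The file paths, and the assumption that a video is passed as a list of decoded frames, are illustrative only, not a confirmed interface.

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

# Same loading call as in the Implementation Details sketch.
model = AutoModelForCausalLM.from_pretrained(
    "q-future/one-align", trust_remote_code=True,
    torch_dtype=torch.float16, device_map="auto",
)

# Aesthetic assessment of a single image (placeholder path).
iaa_scores = model.score([Image.open("photo.jpg")], task_="aesthetics", input_="image")

# Video quality assessment: the video is assumed to be supplied as a list of
# decoded PIL frames (placeholder paths); frame extraction is left to the caller.
frames = [Image.open(f"frame_{i:03d}.jpg") for i in range(8)]
vqa_scores = model.score([frames], task_="quality", input_="video")

print(iaa_scores, vqa_scores)
```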