One-Align
| Property  | Value                          |
|-----------|--------------------------------|
| License   | MIT                            |
| Paper     | arxiv.org/pdf/2312.17090       |
| Framework | PyTorch                        |
| Task      | Zero-Shot Image Classification |
What is One-Align?
One-Align is a unified visual scoring model that combines Image Quality Assessment (IQA), Image Aesthetic Assessment (IAA), and Video Quality Assessment (VQA) in a single framework. Introduced in the Q-Align paper linked above, it reports state-of-the-art performance across multiple IQA, IAA, and VQA benchmarks.
Implementation Details
The model is implemented in PyTorch on top of the Transformers library (version 4.36.1). It uses a causal language model architecture and can be loaded directly through the Transformers AutoModel interface. The released weights run in float16 precision and support automatic device mapping; a minimal loading and scoring sketch follows the list below.
- Supports both image and video quality assessment
- Produces scores in the range [1, 5] for quality assessment
- Uses the eager attention implementation rather than fused/flash attention kernels
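The snippet below is a minimal loading and scoring sketch. It assumes the checkpoint is published as `q-future/one-align` and that the model's custom remote code exposes a `score()` helper with `task_` and `input_` arguments, as described in the Q-Align repository; the image path is a placeholder.

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

# Load the checkpoint together with its custom remote code; the scoring logic
# ships with the model repository rather than with Transformers itself.
model = AutoModelForCausalLM.from_pretrained(
    "q-future/one-align",        # assumed checkpoint name
    trust_remote_code=True,
    torch_dtype=torch.float16,   # float16 precision, as noted above
    device_map="auto",           # automatic device mapping
)

# Score a local image for quality; the remote code's score() helper is expected
# to return values roughly in the [1, 5] range described above.
image = Image.open("example.jpg")  # placeholder path
scores = model.score([image], task_="quality", input_="image")
print(scores)
```

Per the repository's usage notes, `task_` also accepts "aesthetics" and `input_` also accepts "video", which is how the same checkpoint covers IQA, IAA, and VQA.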
Core Capabilities
- State-of-the-art performance on KonIQ, SPAQ, and KADID datasets
- Strong generalization to datasets unseen during training, such as LIVE-C, LIVE, CSIQ, and AGIQA
- Exceptional performance in video quality assessment on LSVQ, KoNViD-1k, and MaxWell datasets
- Strong aesthetic assessment capabilities demonstrated on the AVA test set
Frequently Asked Questions
Q: What makes this model unique?
One-Align is unique in its ability to handle multiple visual assessment tasks within a single unified framework, achieving performance on IQA, IAA, and VQA that surpasses specialized single-task models.
Q: What are the recommended use cases?
The model is ideal for automatic quality assessment of images and videos, aesthetic evaluation of visual content, and zero-shot image classification tasks. It's particularly valuable for content moderation, digital asset management, and automated visual content analysis.
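As a hedged illustration of these use cases, the sketch below reuses the loading call from the Implementation Details section and switches the `task_` and `input_` arguments between aesthetic and video scoring. The file paths, and the assumption that a video is passed as a list of decoded frames, are illustrative only, not a confirmed interface.

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

# Same loading call as in the Implementation Details sketch.
model = AutoModelForCausalLM.from_pretrained(
    "q-future/one-align", trust_remote_code=True,
    torch_dtype=torch.float16, device_map="auto",
)

# Aesthetic assessment of a single image (placeholder path).
iaa_scores = model.score([Image.open("photo.jpg")], task_="aesthetics", input_="image")

# Video quality assessment: the video is assumed to be supplied as a list of
# decoded PIL frames (placeholder paths); frame extraction is left to the caller.
frames = [Image.open(f"frame_{i:03d}.jpg") for i in range(8)]
vqa_scores = model.score([frames], task_="quality", input_="video")

print(iaa_scores, vqa_scores)
```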