xcit_tiny_24_p8_384.fb_dist_in1k

Maintained By
timm

XCiT Tiny 24 P8 384

Property         Value
Parameter Count  12.1M
Image Size       384 x 384
License          Apache-2.0
Paper            XCiT: Cross-Covariance Image Transformers
GMACs            27.1

What is xcit_tiny_24_p8_384.fb_dist_in1k?

This is a specialized implementation of the Cross-Covariance Image Transformer (XCiT) architecture for image classification. Developed by Facebook Research, it is a lightweight variant with 12.1M parameters that operates on 384x384 pixel inputs. The model was pre-trained on ImageNet-1k with knowledge distillation, which transfers accuracy from a larger teacher model into this compact architecture rather than training it from labels alone.
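As a minimal sketch of loading the pretrained weights through timm (assuming timm is installed and the Hugging Face Hub is reachable):

```python
import timm

# Download and instantiate the pretrained model; weights are
# fetched from the Hugging Face Hub on first use.
model = timm.create_model('xcit_tiny_24_p8_384.fb_dist_in1k', pretrained=True)
model = model.eval()
```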

Implementation Details

The model uses the XCiT architecture, which replaces conventional token self-attention with cross-covariance attention: attention is computed across feature channels rather than across tokens, so its cost grows linearly with the number of image patches. At 27.1 GMACs and 133.0M activations, it offers a balanced trade-off between computational cost and accuracy.

  • Efficient patch-based image processing with an 8x8 patch size (P8)
  • 24-layer architecture optimized for 384x384 resolution
  • Distillation-based training for improved accuracy
  • Support for both classification and feature extraction tasks (see the embedding sketch below)
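A hedged sketch of the feature extraction path, using timm's num_classes=0 convention to drop the classifier head; the dummy input is illustrative, and the 192-wide pooled embedding is the expected width for the tiny variant:

```python
import timm
import torch

# num_classes=0 drops the classification head, so the forward pass
# returns pooled feature embeddings rather than class logits.
model = timm.create_model(
    'xcit_tiny_24_p8_384.fb_dist_in1k', pretrained=True, num_classes=0)
model = model.eval()

x = torch.randn(1, 3, 384, 384)  # dummy batch at the native resolution
with torch.no_grad():
    embedding = model(x)
# For the tiny variant the pooled embedding width should be 192.
print(embedding.shape)
```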

Core Capabilities

  • Image classification on the ImageNet-1k dataset (see the inference sketch after this list)
  • Feature backbone extraction for downstream tasks
  • Efficient processing of high-resolution images
  • Support for both inference and feature embedding generation
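The following sketch shows end-to-end classification, assuming a recent timm release that provides resolve_model_data_config; example.jpg is a placeholder path, not a bundled file:

```python
import timm
import torch
from PIL import Image

model = timm.create_model('xcit_tiny_24_p8_384.fb_dist_in1k', pretrained=True)
model = model.eval()

# Derive the matching preprocessing (384x384 resize/crop plus
# ImageNet normalization) from the model's pretrained config.
config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**config, is_training=False)

img = Image.open('example.jpg').convert('RGB')  # hypothetical input image
with torch.no_grad():
    logits = model(transform(img).unsqueeze(0))  # add a batch dimension

top5 = logits.softmax(dim=-1).topk(5)
print(top5.values, top5.indices)  # top-5 probabilities and class indices
```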

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its implementation of cross-covariance attention mechanisms, which provide efficient processing of high-resolution images while maintaining a relatively small parameter count. The distillation training approach further enhances its performance-to-size ratio.

Q: What are the recommended use cases?

The model is particularly well-suited for image classification tasks requiring high-resolution input (384x384), feature extraction for transfer learning, and scenarios where a balance between model size and accuracy is crucial.
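For the transfer-learning case, one plausible starting point is to re-create the model with a task-specific head; the 10-class dataset size, the learning rate, and the 'head' name filter are illustrative assumptions rather than prescribed settings:

```python
import timm
import torch

# Re-create the model with a new head sized for a hypothetical
# 10-class downstream dataset; the backbone keeps its pretrained weights.
model = timm.create_model(
    'xcit_tiny_24_p8_384.fb_dist_in1k', pretrained=True, num_classes=10)

# Optionally freeze everything except the classifier ('head' is the
# classifier attribute name in timm's XCiT implementation).
for name, param in model.named_parameters():
    param.requires_grad = 'head' in name

# Optimize only the trainable (unfrozen) parameters.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
```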
