levit_256.fb_dist_in1k

Maintained by: timm

LeViT-256 Vision Transformer

Property            Value
Parameter Count     18.9M
GMACs               1.1
Image Size          224 x 224
Top-1 Accuracy      81.512%
Paper               LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

What is levit_256.fb_dist_in1k?

LeViT-256 is a vision transformer that borrows design elements from convolutional neural networks. Developed by Facebook Research, it was trained on ImageNet-1k with knowledge distillation and targets a balance between accuracy and inference speed.

Implementation Details

The model implements a hybrid architecture that uses convolutional operations (nn.Conv2d and nn.BatchNorm2d) while keeping transformer-style attention. With 18.9M parameters and 1.1 GMACs per 224 x 224 input image, it keeps inference cost low.

  • Hybrid architecture combining CNN and transformer elements
  • Trained with knowledge distillation
  • 4.2M activations at 224 x 224 input
  • 18.9M parameters, a size aimed at mobile-friendly deployment
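The parameter count above translates directly into a rough weight-storage estimate. The sketch below is simple arithmetic, not a measurement: it assumes every parameter is stored at a single precision and ignores non-parameter buffers such as BatchNorm running statistics. The helper names are hypothetical. The second function assumes that timm and PyTorch are installed and is not called by default.

```python
def weight_footprint_mb(param_count, bytes_per_param):
    """Approximate weight storage in megabytes (1 MB = 1e6 bytes)."""
    return param_count * bytes_per_param / 1e6


# 18.9M is the parameter count quoted in the table above.
PARAM_COUNT = 18.9e6

for dtype_name, nbytes in (("float32", 4), ("float16", 2), ("int8", 1)):
    print(f"{dtype_name}: ~{weight_footprint_mb(PARAM_COUNT, nbytes):.1f} MB")


def count_params_with_timm(model_name="levit_256.fb_dist_in1k"):
    """Optional cross-check of the quoted count (assumes timm is installed)."""
    import timm  # imported lazily so the arithmetic above runs without it

    # pretrained=False builds the architecture without downloading weights.
    model = timm.create_model(model_name, pretrained=False)
    return sum(p.numel() for p in model.parameters())
```

At float32 this works out to roughly 75.6 MB of weights. Actual on-disk and in-memory sizes depend on the checkpoint format and runtime.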

Core Capabilities

  • Image classification with 81.512% top-1 accuracy
  • Feature extraction backbone functionality
  • Low inference cost (1.1 GMACs per 224 x 224 image)
  • Suitable for both classification and embedding generation
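The two uses listed above, classification and embedding generation, can be sketched with timm as follows. This is a minimal sketch, not the maintainers' reference code. It assumes timm (0.9 or later), PyTorch, and Pillow are installed, and that the pretrained weights can be downloaded. The function names and the image_path argument are hypothetical. Heavy imports sit inside the functions so that the small ranking helper can be used on its own.

```python
def top_k(scores, k=5):
    """Return (index, score) pairs for the k largest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def _load(num_classes=None):
    """Build the pretrained model and its matching eval transform."""
    import timm

    kwargs = {} if num_classes is None else {"num_classes": num_classes}
    model = timm.create_model("levit_256.fb_dist_in1k", pretrained=True, **kwargs)
    model.eval()
    data_config = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**data_config, is_training=False)
    return model, transform


def classify(image_path, k=5):
    """Return the top-k (class_index, probability) pairs for one image."""
    import torch
    from PIL import Image

    model, transform = _load()
    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        logits = model(transform(image).unsqueeze(0))
    probabilities = logits.softmax(dim=1)[0].tolist()
    return top_k(probabilities, k)


def embed(image_path):
    """Return a pooled feature vector; num_classes=0 drops the classifier head."""
    import torch
    from PIL import Image

    model, transform = _load(num_classes=0)
    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        features = model(transform(image).unsqueeze(0))
    return features[0].tolist()
```

Calling classify("example.jpg") returns ImageNet-1k class indices with their probabilities. Mapping indices to label names is left out here because it needs a separate label file.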

Frequently Asked Questions

Q: What makes this model unique?

LeViT-256 incorporates convolutional operations into a transformer architecture, which gives it faster inference while keeping accuracy competitive. Within the LeViT family it is a middle-ground option that balances model size and accuracy.

Q: What are the recommended use cases?

This model is ideal for production environments requiring efficient image classification or feature extraction, particularly where computational resources are constrained but high accuracy is still necessary. It's well-suited for mobile applications and real-time processing scenarios.
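Whether the model fits a real-time budget depends on the target hardware, so it is worth measuring directly. The sketch below shows one way to do that under stated assumptions: measure_latency_ms is a hypothetical, framework-agnostic timing helper, and levit_forward_fn assumes timm and PyTorch are installed and runs on CPU with randomly initialized weights, which is enough for timing but not for accuracy.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=3, runs=10):
    """Median wall-clock time of fn() in milliseconds, after warmup calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def levit_forward_fn(batch_size=1):
    """Return a zero-argument callable that runs one forward pass on CPU."""
    import timm
    import torch

    # pretrained=False avoids a weight download; timing does not need real weights.
    model = timm.create_model("levit_256.fb_dist_in1k", pretrained=False).eval()
    dummy_input = torch.randn(batch_size, 3, 224, 224)

    def run():
        with torch.no_grad():
            model(dummy_input)

    return run


if __name__ == "__main__":
    latency = measure_latency_ms(levit_forward_fn(batch_size=1))
    print(f"median latency: {latency:.1f} ms per batch")
```

The median is used instead of the mean so that a single slow run, for example from a background process, does not dominate the result. On a GPU the timed function would also need to synchronize the device before the clock stops.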
