levit_128.fb_dist_in1k

Maintained By: timm

LeViT-128 Vision Transformer

Property: Value
Parameter Count: 9.21M
Model Type: Vision Transformer (ConvNet-style)
License: Apache-2.0
Image Size: 224x224
Top-1 Accuracy: 78.474%
GMACs: 0.4

What is levit_128.fb_dist_in1k?

LeViT-128 is a vision transformer architecture designed by Facebook Research that combines transformer attention with design ideas from convolutional neural networks. It is optimized for fast inference while keeping competitive accuracy: with 9.21M parameters, it reaches 78.474% top-1 accuracy on ImageNet-1k.
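To put the headline figures in perspective, the short calculation below estimates the storage needed for the weights and converts the GMACs figure into FLOPs. It is a back-of-envelope sketch only: it assumes 4 bytes per parameter for fp32 and 2 for fp16, counts one multiply-accumulate as two floating-point operations (a common convention, not something stated on this page), and ignores activations and framework overhead. The helper names are illustrative.

```python
# Figures taken from the property table above.
PARAMS = 9.21e6  # parameter count
GMACS = 0.4      # billions of multiply-accumulates per forward pass


def weight_megabytes(num_params: float, bytes_per_param: int) -> float:
    """Approximate storage for the weights alone, in MB (10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


def approx_gflops(gmacs: float) -> float:
    """Convert GMACs to GFLOPs, assuming 1 MAC = 2 FLOPs."""
    return 2.0 * gmacs


fp32_mb = weight_megabytes(PARAMS, 4)  # assumption: 4 bytes per fp32 weight
fp16_mb = weight_megabytes(PARAMS, 2)  # assumption: 2 bytes per fp16 weight
gflops = approx_gflops(GMACS)

print(f"fp32 weights: ~{fp32_mb:.2f} MB")
print(f"fp16 weights: ~{fp16_mb:.2f} MB")
print(f"compute per image: ~{gflops:.1f} GFLOPs")
```

Under these assumptions the weights occupy roughly 37 MB in fp32 and about half that in fp16.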

Implementation Details

The model uses a hybrid architecture: convolutional operations (nn.Conv2d and nn.BatchNorm2d) alongside transformer-style attention. It was trained with knowledge distillation on the ImageNet-1k dataset and needs only 0.4 GMACs per forward pass at the 224x224 input size.

  • Activation count of 2.7M
  • Efficient inference architecture
  • Distillation-based training approach
  • Convolutional-style implementation for better hardware utilization

Core Capabilities

  • Image classification with 1000 classes
  • Feature extraction backbone functionality
  • Efficient inference on standard hardware
  • Balanced performance-to-size ratio
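For the backbone use case, one common approach with timm is to create the model without its classification head so that it returns a pooled embedding instead of class logits. The sketch below assumes timm, torch, and Pillow are installed; `embed` and `image_path` are hypothetical names chosen for illustration.

```python
MODEL_NAME = "levit_128.fb_dist_in1k"


def embed(image_path: str):
    """Return a pooled feature vector (a 1-D torch tensor) for one image."""
    # Imported lazily so this file can be loaded without the heavy dependencies.
    import timm
    import torch
    from PIL import Image

    # num_classes=0 asks timm to drop the classifier, leaving pooled features.
    model = timm.create_model(MODEL_NAME, pretrained=True, num_classes=0)
    model.eval()

    # Reuse the preprocessing settings stored with the pretrained weights.
    data_config = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**data_config, is_training=False)

    image = Image.open(image_path).convert("RGB")
    batch = transform(image).unsqueeze(0)  # add a batch dimension

    with torch.no_grad():
        features = model(batch)

    return features[0]  # drop the batch dimension
```

The resulting vector can feed a downstream classifier or a similarity search. For production use, build the model and transform once and reuse them rather than recreating them on every call as this sketch does.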

Frequently Asked Questions

Q: What makes this model unique?

LeViT-128 stands out for its hybrid design, which combines transformer attention with convolutional operations and is optimized for inference speed. With 9.21M parameters and 0.4 GMACs, it reaches 78.474% top-1 accuracy on ImageNet-1k.

Q: What are the recommended use cases?

This model is well suited to production environments where inference speed is crucial but accuracy cannot be significantly compromised. Typical uses are real-time image classification, feature extraction, and serving as a backbone for more complex computer vision applications.
