edgenext_small.usi_in1k

timm

EdgeNeXt Small is a CNN-Transformer hybrid image classifier optimized for mobile vision: 5.59M parameters, trained on ImageNet-1k with USI distillation.

Property: Value
Parameter Count: 5.59M
Model Type: Image Classification / Feature Backbone
License: MIT
Training Dataset: ImageNet-1k
Architecture: CNN-Transformer Hybrid
Paper: EdgeNeXt Paper

What is edgenext_small.usi_in1k?

EdgeNeXt Small is a hybrid architecture that combines the benefits of CNNs and Transformers and is optimized for mobile vision applications. This variant was trained on ImageNet-1k with the USI (Unified Scheme for ImageNet) methodology, which uses knowledge distillation to reach strong accuracy for the model's compact size.

Implementation Details

The model has 5.59M parameters (about 5.6M) and requires only 1.3 GMACs for inference. It is trained on 256x256 images and evaluated on 320x320 images, balancing computational cost against accuracy. The channel width grows stage by stage (48→96→160→304) across the four stages of the network.

  • Efficient hybrid architecture combining CNN and Transformer elements
  • Optimized for mobile vision applications
  • Trained with USI knowledge distillation
  • Supports feature map extraction at multiple scales

Core Capabilities

  • Image classification over the 1,000 ImageNet-1k classes
  • Feature extraction with multiple output scales
  • Image embedding generation
  • Usable as a standalone classifier or as a feature backbone

Frequently Asked Questions

Q: What makes this model unique?

EdgeNeXt Small stands out for its efficient amalgamation of CNN and Transformer components, designed specifically for mobile applications. USI training, which relies on knowledge distillation, lets it reach competitive accuracy with just 5.59M parameters.

Q: What are the recommended use cases?

The model is well suited to mobile vision applications that need efficient image classification, feature extraction, or embedding generation. It is particularly suitable where computational resources are limited but high accuracy is still required.
