maxvit_nano_rw_256.sw_in1k

Maintained By
timm

MaxViT Nano RW 256

Property          Value
Parameter Count   15.45M
Top-1 Accuracy    82.93%
Image Size        256x256
License           Apache 2.0
Paper             MaxViT: Multi-Axis Vision Transformer

What is maxvit_nano_rw_256.sw_in1k?

maxvit_nano_rw_256.sw_in1k is a lightweight variant of the MaxViT architecture, distributed through timm and designed for 256x256 input images. It is a hybrid model that combines convolutional blocks with transformer self-attention, reaching 82.93% top-1 accuracy on ImageNet-1k with 15.45M parameters.
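A minimal sketch of running the classifier through timm might look like the following. It assumes timm, torch, and Pillow are installed and that the pretrained weights can be downloaded. The helper names (`softmax`, `top_k`, `classify`) and the image path are hypothetical and exist only for illustration. The heavy imports are kept inside `classify` so the small post-processing helpers can be used on their own.

```python
import math

MODEL_NAME = "maxvit_nano_rw_256.sw_in1k"


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return the k (class_index, probability) pairs with highest probability."""
    ranked = sorted(enumerate(probs), key=lambda t: t[1], reverse=True)
    return ranked[:k]


def classify(image_path, k=5):
    """Hypothetical helper: classify one image file with the pretrained model."""
    # Imported lazily so the helpers above work without these packages.
    import timm
    import torch
    from PIL import Image

    model = timm.create_model(MODEL_NAME, pretrained=True).eval()
    # Let timm derive the preprocessing (resize, crop, normalization) for this model.
    data_config = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**data_config, is_training=False)

    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        logits = model(transform(image).unsqueeze(0))[0]
    return top_k(softmax(logits.tolist()), k)
```

Calling `classify("some_image.jpg")` (a placeholder path) would return ImageNet-1k class indices with their probabilities; mapping indices to label names is left out of this sketch.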

Implementation Details

The model utilizes a multi-axis attention mechanism that combines both local and global feature processing. It's built on the MaxViT architecture which incorporates:

  • MBConv (depthwise-separable) convolution blocks
  • Two self-attention steps per block: window attention for local context and grid attention for global context
  • A timm-specific "RW" (Ross Wightman) configuration of the architecture, implemented in PyTorch
  • 4.46 GMACs computational complexity
  • 30.28M activations
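The difference between window and grid partitioning can be illustrated with a toy example. The sketch below only groups pixel coordinates; it is not the timm implementation, and the 4x4 feature map, window size, and grid size are assumptions chosen for readability rather than this model's actual settings. Attention would then be computed among the positions in each group.

```python
def window_groups(height, width, window):
    """Group positions into non-overlapping window x window blocks (local)."""
    groups = {}
    for r in range(height):
        for c in range(width):
            groups.setdefault((r // window, c // window), []).append((r, c))
    return list(groups.values())


def grid_groups(height, width, grid):
    """Group positions into grid x grid sets spread evenly over the map (global)."""
    stride_h, stride_w = height // grid, width // grid
    groups = {}
    for r in range(height):
        for c in range(width):
            # Positions that share the same offset within a stride cell
            # end up in the same group, so members are stride apart.
            groups.setdefault((r % stride_h, c % stride_w), []).append((r, c))
    return list(groups.values())


if __name__ == "__main__":
    print(window_groups(4, 4, 2)[0])  # neighbouring positions
    print(grid_groups(4, 4, 2)[0])    # positions spread across the map
```

In the window case each group covers one contiguous patch, while in the grid case each group samples positions from across the whole feature map, which is how the same attention cost yields a global receptive field.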

Core Capabilities

  • Image Classification on ImageNet-1k dataset
  • Feature extraction with multiple resolution outputs
  • Reported throughput of 1,218.17 samples/sec (benchmark hardware and batch size are not stated here)
  • Balanced performance for edge deployment scenarios
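Multi-resolution feature extraction can be sketched with timm's `features_only=True` option. This assumes timm and torch are installed. The function names are hypothetical, and the reduction factors are read from the model at runtime rather than hard-coded, since they are not listed above.

```python
MODEL_NAME = "maxvit_nano_rw_256.sw_in1k"


def spatial_sizes(input_size, reductions):
    """Expected feature-map side length for each reduction (stride) factor."""
    return [input_size // r for r in reductions]


def extract_features(pretrained=True, input_size=256):
    """Hypothetical helper: return (reduction, shape) for each feature map."""
    # Imported lazily so spatial_sizes() works without these packages.
    import timm
    import torch

    model = timm.create_model(
        MODEL_NAME, pretrained=pretrained, features_only=True
    ).eval()
    reductions = model.feature_info.reduction()

    # A random tensor stands in for a preprocessed image batch.
    with torch.no_grad():
        features = model(torch.randn(1, 3, input_size, input_size))
    return [(r, tuple(f.shape)) for r, f in zip(reductions, features)]
```

For pooled embeddings instead of feature maps, timm also accepts `num_classes=0` in `create_model`, which removes the classifier head.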

Frequently Asked Questions

Q: What makes this model unique?

This model offers a favorable trade-off between size and accuracy for efficient inference on 256x256 images. The multi-axis attention mechanism it inherits from MaxViT captures both local and global features while keeping the parameter count relatively small (15.45M).

Q: What are the recommended use cases?

The model is well-suited for: 1) Resource-constrained environments requiring decent classification performance, 2) Real-time image classification tasks, 3) Feature extraction for downstream computer vision tasks, and 4) Scenarios where 256x256 resolution is sufficient for the application needs.
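To judge whether the model fits a resource-constrained or real-time setting, a rough back-of-the-envelope estimate can be derived from the figures quoted above. The sketch below assumes 4 bytes per parameter (fp32 weights) and treats the quoted throughput as an amortized, batched number; real single-image latency depends on hardware and batch size, which are not given here. The function names and the 30 fps target are illustrative.

```python
PARAMS = 15.45e6          # parameter count quoted above
THROUGHPUT = 1218.17      # samples/sec quoted above


def weight_megabytes(params, bytes_per_param=4):
    """Approximate weight storage in MB (10^6 bytes); fp32 assumed by default."""
    return params * bytes_per_param / 1e6


def ms_per_sample(samples_per_sec):
    """Amortized time per sample in milliseconds at a given throughput."""
    return 1000.0 / samples_per_sec


def meets_frame_rate(samples_per_sec, target_fps):
    """True if the amortized throughput is at least the target frame rate."""
    return samples_per_sec >= target_fps


if __name__ == "__main__":
    print(f"weights: ~{weight_megabytes(PARAMS):.1f} MB in fp32")
    print(f"amortized time: ~{ms_per_sample(THROUGHPUT):.2f} ms per sample")
    print("30 fps feasible on the benchmark setup:", meets_frame_rate(THROUGHPUT, 30))
```

These numbers cover weights only; activation memory during inference adds to the footprint and should be measured on the target device.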
