ConvNeXtV2-Tiny Model
| Property | Value |
|---|---|
| Parameters | 28.6M |
| GMACs | 4.5 |
| Training Image Size | 224x224 |
| Testing Image Size | 288x288 |
| Paper | ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders |
What is convnextv2_tiny.fcmae_ft_in22k_in1k?
This is the Tiny variant of the ConvNeXt V2 architecture, designed for efficient image classification. The model was pretrained with the fully convolutional masked autoencoder (FCMAE) framework and subsequently fine-tuned on ImageNet-22k and then on ImageNet-1k, delivering strong accuracy at relatively modest computational cost.
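As a minimal usage sketch (assuming the checkpoint is published through the `timm` library, which the model identifier's naming suggests, and using a hypothetical `example.jpg` input), classification inference might look like this:

```python
import torch
import timm
from PIL import Image

# Load the pretrained classifier (assumes the checkpoint is available under
# this name via timm; weights are downloaded on first use).
model = timm.create_model('convnextv2_tiny.fcmae_ft_in22k_in1k', pretrained=True)
model.eval()

# Build the evaluation transform from the model's pretrained data config
# (resize/crop to the test resolution, normalize with the pretraining stats).
data_config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**data_config, is_training=False)

img = Image.open('example.jpg').convert('RGB')  # hypothetical input image
with torch.no_grad():
    logits = model(transform(img).unsqueeze(0))  # (1, 1000) ImageNet-1k logits
top5_prob, top5_idx = logits.softmax(dim=-1).topk(5)
```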
Implementation Details
The ConvNeXtV2-Tiny model balances efficiency and accuracy, with 28.6M parameters, 4.5 GMACs, and 13.4M activations per forward pass. It processes images at 224x224 resolution during training and 288x288 during testing.
- Pretrained with the FCMAE (fully convolutional masked autoencoder) framework
- Sequential fine-tuning on ImageNet-22k followed by ImageNet-1k
- Compact architecture suited to mobile and edge deployments
- Supports feature extraction and embedding generation (see the sketches below)
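The feature-extraction support noted in the last bullet can be sketched as follows; this assumes timm's `features_only` interface and substitutes a random tensor for a real, preprocessed image:

```python
import torch
import timm

# features_only=True wraps the backbone so it returns one feature map per
# stage instead of classification logits.
model = timm.create_model(
    'convnextv2_tiny.fcmae_ft_in22k_in1k',
    pretrained=True,
    features_only=True,
)
model.eval()

with torch.no_grad():
    # Dummy batch at the 288x288 test resolution; a real pipeline would reuse
    # the preprocessing transform shown earlier.
    feature_maps = model(torch.randn(1, 3, 288, 288))

for fm in feature_maps:
    print(fm.shape)  # progressively downsampled maps, one per stage
```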
Core Capabilities
- Image classification with high accuracy
- Feature map extraction at multiple scales
- Generation of image embeddings
- Efficient inference with modest computational requirements
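For the embedding use case listed above, one possible sketch (again assuming the timm interface, with a random tensor standing in for a preprocessed image batch) is:

```python
import torch
import timm

model = timm.create_model('convnextv2_tiny.fcmae_ft_in22k_in1k', pretrained=True)
model.eval()

x = torch.randn(1, 3, 288, 288)  # placeholder for a preprocessed image batch
with torch.no_grad():
    # Unpooled feature map from the final stage, then the pooled pre-logits
    # embedding (768-dim for the Tiny variant) without the classifier head.
    feats = model.forward_features(x)
    embedding = model.forward_head(feats, pre_logits=True)

print(embedding.shape)  # expected: torch.Size([1, 768])
```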
Frequently Asked Questions
Q: What makes this model unique?
The model combines the ConvNeXt V2 architecture with FCMAE self-supervised pretraining, offering a strong balance between accuracy and efficiency. Sequential fine-tuning on ImageNet-22k and then ImageNet-1k yields robust, transferable feature representations.
Q: What are the recommended use cases?
This model is particularly well-suited for applications requiring efficient image classification, feature extraction, or embedding generation, especially in scenarios where computational resources are limited but high accuracy is still required. It's ideal for mobile applications, edge devices, and general-purpose computer vision tasks.