ConvNeXt V2 Large Model

Property	Value
Parameters	198M
GMACs	34.4
Training Image Size	224x224
Test Image Size	288x288
Top-1 Accuracy	87.26%
Paper	ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

What is convnextv2_large.fcmae_ft_in22k_in1k?

This is the Large variant of ConvNeXt V2, a convolutional neural network for image classification. It was pretrained with a fully convolutional masked autoencoder (FCMAE) framework, then fine-tuned on ImageNet-22k and subsequently on ImageNet-1k. It reaches 87.26% top-1 accuracy on the ImageNet-1k validation set.

Implementation Details

The model has 198 million parameters and requires 34.4 GMACs (billions of multiply-accumulate operations) per forward pass, with 43.1M activations. It is trained on 224x224 images and evaluated on 288x288 images.

Pretrained with the FCMAE (fully convolutional masked autoencoder) method
Hierarchical feature extraction across multiple stages
Designed to balance accuracy against computational cost
Accepts input resolutions other than the training size, since features are pooled adaptively before the classifier
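To make the cost figures above concrete, here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions that this card does not state: that the 34.4 GMACs and 43.1M activations were measured at the 224x224 training resolution, and that the cost of a convolutional network grows roughly in proportion to the number of input pixels. The function names are illustrative only.

```python
# Figures taken from the table above.
NUM_PARAMS = 198e6          # parameters
GMACS_AT_TRAIN_SIZE = 34.4  # assumed to be measured at 224x224
MACTS_AT_TRAIN_SIZE = 43.1  # activations in millions, same assumption


def scale_by_pixels(value, train_size=224, test_size=288):
    """Estimate a per-image cost at test_size, assuming it scales with pixel count."""
    return value * (test_size / train_size) ** 2


def weight_memory_mb(num_params, bytes_per_param):
    """Memory needed to hold the weights alone, in megabytes (10^6 bytes)."""
    return num_params * bytes_per_param / 1e6


if __name__ == "__main__":
    print(f"Estimated GMACs at 288x288: {scale_by_pixels(GMACS_AT_TRAIN_SIZE):.1f}")
    print(f"Estimated activations (M) at 288x288: {scale_by_pixels(MACTS_AT_TRAIN_SIZE):.1f}")
    print(f"Weights in float32: {weight_memory_mb(NUM_PARAMS, 4):.0f} MB")
    print(f"Weights in float16: {weight_memory_mb(NUM_PARAMS, 2):.0f} MB")
```

Under these assumptions, evaluating at 288x288 costs about 1.65 times as much as at 224x224, or roughly 56.9 GMACs per image. The weights alone take about 792 MB in float32. These are estimates, not measured values.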

Core Capabilities

Image classification with 1000 classes
Feature extraction for downstream tasks
Generation of image embeddings
Support for both inference and feature map extraction
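The capabilities listed above can be sketched as follows. This assumes the checkpoint is loaded through the timm library, whose naming scheme the model identifier follows, together with PyTorch and Pillow; none of these are named in this card. The functions expected_stage_shapes and run_demo are hypothetical helpers for illustration, and the stage widths (192, 384, 768, 1536) and strides (4, 8, 16, 32) are the usual ConvNeXt-Large values, taken as an assumption rather than from this card.

```python
MODEL_NAME = "convnextv2_large.fcmae_ft_in22k_in1k"


def expected_stage_shapes(height, width, batch=1):
    """Feature-map shapes expected from the four stages.

    Assumes ConvNeXt-Large stage widths and strides (not stated in this card).
    """
    channels = (192, 384, 768, 1536)
    strides = (4, 8, 16, 32)
    return [(batch, c, height // s, width // s) for c, s in zip(channels, strides)]


def run_demo(image_path, pretrained=True):
    """Run classification, embedding, and feature-map extraction on one image."""
    # Heavy imports live here so the shape helper above works without them.
    import timm
    import torch
    from PIL import Image

    img = Image.open(image_path).convert("RGB")

    # Classification over the 1000 ImageNet-1k classes.
    classifier = timm.create_model(MODEL_NAME, pretrained=pretrained).eval()
    data_config = timm.data.resolve_model_data_config(classifier)
    transform = timm.data.create_transform(**data_config, is_training=False)
    x = transform(img).unsqueeze(0)  # add a batch dimension

    with torch.no_grad():
        probs = classifier(x).softmax(dim=1)
        top5_prob, top5_idx = torch.topk(probs, k=5)

        # Image embedding: pooled features taken before the classifier layer.
        unpooled = classifier.forward_features(x)
        embedding = classifier.forward_head(unpooled, pre_logits=True)

    # Feature maps: one tensor per stage, for downstream tasks.
    backbone = timm.create_model(MODEL_NAME, pretrained=pretrained, features_only=True).eval()
    with torch.no_grad():
        feature_maps = backbone(x)

    return top5_idx, top5_prob, embedding, [tuple(f.shape) for f in feature_maps]
```

Calling run_demo("example.jpg"), where the path is a placeholder, downloads the pretrained weights on first use. If the assumptions hold, the returned feature-map shapes for a 224x224 input should match expected_stage_shapes(224, 224). Creating the model with num_classes=0 is another way to obtain pooled embeddings directly from the forward pass.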

Frequently Asked Questions

Q: What makes this model unique?

This model combines FCMAE pretraining with two-stage fine-tuning, first on ImageNet-22k and then on ImageNet-1k. With 198M parameters and 34.4 GMACs per forward pass, it reaches 87.26% top-1 accuracy on ImageNet-1k.

Q: What are the recommended use cases?

The model is suited to image classification tasks that require high accuracy, to transfer learning, and to use as a feature extractor for other computer vision tasks. Example application areas include medical imaging and industrial inspection.
