inception_v3.gluon_in1k

timm

Inception-v3 image classification model with 23.9M parameters, trained on ImageNet-1k. It takes 299x299 inputs and costs 5.7 GMACs per forward pass.

Property	Value
Parameter Count	23.9M
License	Apache-2.0
Framework	PyTorch (timm)
Input Size	299x299
GMACs	5.7
Paper	Rethinking the Inception Architecture for Computer Vision

What is inception_v3.gluon_in1k?

inception_v3.gluon_in1k is an image classification model based on the Inception-v3 architecture, trained on the ImageNet-1k dataset by the MXNet Gluon authors and distributed through the timm library. Inception-v3 is a convolutional network built from parallel convolution branches with factorized convolutions, a design intended to balance accuracy against computational cost.

Implementation Details

The model has about 23.9M parameters and requires 5.7 GMACs per forward pass on a 299x299 input. It can be used in three modes: classification, feature extraction, and embedding generation.

Supports F32 tensor operations
Includes feature map extraction capabilities with multiple output scales
Offers flexible embedding generation options
Provides pre-trained weights optimized on ImageNet-1k

Core Capabilities

Image Classification: Primary task with 1000-class ImageNet categories
Feature Extraction: Multi-scale feature maps (64 to 2048 channels)
Embedding Generation: 2048-dimensional image embeddings
Transfer Learning: Pre-trained weights for domain adaptation

Frequently Asked Questions

Q: What makes this model unique?

What distinguishes this implementation is that it packages the Gluon-trained weights for the timm library. It keeps the standard Inception-v3 architecture, at about 23.9M parameters and 5.7 GMACs per 299x299 image, and loads through timm's usual model API.

Q: What are the recommended use cases?

The model is suited to general image classification, feature extraction for downstream tasks, and use as a backbone for transfer learning. It fits pipelines that work at 299x299 input resolution and have a moderate compute budget (5.7 GMACs per image).