yolos-small

by hustvl
YOLOS-small: a 30.7M-parameter Vision Transformer for object detection, achieving 36.1 AP on COCO. Built by hustvl and released under the Apache 2.0 license.

Parameter Count: 30.7M
License: Apache 2.0
Paper: You Only Look at One Sequence (arXiv:2106.00666)
Performance: 36.1 AP on COCO

What is yolos-small?

YOLOS-small is a compact Vision Transformer (ViT) model designed specifically for object detection. Developed by hustvl, it takes a deliberately simple approach to transformer-based detection, using a plain ViT encoder with no task-specific necks or region proposals, and reaches 36.1 AP on COCO with a parameter count of only 30.7M.
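A minimal inference sketch using the Hugging Face Transformers API. The checkpoint id hustvl/yolos-small is the Hub name; the image path and the choice of Auto classes are illustrative assumptions, not part of the model card:

```python
# Minimal inference sketch, assuming recent versions of transformers,
# torch, and Pillow are installed.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForObjectDetection

processor = AutoImageProcessor.from_pretrained("hustvl/yolos-small")
model = AutoModelForObjectDetection.from_pretrained("hustvl/yolos-small")

image = Image.open("street.jpg")  # placeholder path; any RGB image works

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One class-logit vector and one box in normalized (cx, cy, w, h)
# format per object query.
print(outputs.logits.shape)      # (batch, num_queries, num_labels + 1)
print(outputs.pred_boxes.shape)  # (batch, num_queries, 4)
```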

Implementation Details

The model employs a bipartite matching loss and processes images through a plain transformer architecture. It handles 100 object queries simultaneously and uses the Hungarian matching algorithm to assign each ground-truth object to exactly one query during training (a minimal sketch of this matching step follows the list below). The model was pre-trained on ImageNet-1k for 200 epochs and fine-tuned on COCO 2017 for 150 epochs.

  • Utilizes PyTorch framework for implementation
  • Supports F32 tensor operations
  • Implements DETR-style loss function
  • Combines L1 and generalized IoU loss for bounding boxes
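To make the matching step concrete, here is an illustrative sketch of DETR-style bipartite matching, not the exact YOLOS training code. The cost weights (1.0 for classification, 5.0 for the L1 box term) follow the DETR defaults, and the generalized-IoU term from the list above is omitted for brevity:

```python
# Illustrative DETR-style bipartite matching between the model's 100
# query predictions and the ground-truth boxes of one image.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_queries(class_probs, pred_boxes, gt_labels, gt_boxes,
                  w_class=1.0, w_l1=5.0):
    """Assign each ground-truth box to exactly one query.

    class_probs: (num_queries, num_classes) softmax probabilities
    pred_boxes:  (num_queries, 4) predicted (cx, cy, w, h), normalized
    gt_labels:   (num_targets,) ground-truth class indices
    gt_boxes:    (num_targets, 4) ground-truth (cx, cy, w, h), normalized
    """
    # Classification cost: negative probability of the true class.
    cost_class = -class_probs[:, gt_labels]                         # (Q, T)
    # Box cost: L1 distance between predicted and true boxes.
    cost_l1 = np.abs(pred_boxes[:, None] - gt_boxes[None]).sum(-1)  # (Q, T)
    cost = w_class * cost_class + w_l1 * cost_l1
    # Hungarian algorithm: globally optimal one-to-one assignment.
    query_idx, target_idx = linear_sum_assignment(cost)
    return list(zip(query_idx, target_idx))
```

Queries left unmatched by this assignment are supervised toward a "no object" class, which is how the model learns to suppress its unused queries at inference time.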

Core Capabilities

  • Object detection with competitive accuracy (36.1 AP on COCO validation)
  • Simultaneous processing of 100 object queries per image
  • Efficient feature extraction from images
  • Bounding box prediction in normalized (cx, cy, w, h) format
  • Classification over the COCO object classes
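Continuing from the quickstart above, a hedged sketch of turning the raw query outputs into thresholded, human-readable COCO detections; the 0.9 threshold is an arbitrary illustrative choice:

```python
# Continues from the quickstart: `processor`, `model`, `outputs`, and
# `image` are assumed to be in scope.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=target_sizes
)[0]

for score, label, box in zip(results["scores"], results["labels"],
                             results["boxes"]):
    name = model.config.id2label[label.item()]  # COCO class name
    print(f"{name}: {score:.2f} at {[round(v, 1) for v in box.tolist()]}")
```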

Frequently Asked Questions

Q: What makes this model unique?

YOLOS-small stands out for its simplicity: it achieves 36.1 AP on COCO validation with a pure transformer-based architecture, dispensing with the region-proposal machinery of two-stage detectors such as Faster R-CNN.

Q: What are the recommended use cases?

The model is well suited to object detection tasks involving the COCO object classes, and especially to applications that need a good balance between model size and detection accuracy.
