rtdetr_r50vd_coco_o365

Maintained By
PekingU

RT-DETR R50VD COCO O365

Property          Value
Parameter Count   42M
Model Type        Real-Time Detection Transformer
License           Apache-2.0
Paper             arXiv:2304.08069
Performance       55.3% AP on COCO

What is rtdetr_r50vd_coco_o365?

RT-DETR (Real-Time Detection Transformer) is a real-time object detection model that pairs YOLO-class inference speed with the end-to-end design of DETR. Developed by PekingU, this checkpoint uses a ResNet-50 backbone and reaches 55.3% AP on COCO while keeping real-time inference speed. It was trained on COCO with additional Objects365 pre-training.

Implementation Details

The model uses an efficient hybrid encoder that processes multi-scale features by decoupling intra-scale interaction from cross-scale fusion. It operates on 640x640 pixel inputs and uses uncertainty-minimal query selection to give the decoder high-quality initial queries, which improves detection accuracy.

  • Efficient hybrid encoder for fast multi-scale feature processing
  • Uncertainty-minimal query selection for improved accuracy
  • Flexible speed tuning through adjustable decoder layers
  • ResNet-50 backbone; 42M parameters in total
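
The query selection step listed above can be illustrated with a minimal sketch. It assumes each encoder feature carries per-class scores and that the K features with the highest best-class score become the decoder's initial queries. The function name `select_queries`, the toy scores, and the value of K are hypothetical, and the uncertainty term that the paper adds during training is not modeled here.

```python
from typing import List, Sequence


def select_queries(class_scores: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return indices of the k features with the highest best-class score."""
    # Confidence of a feature = its highest score over all classes.
    confidence = [max(scores) for scores in class_scores]
    # Rank feature indices by confidence, highest first (ties keep input order).
    ranked = sorted(range(len(confidence)), key=lambda i: confidence[i], reverse=True)
    return ranked[:k]


# Toy input: 5 encoder features, 3 classes (made-up values, not model output).
scores = [
    [0.10, 0.20, 0.05],
    [0.90, 0.05, 0.01],
    [0.30, 0.60, 0.20],
    [0.02, 0.03, 0.01],
    [0.40, 0.10, 0.75],
]
selected = select_queries(scores, k=3)
print(selected)  # [1, 4, 2]
```

Selecting a fixed number of queries keeps the decoder cost independent of how many objects appear in the image.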

Core Capabilities

  • Real-time object detection at 108 FPS on a T4 GPU
  • 55.3% AP on the COCO dataset (with Objects365 pre-training)
  • Strong results across object scales (AP_S: 37.9%, AP_M: 59.9%, AP_L: 71.8% for small, medium, and large objects)
  • No need for Non-Maximum Suppression (NMS)
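
To show what NMS-free output handling can look like, here is a minimal sketch under stated assumptions: each query yields per-class scores already in [0, 1] and a box in normalized (center x, center y, width, height) form, and a score threshold is the only filter. The function `decode_detections`, the threshold, and the toy values are hypothetical and are not the model's actual post-processing code.

```python
from typing import List, Sequence, Tuple

Detection = Tuple[int, float, Tuple[float, float, float, float]]


def decode_detections(
    class_scores: Sequence[Sequence[float]],
    boxes_cxcywh: Sequence[Sequence[float]],
    image_size: Tuple[int, int],
    threshold: float = 0.5,
) -> List[Detection]:
    """Keep queries whose best class score passes the threshold; no NMS step."""
    width, height = image_size
    detections: List[Detection] = []
    for scores, (cx, cy, w, h) in zip(class_scores, boxes_cxcywh):
        # Best class for this query and its score.
        label = max(range(len(scores)), key=lambda c: scores[c])
        score = scores[label]
        if score < threshold:
            continue
        # Convert normalized center format to absolute corner coordinates.
        box = (
            (cx - w / 2) * width,
            (cy - h / 2) * height,
            (cx + w / 2) * width,
            (cy + h / 2) * height,
        )
        detections.append((label, score, box))
    return detections


# Toy input: 3 queries, 2 classes, 640x640 image (made-up values).
scores = [[0.9, 0.1], [0.2, 0.3], [0.05, 0.8]]
boxes = [[0.5, 0.5, 0.25, 0.25], [0.1, 0.1, 0.1, 0.1], [0.25, 0.75, 0.5, 0.5]]
detections = decode_detections(scores, boxes, image_size=(640, 640))
for label, score, box in detections:
    print(label, score, box)
```

The second query is dropped because its best score (0.3) is below the threshold. No overlap-based suppression is applied to the remaining boxes.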

Frequently Asked Questions

Q: What makes this model unique?

RT-DETR is presented by its authors as the first real-time end-to-end object detector. It removes the NMS post-processing step while maintaining high accuracy: it reports better accuracy than comparable YOLO models and runs significantly faster than earlier DETR approaches.

Q: What are the recommended use cases?

This model suits real-time object detection applications where both speed and accuracy matter, such as autonomous driving, surveillance systems, and real-time video analysis. Because inference speed can be tuned by adjusting the number of decoder layers, it can be adapted to a range of hardware configurations.
