RT-DETR R50VD COCO O365
Property | Value |
---|---|
Parameter Count | 42M |
Model Type | Real-Time Detection Transformer |
License | Apache-2.0 |
Paper | arxiv:2304.08069 |
Performance | 55.3% AP on COCO |
What is rtdetr_r50vd_coco_o365?
RT-DETR (Real-Time Detection Transformer) is a real-time object detection model that combines the inference speed associated with YOLO detectors with the end-to-end prediction of DETR. Developed by PekingU, this checkpoint reaches 55.3% AP on COCO while keeping real-time inference speeds. It uses a ResNet-50 backbone (the r50vd in the model name) and was pre-trained on Objects365 before being fine-tuned on COCO.
Implementation Details
The model uses an efficient hybrid encoder that processes multi-scale features by decoupling intra-scale interaction from cross-scale fusion. It takes 640x640 pixel inputs and picks the initial decoder queries with uncertainty-minimal query selection, which is intended to give the decoder higher-quality starting queries and so improve detection accuracy.
- Efficient hybrid encoder for fast multi-scale feature processing
- Uncertainty-minimal query selection for improved accuracy
- Flexible speed tuning through adjustable decoder layers
- ResNet-50 backbone; 42M parameters for the full model
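The query selection idea can be illustrated with a deliberately simplified sketch: rank the encoder tokens by their most confident class score and keep the top K as initial decoder queries. The function name select_top_k_queries and the default K of 300 are assumptions made for illustration, and the uncertainty-aware objective that gives the method its name is not reproduced here. Only the ranking step is shown.

```python
import heapq


def select_top_k_queries(class_scores, k=300):
    """Return indices of the k tokens with the highest best-class score.

    class_scores: one list of class probabilities per encoder token.
    k: number of queries to keep (300 is an assumed default).
    """
    # Score each token by its most confident class.
    best = [max(scores) for scores in class_scores]
    # nlargest returns indices ordered from highest to lowest score.
    k = min(k, len(best))
    return heapq.nlargest(k, range(len(best)), key=best.__getitem__)


# Toy example: four encoder tokens, two classes.
scores = [
    [0.10, 0.20],
    [0.90, 0.05],
    [0.30, 0.60],
    [0.02, 0.01],
]
print(select_top_k_queries(scores, k=2))  # [1, 2]
```

Tokens 1 and 2 are kept because their best class scores (0.90 and 0.60) are the two highest.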
Core Capabilities
- Real-time object detection at 108 FPS on a T4 GPU
- 55.3% AP on COCO dataset (with Objects365 pre-training)
- Per-scale accuracy on COCO: 37.9% AP on small objects (APS), 59.9% on medium objects (APM), and 71.8% on large objects (APL)
- No need for Non-Maximum Suppression (NMS)
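Because the model emits a fixed set of predictions, post-processing can reduce to a score threshold and a box-format conversion, with no pairwise overlap suppression. The sketch below illustrates that idea in plain Python. The function name postprocess, the 0.5 threshold, and the normalized (cx, cy, w, h) box layout are assumptions made for illustration (the layout follows the usual DETR convention); this is not the model's actual post-processing code.

```python
def postprocess(class_probs, boxes_cxcywh, image_w, image_h, threshold=0.5):
    """Turn per-query outputs into (label, score, (x1, y1, x2, y2)) tuples.

    class_probs: one list of class probabilities per query.
    boxes_cxcywh: one normalized (cx, cy, w, h) box per query (assumed layout).
    """
    detections = []
    for probs, (cx, cy, w, h) in zip(class_probs, boxes_cxcywh):
        score = max(probs)
        if score < threshold:
            continue  # drop low-confidence queries; no NMS pass follows
        label = probs.index(score)
        # Convert center format to corner format and scale to pixels.
        x1 = (cx - w / 2) * image_w
        y1 = (cy - h / 2) * image_h
        x2 = (cx + w / 2) * image_w
        y2 = (cy + h / 2) * image_h
        detections.append((label, score, (x1, y1, x2, y2)))
    # Highest-scoring detections first.
    detections.sort(key=lambda d: d[1], reverse=True)
    return detections


probs = [[0.1, 0.8], [0.3, 0.2]]
boxes = [(0.5, 0.5, 0.25, 0.5), (0.25, 0.25, 0.5, 0.5)]
print(postprocess(probs, boxes, image_w=640, image_h=480))
# [(1, 0.8, (240.0, 120.0, 400.0, 360.0))]
```

Only the first query survives the threshold here; the second (best score 0.3) is discarded outright instead of being compared against overlapping boxes.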
Frequently Asked Questions
Q: What makes this model unique?
According to its paper (arxiv:2304.08069), RT-DETR is the first real-time end-to-end object detector. Because it predicts a fixed set of detections directly, it needs no NMS post-processing step. The paper reports higher accuracy than YOLO detectors of comparable scale and much faster inference than earlier DETR approaches.
Q: What are the recommended use cases?
This model suits real-time object detection applications where both speed and accuracy matter, such as autonomous driving, surveillance systems, and real-time video analysis. Because inference speed can be tuned by adjusting the number of decoder layers, it can be adapted to different hardware budgets.
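When sizing a deployment, it can help to convert throughput into a per-frame time budget. The short sketch below does that arithmetic for the 108 FPS figure quoted above against a 30 FPS video stream. The helper names and the 30 FPS stream rate are assumptions for illustration, and the 108 FPS figure applies to a T4 GPU, so other hardware needs its own measurement.

```python
def frame_time_ms(fps):
    """Milliseconds available (or spent) per frame at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def headroom_ms(model_fps, stream_fps):
    """Per-frame time left for decoding, tracking, etc. after inference."""
    return frame_time_ms(stream_fps) - frame_time_ms(model_fps)


print(round(frame_time_ms(108), 2))    # 9.26
print(round(headroom_ms(108, 30), 2))  # 24.07
```

A negative headroom would mean the detector cannot keep up with the stream at that frame rate.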