rtdetr_r50vd_coco_o365

PekingU

RT-DETR is a real-time object detector that combines DETR and YOLO capabilities. The ResNet-50 variant achieves 53.1% AP on COCO at 108 FPS when trained on COCO alone; this checkpoint is additionally pre-trained on Objects365 and reaches 55.3% AP.

Property | Value
Parameter Count | 42M
Model Type | Real-Time Detection Transformer
License | Apache-2.0
Paper | arxiv:2304.08069
Performance | 55.3% AP on COCO

What is rtdetr_r50vd_coco_o365?

RT-DETR is a real-time object detection model that combines the efficiency of YOLO with the end-to-end capabilities of DETR. Developed by PekingU, it reaches 55.3% AP on COCO while running at 108 FPS on a T4 GPU. The model uses a ResNet-50 backbone and has been pre-trained on both the COCO and Objects365 datasets.

Implementation Details

The model implements an efficient hybrid encoder that processes multi-scale features by separating intra-scale interaction from cross-scale fusion. It operates on 640x640 pixel inputs and uses uncertainty-minimal query selection to supply high-quality initial object queries to the decoder.

  • Efficient hybrid encoder for fast multi-scale feature processing
  • Uncertainty-minimal query selection for improved accuracy
  • Flexible speed tuning through adjustable decoder layers
  • ResNet-50 backbone; 42M parameters in total
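
To make the query selection step more concrete, here is a minimal sketch in plain Python. It assumes a simplified view in which selection at inference keeps the K encoder tokens with the highest class score, and the uncertainty term only influences training. The function name select_queries and the toy logits are hypothetical and not taken from the RT-DETR code base.

```python
from typing import List


def select_queries(class_logits: List[List[float]], k: int) -> List[int]:
    """Return the indices of the k tokens with the highest best-class logit.

    Simplifying assumption: a token's confidence is its maximum class logit.
    """
    confidence = [max(row) for row in class_logits]
    # Sort token indices by confidence, most confident first.
    order = sorted(range(len(confidence)), key=lambda i: confidence[i], reverse=True)
    return order[:k]


# Toy input: 5 encoder tokens, 3 classes (made-up values).
logits = [
    [0.1, -1.2, 0.3],
    [2.5, 0.0, -0.7],
    [-0.4, -0.9, -2.0],
    [0.2, 1.8, 0.1],
    [-3.0, -2.5, -1.0],
]
selected = select_queries(logits, k=2)
print(selected)
```

In this toy case the two tokens with the strongest class evidence are kept as initial queries; a real model would select a few hundred tokens from thousands of multi-scale features.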

Core Capabilities

  • Real-time object detection at 108 FPS on T4 GPU
  • 55.3% AP on COCO dataset (with Objects365 pre-training)
  • Excellent performance across different object scales (APS: 37.9%, APM: 59.9%, APL: 71.8%)
  • No need for Non-Maximum Suppression (NMS)
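
The NMS-free property can be illustrated with a short post-processing sketch. It assumes the common DETR-style output layout: per-query class logits plus boxes in normalized (center x, center y, width, height) form. The function postprocess, its threshold and top_k parameters, and the toy values are illustrative assumptions, not the model's actual API.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]
Detection = Tuple[int, float, Box]  # (class label, score, xyxy box in pixels)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def postprocess(
    logits: Sequence[Sequence[float]],
    boxes: Sequence[Box],
    image_w: int,
    image_h: int,
    threshold: float = 0.5,
    top_k: int = 300,
) -> List[Detection]:
    """Turn raw query outputs into detections without any NMS step."""
    num_classes = len(logits[0])
    # Score every (query, class) pair independently with a sigmoid.
    scored = [
        (sigmoid(logit), q * num_classes + c)
        for q, row in enumerate(logits)
        for c, logit in enumerate(row)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    detections: List[Detection] = []
    for score, flat_index in scored[:top_k]:
        if score < threshold:
            break  # list is sorted, so all remaining scores are lower
        q, label = divmod(flat_index, num_classes)
        cx, cy, w, h = boxes[q]
        # Convert normalized center format to pixel corner format.
        box = (
            (cx - w / 2) * image_w,
            (cy - h / 2) * image_h,
            (cx + w / 2) * image_w,
            (cy + h / 2) * image_h,
        )
        detections.append((label, score, box))
    return detections


# Toy example: 3 queries, 2 classes, 640x480 image (made-up values).
toy_logits = [[2.0, -3.0], [-4.0, -5.0], [-1.0, 3.0]]
toy_boxes = [(0.5, 0.5, 0.2, 0.4), (0.1, 0.1, 0.1, 0.1), (0.25, 0.75, 0.5, 0.5)]
detections = postprocess(toy_logits, toy_boxes, image_w=640, image_h=480)
for label, score, box in detections:
    print(label, round(score, 3), box)
```

No overlap-based suppression appears anywhere in this pipeline: DETR-style training matches each ground-truth object to a single query, so duplicate boxes are discouraged during training and do not need to be filtered afterwards.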

Frequently Asked Questions

Q: What makes this model unique?

According to its authors, RT-DETR is the first real-time end-to-end object detector: it eliminates the need for NMS while maintaining high accuracy. It is reported to be more accurate than YOLO detectors of comparable scale and significantly faster than earlier DETR variants.

Q: What are the recommended use cases?

This model is ideal for real-time object detection applications where both speed and accuracy are crucial, such as autonomous driving, surveillance systems, and real-time video analysis. Its flexible speed tuning makes it adaptable to various hardware configurations.
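
As a rough worked example of what the reported throughput means for video, the arithmetic below converts FPS into per-frame time. The 108 FPS figure is the one quoted above for a T4 GPU; the 30 FPS video stream is an assumed scenario, and results on other hardware will differ.

```python
def frame_time_ms(fps: float) -> float:
    """Time available (or spent) per frame, in milliseconds."""
    return 1000.0 / fps


model_ms = frame_time_ms(108)   # reported model throughput on a T4 GPU
budget_ms = frame_time_ms(30)   # assumed 30 FPS input stream
headroom_ms = budget_ms - model_ms

print(f"inference: {model_ms:.2f} ms per frame")
print(f"budget:    {budget_ms:.2f} ms per frame")
print(f"headroom:  {headroom_ms:.2f} ms for decoding, pre- and post-processing")
```

Under these assumptions, inference uses a bit more than a quarter of the per-frame budget, which leaves room for the rest of the pipeline or for slower hardware.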
