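Abstract: To address the poor performance of vehicle and pedestrian detection in complex scenes, an improved RT-DETR (Real-Time Detection Transformer) model is proposed for localizing vehicles and pedestrians in complex environments. Learnable position encoding strengthens the model's perception of spatial relationships and target positions; deformable convolution (DCN) is introduced to better capture the features of targets with complex scale variations; and an LSK (Large Selection Kernel) block is adopted in the feature fusion stage to guide efficient cross-scale feature fusion. Experimental results show that the proposed model reaches an mAP@0.5 of 96.1% on the KITTI dataset, 2.8% higher than the baseline model, confirming the effectiveness of the improved model for vehicle and pedestrian detection.
Keywords: autonomous driving; RT-DETR; learnable position encoding; deformable convolution; LSK block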
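CLC number:
Document code: A
Funding: Research on the biomechanical properties of the cruciate ligament and the pathogenesis of knee osteoarthritis based on finite element analysis (grant No. 黔科合基础-ZK[2023]一般052)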
Vehicle and Pedestrian Detection Algorithm Based on Improved RT-DETR
GU Shuotian1, WU Jiang1, WANG Liang2
(1. School of Information Science and Engineering, Zhejiang Sci-Tech University, Hangzhou 310008, China; 2. Zhejiang Shannon Communication Technology Co., Ltd., Hangzhou 310011, China)
576967638@qq.com; wujiang@zstu.edu.cn; 13401157573@139.com
Abstract: To address the suboptimal detection of vehicles and pedestrians in complex scenarios, this paper proposes an improved RT-DETR (Real-Time Detection Transformer) model for localizing vehicles and pedestrians in challenging environments. The model uses learnable position encoding to strengthen its perception of spatial relationships and target locations, incorporates deformable convolution (DCN) to better capture features of targets with complex scale variations, and employs a Large Selection Kernel (LSK) block during feature fusion to guide efficient cross-scale feature integration. Experimental results show that the proposed model achieves 96.1% mAP@0.5 on the KITTI dataset, a 2.8% improvement over the baseline model, which validates the effectiveness of the improved model for vehicle and pedestrian detection.
Keywords: autonomous driving; RT-DETR; learnable position encoding; deformable convolution; LSK block