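Abstract: 3D object detection is a foundation of autonomous driving, and fusing information from modalities such as LiDAR point clouds and images can effectively improve detection accuracy and robustness. This paper improves on existing 3D object detection networks that fuse LiDAR point clouds and images, proposing a new parallel fusion module that maintains feature information in both modalities simultaneously and thereby reduces information loss. In addition, a mask feature enhancement module is used to improve the detection of occluded objects. The method is validated on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset. Experimental results show that, compared with the baseline network, the method effectively improves 3D object detection performance, reaching an average precision of 77.41%, and outperforms most current state-of-the-art methods.
Keywords: 3D object detection; multimodal fusion; deformable attention mechanism
CLC number: TP391
Document code: A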
Foundation item: Supported by the National Key Research and Development Program of China (2020YFB1600702)
Multimodal 3D Object Detection Algorithm Based on Deformable Attention Mechanisms
HAN Bangyan, TIAN Qing
(North China University of Technology, Beijing 100144, China)
hanbangyan@163.com; tianqing@ncut.edu.cn
Abstract: 3D object detection is fundamental to autonomous driving. Fusing information from modalities such as LiDAR point clouds and images can effectively improve the accuracy and robustness of object detection. This paper improves on existing 3D object detection networks that fuse LiDAR point clouds and images by proposing a new parallel fusion module that maintains feature information from both modalities simultaneously, thereby reducing information loss. In addition, a mask feature enhancement module is used to improve the detection of occluded objects. The method is validated on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset. Experimental results show that, compared with the baseline network, the proposed method effectively improves 3D object detection performance, achieving an average precision of 77.41%, and outperforms most current state-of-the-art methods.
Keywords: 3D object detection; multimodal fusion; deformable attention mechanism