Thesis information

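Thesis title (Chinese):	基于单目视觉的自动驾驶3D目标检测算法研究
Name:	Yan Weijia (颜唯佳)
Student ID:	21208223047
Confidentiality level:	Public
Thesis language:	chi
Discipline code:	085400
Discipline name:	Engineering - Electronic Information
Student type:	Master's
Degree level:	Master of Engineering
Degree year:	2024
Degree-granting institution:	Xi'an University of Science and Technology
College:	College of Computer Science and Technology
Major:	Software Engineering
Research direction:	Image processing
First supervisor name:	She Xiangyang (厍向阳)
First supervisor affiliation:	Xi'an University of Science and Technology
Thesis submission date:	2024-06-14
Thesis defense date:	2024-05-30
Thesis title (English):	Research on 3D Object Detection Algorithm for Autonomous Driving Based on Monocular Vision
Keywords (Chinese):	自动驾驶 ; 3D目标检测 ; 单目视觉 ; 低光增强
Keywords (English):	Autonomous driving ; 3D object detection ; monocular vision ; low-light enhancement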
Abstract (Chinese):	3D目标检测作为自动驾驶感知系统的核心技术，通过深入解析车辆周围环境，为后续的规划决策和执行阶段提供关键的数据支持。基于单目视觉的3D目标检测算法具有成本低和易落地部署等优势，但是在目标被遮挡、光照不足等环境下，对目标的全面检测仍存在限制。基于此，本文对单目3D目标检测算法进行深入研究，主要研究工作如下：（1）针对现有单目3D目标检测算法存在遮挡导致漏检、多尺度目标检测效果不佳的问题，提出了一种基于Contextual Transformer的单目3D目标检测算法（CM-RTM3D）。首先，在ResNet-50网络中引入Contextual Transformer，构建ResNet-Transformer架构以提取特征。其次，设计多尺度空间感知模块，通过尺度空间响应操作改善浅层特征的丢失情况，嵌入沿水平和竖直两个空间方向的坐标注意力机制学习精细特征，并使用softmax函数生成各尺度的重要性软权重。最后，在偏移损失中采用Huber损失函数提高对遮挡等异常情况的敏感度。实验结果表明，相较于RTM3D算法，所提算法在简单、中等、困难三个难度级别下，AP_3D分别提升了4.84%、3.82%、5.36%，AP_BEV分别提升了4.75%、6.26%、3.56%。（2）针对夜间低光条件下目标特征提取困难导致检测准确率较低的问题，提出了一种基于改进IAT低光增强的夜间3D目标检测算法。首先，在局部分支中设计空间像素增强网络，通过空间选择模块和空间融合模块动态调整空间感受野，以处理不同亮度和色彩分布下的图像。其次，在全局分支中构建频谱层，捕获初始层的相关特征和图像不同频率成分。最后，设计联合损失函数优化网络，采用MS-SSIM损失和L₁损失共同增强图像的细节和亮度，并引入颜色恒常损失控制色调，提高图像的增强效果。实验结果表明，改进的IAT低光增强算法在图像结构信息和纹理保持上均具有更佳的表现；与CM-RTM3D算法融合后，所提算法的AP_3D和AP_BEV分别提升了5.94%和4.59%，改善了单目3D目标检测在低光环境中的性能。（3）在上述研究内容的基础上，设计并实现基于单目视觉的自动驾驶交通预警系统。该系统能够实时检测道路交通情况，并对危险事件及时发出预警，将检测到的预警信息上传到基于Web开发的信息管理系统中，实时动态分析行驶车辆周围的情况，降低交通事故的发生率。
Abstract (English):	3D object detection is a core technology of autonomous driving perception systems: by analyzing the vehicle's surroundings in depth, it provides key data for the subsequent planning, decision-making, and execution stages. Monocular 3D object detection algorithms are low-cost and easy to deploy, but their ability to detect all targets remains limited when targets are occluded or lighting is insufficient. This thesis therefore studies monocular 3D object detection algorithms in depth. The main work is as follows:
(1) To address missed detections caused by occlusion and poor multi-scale detection in existing monocular 3D object detection algorithms, a monocular 3D object detection algorithm based on Contextual Transformer (CM-RTM3D) is proposed. First, Contextual Transformer is introduced into the ResNet-50 network to build a ResNet-Transformer architecture for feature extraction. Second, a multi-scale spatial perception module is designed: a scale-space response operation mitigates the loss of shallow features, a coordinate attention mechanism along the horizontal and vertical spatial directions is embedded to learn fine-grained features, and the softmax function generates soft importance weights for each scale. Finally, the Huber loss function is adopted in the offset loss to improve sensitivity to anomalies such as occlusion. Experimental results show that, compared with the RTM3D algorithm, the proposed algorithm improves AP_3D by 4.84%, 3.82%, and 5.36%, and AP_BEV by 4.75%, 6.26%, and 3.56%, at the easy, moderate, and hard difficulty levels, respectively.
(2) To address the low detection accuracy caused by the difficulty of extracting target features under low-light conditions at night, a nighttime 3D object detection algorithm based on improved IAT low-light enhancement is proposed. First, a spatial pixel enhancement network is designed in the local branch; its spatial selection module and spatial fusion module dynamically adjust the spatial receptive field to handle images with different brightness and color distributions. Second, a spectral layer is constructed in the global branch to capture the relevant features of the initial layers and the different frequency components of the image. Finally, a joint loss function is designed to optimize the network: the MS-SSIM loss and the L₁ loss together enhance image detail and brightness, and a color constancy loss is introduced to control hue, improving the enhancement result. Experimental results show that the improved IAT low-light enhancement algorithm performs better in preserving both image structural information and texture; after fusion with the CM-RTM3D algorithm, the proposed algorithm improves AP_3D and AP_BEV by 5.94% and 4.59%, respectively, improving the performance of monocular 3D object detection in low-light environments.
(3) Building on the above work, a monocular vision-based traffic warning system for autonomous driving is designed and implemented. The system detects road traffic conditions in real time, issues timely warnings for dangerous events, and uploads the detected warning information to a Web-based information management system, which dynamically analyzes the situation around the moving vehicle in real time to reduce the incidence of traffic accidents.
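
Two of the building blocks named in the abstract, the Huber loss used in the offset loss and the softmax-generated soft weights over scales, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `huber_loss`, `softmax_weights`, and `fuse_scales`, the threshold `delta=1.0`, and the use of plain Python lists in place of feature maps are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def huber_loss(pred: Sequence[float], target: Sequence[float],
               delta: float = 1.0) -> float:
    """Mean Huber loss: quadratic for |r| <= delta, linear beyond it.

    delta is a hypothetical threshold; the abstract does not state one.
    """
    total = 0.0
    for p, t in zip(pred, target):
        r = abs(p - t)
        if r <= delta:
            total += 0.5 * r * r
        else:
            total += delta * (r - 0.5 * delta)
    return total / len(pred)


def softmax_weights(scores: Sequence[float]) -> List[float]:
    """Turn one importance score per scale into soft weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def fuse_scales(features: Sequence[Sequence[float]],
                scores: Sequence[float]) -> List[float]:
    """Weighted sum of same-length per-scale feature vectors (illustrative)."""
    w = softmax_weights(scores)
    return [sum(wi * f[k] for wi, f in zip(w, features))
            for k in range(len(features[0]))]
```

With `delta=1.0`, a residual of 3 contributes 2.5 to the Huber sum rather than the 4.5 that half the squared error would give, so large offset errors are penalized linearly instead of quadratically. In `fuse_scales`, equal scores give every scale the same weight, and raising one score shifts the fused feature toward that scale.
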
Chinese Library Classification (CLC) number:	TP391
Release date:	2024-06-14
