Thesis title (Chinese): | 洗煤厂环境下的计算机视觉行人定位技术研究 |
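Name: | |
Student ID: | 20207223106 |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 085400 |
Discipline name: | Engineering - Electronic Information |
Student type: | Master's |
Degree level: | Master of Engineering |
Degree year: | 2023 |
Training institution: | Xi'an University of Science and Technology |
School/Department: | |
Major: | |
Research direction: | Computer vision |
First supervisor name: | |
First supervisor affiliation: | |
Thesis submission date: | 2023-06-15 |
Thesis defense date: | 2023-05-30 |
Thesis title (foreign language): | Research on Computer Vision Pedestrian Localization Technology in Coal Washing Plant Environment |
Keywords (Chinese): | |
Keywords (foreign language): | Deep learning; Target detection; YOLOv5s; Visual positioning |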
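Abstract (Chinese): |
Surveillance cameras in coal washing plants provide monitoring only and cannot extract the spatial position of pedestrians from the video images. Deep learning and computer vision methods can recover that position, so that accurate location information is available quickly and an alarm can be raised in an emergency. This thesis studies computer vision pedestrian localization in the coal washing plant environment from two aspects: pedestrian detection and monocular visual localization. The specific work is as follows:
For pedestrian detection, complex background interference in the coal washing plant lowers detection accuracy. This thesis improves the YOLOv5s detection algorithm in two ways. First, the channel attention mechanism ECA module is integrated into the backbone feature extraction network, which strengthens the network's ability to extract local features of pedestrian targets. Second, a new bounding box regression loss function, SIoU, is introduced, which gives a further gain in model performance. Experiments show that the improved YOLOv5s raises recall and mAP by 2.21% and 2.13%, respectively.
For monocular visual localization, this thesis builds and optimizes a coordinate-mapping localization model. A 3D point cloud is constructed with LiDAR, and the camera intrinsic and extrinsic parameters are obtained with Zhang Zhengyou's calibration method and an auxiliary-point method. A computer-vision-based coordinate mapping is then established between the camera's monitored area and the corresponding area of the 3D point cloud, so that any pixel in the 2D image can be located in the 3D point cloud space. To address the low accuracy of monocular visual localization, an improved particle swarm optimization algorithm is used to optimize the extrinsic parameters of the model. Experiments show that, for static test points, the mean absolute error (MAE) falls from 58.64 cm to 15.88 cm and the root mean square error (RMSE) falls from 63.41 cm to 18.29 cm.
The pixel coordinates obtained from pedestrian detection are fed into the optimized coordinate-mapping model to compute the pedestrian's coordinates in the 3D point cloud, which realizes computer vision pedestrian localization in the coal washing plant environment. Experiments show that, in the dynamic pedestrian test, the MAE falls from 61.84 cm to 20.20 cm and the RMSE falls from 69.60 cm to 22.06 cm. The algorithm studied in this thesis achieves 3D visual localization of pedestrians in the 3D point cloud space, which underlines the practical value of computer vision pedestrian localization in coal washing plants. |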
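As an illustration of the pixel-to-space mapping described in the abstract, the sketch below back-projects one pixel through a pinhole camera onto a flat ground plane. It is a simplified stand-in under stated assumptions, not the thesis's model: it assumes zero skew, no lens distortion, and a horizontal floor at Z = 0 in place of a LiDAR point cloud. The function name `pixel_to_ground` and all parameter names are hypothetical, and using the bottom-centre of a detected bounding box as the input pixel is likewise an assumption.

```python
def pixel_to_ground(u, v, fx, fy, cx, cy, R, t, ground_z=0.0):
    """Back-project pixel (u, v) onto the horizontal plane Z = ground_z.

    Assumed pinhole model: x_cam = R @ X_world + t, zero skew, no distortion.
    R is a 3x3 nested list (world -> camera rotation), t is a length-3 list.
    """
    # Ray direction in the camera frame: K^-1 [u, v, 1]^T for a skew-free K.
    d_cam = [(u - cx) / fx, (v - cy) / fy, 1.0]
    # Rotate the ray into the world frame: d_world = R^T d_cam.
    d_world = [sum(R[r][c] * d_cam[r] for r in range(3)) for c in range(3)]
    # Camera centre in the world frame: C = -R^T t.
    centre = [-sum(R[r][c] * t[r] for r in range(3)) for c in range(3)]
    if abs(d_world[2]) < 1e-12:
        raise ValueError("ray is parallel to the ground plane")
    # Scale factor at which the ray C + lam * d_world reaches Z = ground_z.
    lam = (ground_z - centre[2]) / d_world[2]
    if lam <= 0:
        raise ValueError("ground plane is behind the camera")
    return [centre[i] + lam * d_world[i] for i in range(3)]


# Hypothetical setup: camera 3 m above the floor, looking straight down.
R_down = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
t_down = [0.0, 0.0, 3.0]
point = pixel_to_ground(740.0, 360.0, 1000.0, 1000.0, 640.0, 360.0, R_down, t_down)
```

With these made-up values the pixel lies 100 px to the right of the principal point, so the returned point is about 0.3 m along the world X axis on the floor. In a real scene the intrinsics and extrinsics would come from calibration, and the flat-floor assumption would be replaced by the scene geometry.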
Abstract (foreign language): |
The surveillance cameras in coal washing plants provide monitoring only and cannot extract the spatial location of pedestrians from video images. Deep learning and computer vision methods can obtain this location, so that in an emergency accurate position information can be provided quickly and an alarm can be issued. This thesis studies computer vision pedestrian positioning technology in the coal washing plant environment from two aspects: pedestrian object detection and monocular visual positioning. The specific work is as follows: In the pedestrian detection part, to address the low detection accuracy caused by complex background interference in the coal washing plant environment, this thesis improves the YOLOv5s detection algorithm in two respects. First, the ECA attention module is integrated into the backbone feature extraction network to enhance the network's ability to extract local features of pedestrian targets. Second, a new bounding box regression loss function, SIoU, is introduced to improve the performance of the model. The experimental results show that the improved YOLOv5s detection algorithm increases recall and mAP by 2.21% and 2.13%, respectively. In the monocular visual positioning part, this thesis constructs and optimizes a coordinate mapping positioning model. LiDAR is used to construct a 3D point cloud, and the camera's intrinsic and extrinsic parameters are obtained through Zhang Zhengyou's calibration method and an auxiliary point method. Based on computer vision, a coordinate mapping positioning model is established between the camera monitoring area and the 3D point cloud monitoring area, so that any pixel in the 2D image can be positioned in the 3D point cloud space.
To address the low positioning accuracy of monocular vision, this thesis uses an improved particle swarm optimization algorithm to optimize the extrinsic parameters of the coordinate mapping positioning model. The experimental results show that the mean absolute error (MAE) of the static test points is reduced from 58.64 cm to 15.88 cm, and the root mean square error (RMSE) is reduced from 63.41 cm to 18.29 cm. Finally, the pixel coordinates obtained from pedestrian detection are input into the optimized coordinate mapping positioning model, and the coordinates of pedestrians in the 3D point cloud are calculated, achieving computer vision pedestrian positioning in the coal washing plant environment. The experimental results show that the MAE in the dynamic pedestrian test decreases from 61.84 cm to 20.20 cm, and the RMSE decreases from 69.60 cm to 22.06 cm. The computer vision pedestrian positioning algorithm studied in this thesis achieves three-dimensional visual positioning of pedestrians in the 3D point cloud space, which underlines the practical value of this technology in the coal washing plant environment. |
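The MAE and RMSE figures quoted above can be computed along the lines of the following sketch. It assumes the per-point error is the Euclidean distance between an estimated position and its measured reference position; the abstract does not state the exact error definition, so that choice is an assumption, and the function name `localization_errors` is hypothetical.

```python
import math


def localization_errors(estimated, reference):
    """Return (MAE, RMSE) of the Euclidean distances between paired points.

    Both arguments are equal-length sequences of coordinate tuples,
    expressed in the same unit (for example centimetres).
    """
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("need two non-empty point lists of equal length")
    # Per-point positioning error (assumed definition: straight-line distance).
    dists = [math.dist(p, q) for p, q in zip(estimated, reference)]
    mae = sum(dists) / len(dists)
    rmse = math.sqrt(sum(d * d for d in dists) / len(dists))
    return mae, rmse
```

Because RMSE squares each error before averaging, it is never smaller than MAE and grows faster when a few points are badly mislocated, which is consistent with the RMSE values in the abstract being slightly above the corresponding MAE values.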
Chinese Library Classification number: | TP391 |
Open-access date: | 2023-06-15 |