Thesis Information

Thesis title (Chinese):

 城市道路环境下无人车语义SLAM技术研究    

Author:

 雷磊 (Lei Lei)

Student ID:

 20205016028    

Confidentiality level:

 Classified (open after 1 year)

Language:

 Chinese (chi)

Discipline code:

 080201    

Discipline:

 Engineering - Mechanical Engineering - Mechanical Manufacturing and Automation

Student type:

 Master's

Degree:

 Master of Engineering

Degree year:

 2023    

Institution:

 Xi'an University of Science and Technology (西安科技大学)

School:

 School of Mechanical Engineering

Major:

 Mechanical Engineering

Research direction:

 Intelligent vehicles

Primary supervisor:

 张传伟 (Zhang Chuanwei)

Supervisor's institution:

 Xi'an University of Science and Technology (西安科技大学)

Submission date:

 2023-06-14    

Defense date:

 2023-05-29    

Thesis title (English):

 Research on Semantic SLAM Technology of Unmanned Vehicle in Urban Road Environment    

Keywords (Chinese):

 无人车 (unmanned vehicle); 语义SLAM (semantic SLAM); 多传感器融合 (multi-sensor fusion); 语义分割 (semantic segmentation)

Keywords (English):

 Unmanned vehicle; semantic SLAM (simultaneous localization and mapping); multi-sensor fusion; semantic segmentation

Abstract (translated from the Chinese):

Simultaneous Localization and Mapping (SLAM) is a key technology for achieving driverless operation on urban roads. Urban road environments are rich in features, which benefits the autonomous localization of unmanned vehicles, but dynamic objects in the environment degrade localization accuracy and map construction. This thesis combines a camera, a LiDAR, and an Inertial Measurement Unit (IMU), detecting and removing dynamic objects in the environment through semantic segmentation, to achieve localization and mapping for an unmanned vehicle in dynamic scenes. The main research content is as follows:

(1) The sensor models of the monocular camera, the LiDAR, and the IMU are studied, and suitable models are selected for the target scenario. The camera intrinsics and distortion coefficients are calibrated with Zhang's method, the LiDAR-IMU extrinsics with hand-eye calibration, and the LiDAR-camera extrinsics with a calibration tool.

(2) A LiDAR-inertial SLAM framework is designed on the basis of the LOAM algorithm. IMU preintegration removes motion distortion from the point clouds; ground segmentation and point cloud clustering reject noise points and reduce computational complexity; a factor-graph back end optimizes the motion trajectory; and local maps are incrementally matched into the global map. The algorithm is tested on the KITTI dataset against LOAM and LeGo_LOAM, and the results show that the designed LiDAR-inertial SLAM framework achieves higher accuracy and robustness.

(3) To improve the performance of image semantic segmentation and make it better suited to building 3D semantic maps of large-scale environments, an improved image segmentation algorithm is proposed: replacing the backbone network gives a model with fewer parameters and higher speed, and an attention module is added to strengthen it. Spatio-temporal synchronization guarantees that the sensors observe consistent environment information at each keyframe, and the mapping between LiDAR points and camera pixels yields semantic segmentation of single-frame point clouds. Dynamic obstacles are detected and removed with a surfel-based geometric consistency check, and the image semantic segmentation is integrated into the LiDAR-inertial SLAM system to form a 3D semantic SLAM framework.

The LiDAR-inertial SLAM system and the 3D semantic SLAM system are validated in a campus environment and an urban road environment. Compared with LOAM, the LI_Odom algorithm lowers the localization error rate by 0.88% and 0.92% in the two environments, respectively; compared with LeGo_LOAM, by 0.79% and 0.62%; and LIS_SLAM lowers it by a further 0.31% and 0.42% relative to LI_Odom.

Abstract (English):

Simultaneous Localization and Mapping (SLAM) is a key technology for achieving unmanned driving on urban roads. The urban road environment is rich in features, which is conducive to the autonomous positioning of unmanned vehicles, but dynamic objects in the environment degrade positioning accuracy and map construction. This thesis adopts a combination of a camera, a LiDAR, and an inertial measurement unit (IMU), detecting and eliminating dynamic objects in the environment through semantic segmentation, so as to achieve positioning and mapping for an unmanned vehicle in dynamic scenes. The main research contents are as follows:

(1) The sensor models of the monocular camera, the LiDAR, and the IMU were studied, and appropriate models were selected for the research scenario. Zhang's calibration method was used to calibrate the camera's intrinsic parameters and distortion coefficients, hand-eye calibration was used to obtain the LiDAR-IMU extrinsic parameters, and a calibration tool was used to obtain the LiDAR-camera extrinsic parameters.
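
To make the calibration step concrete, here is a minimal sketch of Zhang's method using OpenCV's standard checkerboard workflow; the board dimensions, square size, and image directory are illustrative assumptions, not values taken from the thesis.

```python
# Minimal sketch of Zhang's camera calibration with OpenCV.
# Board dimensions, square size, and image paths are illustrative assumptions.
import glob
import cv2
import numpy as np

pattern = (9, 6)   # inner corners of an assumed 9x6 checkerboard
square = 0.025     # assumed square size in meters

# 3D corner coordinates on the board plane (Z = 0).
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_points, img_points = [], []
for path in glob.glob("calib/*.png"):  # hypothetical image directory
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# K is the 3x3 intrinsic matrix; dist holds the distortion coefficients.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)
```

The same board detections can also serve extrinsic calibration, since Zhang's method additionally yields a board pose (rvecs, tvecs) per view.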

(2) A LiDAR-inertial SLAM framework was designed based on the LOAM algorithm. IMU preintegration was used to eliminate point cloud motion distortion, point cloud ground segmentation and clustering were used to remove noise points and reduce computational complexity, a factor graph back end was used to optimize the motion trajectory, and the local map was incrementally matched to the global map. The algorithm was tested on the KITTI dataset and compared with the LOAM and LeGo_LOAM algorithms; the experimental results show that the designed LiDAR-inertial SLAM framework has higher accuracy and robustness.
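
As a hedged sketch of what such a factor-graph back end looks like in practice, the toy pose graph below uses GTSAM's Python bindings (one common factor-graph library; the thesis does not name its solver here) with a prior, odometry between-factors, and one loop-closure factor. The poses and noise sigmas are invented for illustration and merely stand in for the LiDAR-inertial odometry constraints described above.

```python
# Toy pose graph (2D for brevity); measurements and sigmas are illustrative.
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.2, 0.2, 0.1]))

# Anchor the first pose, then chain odometry between-factors.
graph.add(gtsam.PriorFactorPose2(0, gtsam.Pose2(0, 0, 0), prior_noise))
graph.add(gtsam.BetweenFactorPose2(0, 1, gtsam.Pose2(2, 0, 0), odom_noise))
graph.add(gtsam.BetweenFactorPose2(1, 2, gtsam.Pose2(2, 0, np.pi / 2), odom_noise))
# A loop closure (e.g. from scan matching) constrains pose 2 back to pose 0.
graph.add(gtsam.BetweenFactorPose2(2, 0, gtsam.Pose2(0, 4, -np.pi / 2), odom_noise))

# Initial guesses, deliberately perturbed; the optimizer corrects them.
initial = gtsam.Values()
initial.insert(0, gtsam.Pose2(0.1, -0.1, 0.05))
initial.insert(1, gtsam.Pose2(2.2, 0.1, -0.05))
initial.insert(2, gtsam.Pose2(4.1, 0.2, np.pi / 2 + 0.1))

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
for k in range(3):
    print(k, result.atPose2(k))
```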

(3) To improve the performance of the image semantic segmentation algorithm and make it better suited to constructing 3D semantic maps of large-scale environments, an improved image segmentation algorithm is proposed, which achieves segmentation with fewer parameters and faster speed by replacing the backbone network; attention modules were also added to enhance the model. Spatio-temporal synchronization ensures the consistency of the environmental information collected by the sensors at keyframes, and semantic segmentation of single-frame point clouds is realized through the mapping between LiDAR points and camera pixels. Geometric consistency based on the surfel model is used to detect and eliminate dynamic obstacles, and image semantic segmentation is added to the LiDAR-inertial SLAM system to establish a three-dimensional semantic SLAM framework.
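
The point-to-pixel mapping mentioned here reduces to a pinhole projection. Below is a minimal numpy sketch of transferring 2D semantic labels onto a single LiDAR scan, assuming an already calibrated intrinsic matrix K and LiDAR-to-camera extrinsic transform T; both, like the function name, are placeholders for this illustration.

```python
# Sketch of transferring 2D semantic labels onto a LiDAR scan. K, T, and
# the label image are illustrative placeholders for calibrated values.
import numpy as np

def label_point_cloud(points, seg_labels, K, T):
    """points: (N,3) LiDAR points; seg_labels: (H,W) class ids per pixel;
    K: (3,3) camera intrinsics; T: (4,4) LiDAR-to-camera extrinsics."""
    H, W = seg_labels.shape
    # Homogeneous transform into the camera frame.
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    cam = (T @ pts_h.T).T[:, :3]
    in_front = cam[:, 2] > 0.1  # keep points in front of the camera
    # Pinhole projection to pixel coordinates.
    uv = (K @ cam[in_front].T).T
    uv = (uv[:, :2] / uv[:, 2:3]).astype(int)
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)
    labels = np.full(len(points), -1, dtype=int)  # -1 = not visible in image
    idx = np.flatnonzero(in_front)[valid]
    labels[idx] = seg_labels[uv[valid, 1], uv[valid, 0]]
    return labels
```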

The LiDAR-inertial SLAM system and the three-dimensional semantic SLAM system were verified experimentally in a campus environment and an urban road environment. The results show that the positioning error rate of the LI_Odom algorithm is 0.88% and 0.92% lower than that of the LOAM algorithm in the two environments, respectively, and 0.79% and 0.62% lower than that of the LeGo_LOAM algorithm; the positioning error rate of the LIS_SLAM algorithm is a further 0.31% and 0.42% lower than that of LI_Odom.
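
The error rates above are percentages of trajectory length. As a hedged illustration of how such a figure can be computed, the sketch below uses simple end-to-end drift over path length; the thesis may instead use KITTI-style averaging over sub-sequences, and the trajectories here are invented numbers.

```python
# Hedged sketch: drift as a percentage of path length, via end-to-end error.
import numpy as np

def drift_percent(est_xyz, gt_xyz):
    """est_xyz, gt_xyz: (N,3) aligned trajectories sampled at the same times."""
    path_len = np.sum(np.linalg.norm(np.diff(gt_xyz, axis=0), axis=1))
    end_error = np.linalg.norm(est_xyz[-1] - gt_xyz[-1])
    return 100.0 * end_error / path_len

# Illustrative numbers: a ~12 m endpoint error over a 1 km path is ~1.2 % drift.
gt = np.array([[0, 0, 0], [500, 0, 0], [1000, 0, 0]], float)
est = np.array([[0, 0, 0], [498, 3, 0], [992, 9, 0]], float)
print(f"{drift_percent(est, gt):.2f} %")
```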


CLC number:

 U69.72    

Open access date:

 2024-06-15    
