Thesis information

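Thesis title (Chinese):	复杂环境下的vSLAM闭环检测算法研究
Author:	Zhang Lina (张丽娜)
Student ID:	18207205057
Confidentiality level:	Public
Thesis language:	chi (Chinese)
Discipline code:	085208
Discipline name:	Engineering - Engineering - Electronics and Communication Engineering
Student type:	Master's
Degree level:	Master of Engineering
Degree year:	2021
Degree-granting institution:	Xi'an University of Science and Technology
School:	College of Communication and Information Engineering
Major:	Electronics and Communication Engineering
Research direction:	Visual SLAM
First supervisor:	Zhu Zhouhua (朱周华)
First supervisor's institution:	Xi'an University of Science and Technology
Thesis submission date:	2021-06-17
Thesis defense date:	2021-06-03
Thesis title (English):	Research on vSLAM Loop Closure Detection Algorithm in Complex Environment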
Keywords:	vSLAM (visual simultaneous localization and mapping) ; Loop closure detection ; Convolutional neural network ; Region of interest ; VLAD encoding
Abstract:	Visual simultaneous localization and mapping (vSLAM) is a core technology for the autonomous localization and navigation of mobile robots and is widely used in autonomous driving, smart homes, aviation, and other fields. Loop closure detection is an important module of a vSLAM system: by recognizing whether the robot has returned to a previously visited place, it reduces accumulated error and corrects the constructed map. When applied to complex environments, classical loop closure detection algorithms suffer from low accuracy, long running time, and poor robustness, which hinders their use in real scenes. This thesis proposes the following improvements to address these problems. First, because traditional algorithms mostly rely on hand-crafted features and are not robust for loop closure detection in complex scenes, features are extracted with a HybridNet network model pre-trained on the SPED dataset, which was collected in complex environments with changes in illumination, weather, viewpoint, and other factors. After comparing the features extracted by different network layers, the best-performing Conv5 layer is selected for image feature extraction, which improves the robustness of the algorithm. Second, because computing similarity directly on features extracted by a convolutional neural network loses part of the local spatial information of the image, an image description method is proposed that fuses an improved multi-scale attention learning mechanism with VLAD features: regions of interest are identified on the feature maps output by an intermediate layer, effective features are extracted from them, and these features are VLAD-encoded. This strengthens the features' ability to express deep image information and thereby improves accuracy. Third, because the high complexity of the CNN-based image feature space makes feature matching slow, PCA dimensionality reduction is introduced to remove redundant information and noise from the features, and cosine distance is used to measure the similarity between scene features and perform loop closure detection. Comparative experiments against other representative algorithms show that the proposed algorithm achieves the highest average accuracy on the Nordland dataset, 93.4%, and average accuracies of 85.2%, 88.3%, and 86.7% on the three Gardens Point sub-datasets, while its time performance improves by about 27.4%. The algorithm therefore meets the accuracy and real-time requirements that a vSLAM system places on loop closure detection, and the results indicate that it has theoretical novelty and practical value.
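The back half of the pipeline described in the abstract (VLAD encoding of local features, PCA reduction, cosine-based matching) can be sketched roughly as follows. This is a minimal NumPy illustration and not the thesis's implementation: the HybridNet Conv5 extraction and the attention-based region-of-interest step are not reproduced, the cluster centers are assumed to be given (for example from k-means), and the function names and the 0.8 threshold are hypothetical. It scores matches by cosine similarity, which is one minus the cosine distance.

```python
import numpy as np


def vlad_encode(descriptors, centers):
    """VLAD-encode local descriptors (N x D) against K cluster centers (K x D)."""
    # Assign every descriptor to its nearest center (squared Euclidean distance).
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    k, d = centers.shape
    vlad = np.zeros((k, d))
    for i in range(k):
        members = descriptors[nearest == i]
        if len(members):
            # Accumulate the residuals of the descriptors assigned to center i.
            vlad[i] = (members - centers[i]).sum(axis=0)
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad  # L2-normalize the final vector


def pca_fit(X, n_components):
    """Return the mean and the top principal directions of the rows of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(X, mean, components):
    """Project the rows of X onto the fitted principal directions."""
    return (X - mean) @ components.T


def cosine_similarity_matrix(A, B):
    """Pairwise cosine similarity between the rows of A and the rows of B."""
    A = A / np.maximum(np.linalg.norm(A, axis=1, keepdims=True), 1e-12)
    B = B / np.maximum(np.linalg.norm(B, axis=1, keepdims=True), 1e-12)
    return A @ B.T


def detect_loops(query, database, threshold=0.8):
    """Return (query index, best database index, similarity) above a threshold.

    The threshold is an illustrative assumption, not a value from the thesis.
    """
    sims = cosine_similarity_matrix(query, database)
    best = sims.argmax(axis=1)
    return [(q, int(b), float(sims[q, b]))
            for q, b in enumerate(best) if sims[q, b] >= threshold]
```

In a full system, each image would contribute one VLAD vector, the PCA would be fitted on the vectors of a reference set, and matches against very recent frames would normally be excluded before a loop closure is declared.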
Chinese Library Classification (CLC) number:	TP391.4
Open-access date:	2021-06-18