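Thesis title (Chinese): | 单目视觉惯导与激光雷达多模态鲁棒定位和一致建图SLAM算法研究 |
Name: | |
Student ID: | 16105301002 |
Confidentiality level: | Confidential (open after 1 year) |
Thesis language: | chi |
Discipline code: | 080202 |
Discipline: | Engineering - Mechanical Engineering - Mechatronic Engineering |
Student type: | Doctoral |
Degree level: | Doctor of Engineering |
Degree year: | 2022 |
Degree-granting institution: | Xi'an University of Science and Technology (西安科技大学) |
School/Department: | |
Major: | |
Research direction: | Robotics |
First supervisor name: | |
First supervisor affiliation: | |
Thesis submission date: | 2023-01-10 |
Thesis defense date: | 2022-12-06 |
Thesis title (English): | Research on the SLAM Algorithm of Multimodal Monocular Visual-Inertial-LiDAR Robust Localization and Consistent Mapping |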
Keywords (Chinese): | 移动机器人导航 ; 同步定位与建图 ; 多模态 ; 单目视觉 ; 惯性测量单元 ; 激光雷达 |
Keywords (English): | Mobile Robot Navigation ; Simultaneous Localization and Mapping ; Multimodal ; Monocular Vision ; Inertial Measurement Unit ; LiDAR |
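Abstract (translated from Chinese): |
Simultaneous Localization and Mapping (SLAM) is a core technology for the autonomous navigation of mobile robots. To address the insufficient localization robustness of current SLAM algorithms in sensor-degraded scenes and the difficulty of consistent mapping in large-scale scenes with height changes, this thesis builds mVIL (monocular Visual-Inertial-LiDAR), a multimodal perception system composed of three sensors: a monocular camera, a six-axis low-precision inertial measurement unit (MEMS-IMU), and a mechanical spinning 3D LiDAR. It studies the mVIL-SLAM algorithm in four parts: single-modality data association, online fully automatic spatio-temporal initialization, local mixed-coupling dual-sliding-window optimization, and global incremental smoothing optimization. The result is a complete solution to multimodal SLAM that fuses visual-inertial and LiDAR measurements, improving the navigation, localization, and mapping capability of mobile robots.
The thesis first studies single-modality data association for the three sensors. To supply the initial monocular visual poses needed for extrinsic estimation during multimodal initialization, a fast multi-view geometry measurement algorithm based on Structure From Motion (SFM) is constructed, combining a linear numerical solution with nonlinear Bundle Adjustment (BA) optimization. By solving for a fixed-size set of monocular camera poses and the 3D coordinates of the feature points they observe, it achieves fast initial estimation of the camera pose. The experiments show that the monocular camera suffers from scale drift, which provides the theoretical basis for introducing a scale variable into the multimodal initialization algorithm. Because the mathematical modeling of multimodal SLAM with inertial fusion is defined differently across fields, a single convention is fixed for the whole thesis; a discrete pre-integration measurement model is built under the simplified low-precision IMU model, and the pre-integration error propagation model is derived under the right-perturbation model on the Lie group. Because existing LiDAR data association algorithms do not use reflection intensity, a LiDAR feature extraction and matching method with a reflection intensity check is proposed; at a slight increase in computation, it effectively improves the accuracy of feature extraction and matching and benefits pose estimation.
Current multimodal SLAM initialization algorithms lack complete extrinsic calibration and time synchronization: the extrinsics must be recalibrated offline whenever the sensors are remounted, and initialization is difficult when the system has no hardware time synchronization. To address this, an online fully automatic spatio-temporal initialization algorithm, mVIL-INIT, is proposed. It requires no prior knowledge of the environment, no additional markers, and no special motion pattern, and it performs extrinsic self-calibration and soft time synchronization during mVIL system initialization. Because fusing three sensors at once yields many initialization states that are coupled with one another (the bundling effect), a two-stage method is proposed that performs visual-inertial (VI) initialization followed by LiDAR-inertial (LI) initialization. Rotation constraints are first used for a linear solution in the representation state space to obtain an initial value of the rotation extrinsic; rotation and translation constraints are then used for nonlinear optimization in the true state space, which resolves the bundling effect of the initialization states in a coarse-to-fine manner. The proposed algorithm also considers a more complete state space and constrains it during computation, so that the mVIL system initializes robustly and quickly in different scenes and with different mounting configurations. mVIL-INIT is evaluated in detail on public and self-collected datasets, which verifies its efficiency and robustness.
Traditional filter-based multimodal SLAM algorithms lack robustness under strong nonlinearity, while traditional optimization-based deep tight coupling is computationally expensive and hard to extend. To address this, mVIL-OAM is proposed, a mixed-coupling multimodal algorithm for robust local localization that combines loose and tight coupling. The front end, in a loosely coupled manner, uses interpolated Visual-Inertial Odometry (VIO) predictions to obtain LiDAR Odometry (LO), uses VIO to remove LiDAR point cloud distortion, and associates LiDAR depth with visual feature points on the unit sphere. The back end uses a tightly coupled dual-sliding-window optimization model that introduces LiDAR depth constraints, scan-to-scan constraints (for VIO state monitoring), and scan-to-map constraints (for local map construction) into the main visual-inertial sliding window as a companion sliding window, which enhances the robustness and accuracy of the front end. The front end and back end of mVIL-OAM are both more robust and more accurate than single-modal or multimodal SLAM systems of similar architecture; the algorithm combines the strengths of the three sensors and localizes robustly in visually degraded and LiDAR-degraded scenes.
Existing LiDAR mapping algorithms are prone to height drift and have difficulty detecting loop closures. To address this, mVIL-SAM is proposed on the basis of mVIL-OAM, a multimodal incremental smoothing algorithm for globally consistent mapping. It merges multi-frame mVIL-OAM results into keyframes and detects loop closures in three ways (trajectory, height, and Scan-Context), which achieves stable loop closure in scenes with height changes. mVIL-SAM updates the back-end poses by incremental smoothing on a factor graph containing an initial factor, a height prior factor, odometry (between) factors, and loop closure factors, where the height prior comes from the visual-inertial front end. The global map is maintained with ikdTree and Octotree data structures, chosen according to the characteristics of indoor and outdoor scenes respectively. With its three-level architecture of front end, middle end, and back end, mVIL-SAM achieves consistent global localization and mapping, and compared with existing open-source global multimodal SLAM algorithms it has an advantage in consistent mapping of scenes with height changes.
Finally, mobile robot platforms were built in-house to evaluate how the proposed localization and mapping algorithms perform in actual mobile robot navigation. An online localization experiment for flying robot navigation was carried out in a simulated coal mine laboratory; the results show that mVIL-OAM provides local centimeter-level online localization for navigation. An online mapping experiment for wheeled robot navigation was carried out in a campus environment; the results show that mVIL-SAM provides global decimeter-level consistent mapping. |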
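The reflection-intensity check described in the abstract can be illustrated with a small sketch. The abstract does not give the actual criterion, so everything here is an assumption made for illustration: the function name `match_with_intensity_check`, the brute-force nearest-neighbour search, the point layout `(x, y, z, intensity)`, and both thresholds are hypothetical. The idea shown is only that a geometric correspondence is kept when the reflectance of the two points also agrees.

```python
import math

def match_with_intensity_check(src, dst, max_dist=0.5, max_intensity_diff=0.1):
    """Match each source point to its nearest target point.

    Points are (x, y, z, intensity) tuples. A match is kept only when both
    the Euclidean distance and the reflectance difference are within the
    thresholds. Both thresholds are hypothetical, not values from the thesis.
    """
    matches = []
    for i, (sx, sy, sz, s_int) in enumerate(src):
        best_j, best_d = None, float("inf")
        # Brute-force nearest neighbour; a real system would use a k-d tree.
        for j, (dx, dy, dz, _) in enumerate(dst):
            d = math.dist((sx, sy, sz), (dx, dy, dz))
            if d < best_d:
                best_j, best_d = j, d
        if best_j is None or best_d > max_dist:
            continue  # no geometric neighbour close enough
        if abs(s_int - dst[best_j][3]) > max_intensity_diff:
            continue  # geometric neighbour found, but reflectance disagrees
        matches.append((i, best_j))
    return matches

# Three source points: one good match, one rejected by intensity, one too far.
src = [(0.0, 0.0, 0.0, 0.80), (1.0, 0.0, 0.0, 0.20), (5.0, 5.0, 0.0, 0.50)]
dst = [(0.05, 0.0, 0.0, 0.82), (1.02, 0.0, 0.0, 0.90)]
print(match_with_intensity_check(src, dst))
```

Loosening `max_intensity_diff` makes the second pair pass as well, which shows that the intensity test only prunes correspondences that geometry alone would have accepted.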
Abstract (English): |
Simultaneous Localization and Mapping (SLAM) is a core technology for the autonomous navigation of mobile robots. Current SLAM algorithms localize with insufficient robustness in sensor-degraded scenarios and struggle to keep maps consistent in large-scale scenarios with height changes. To address these problems, this dissertation constructs mVIL (monocular Visual-Inertial-LiDAR), a multimodal sensing system consisting of three sensors: a monocular camera, a six-axis low-precision MEMS-IMU, and a mechanical rotating 3D LiDAR. It studies the single-modal data association mechanism, online fully automatic spatio-temporal initialization, local mixed-coupling dual sliding window optimization, and global incremental smoothing optimization. Together these provide a complete solution to the multimodal SLAM problem of fusing visual-inertial and LiDAR measurements, and improve the navigation ability of mobile robots.
The three single-modal data association models used in this dissertation are studied and constructed. First, to meet the need for initial monocular visual poses when estimating extrinsic parameters during multimodal initialization, a multi-view geometry measurement algorithm based on Structure From Motion (SFM) is constructed that combines a linear numerical computation with nonlinear Bundle Adjustment (BA) optimization. The algorithm achieves fast initial estimation of the camera pose by solving for a fixed-size set of monocular camera poses and their observed features. The experimental results show that the monocular camera suffers from scale drift, which provides a theoretical basis for introducing a scale variable into the multimodal initialization algorithm. Second, because the mathematical modeling of multimodal SLAM with fused IMU measurements is defined differently in different domains, a single convention is adopted throughout the dissertation. A discrete pre-integration measurement model is constructed under the simplified low-precision IMU model, and a pre-integration error propagation model is derived under the right perturbation model on the Lie group. Finally, because existing LiDAR data association algorithms do not use reflection intensity information, a LiDAR feature extraction and matching method with a reflection intensity check is proposed. It effectively improves the accuracy of feature extraction and matching at a small increase in computational cost, and it benefits LiDAR state estimation.
Existing multimodal SLAM initialization algorithms do not yet provide complete extrinsic calibration and time synchronization: the extrinsic parameters must be recalibrated offline when the sensors are redeployed, and initialization is difficult when the system has no hardware time synchronization. To resolve these difficulties, an online initialization algorithm, mVIL-INIT, is proposed. The algorithm requires no prior knowledge of the environment, no additional markers, and no special motion patterns, and it is the first online initialization algorithm for mVIL systems with extrinsic calibration and soft time synchronization. Because fusing three sensors yields a large number of initialization states that exhibit a bundling effect, initialization is completed in two stages: Visual-Inertial (VI) initialization and LiDAR-Inertial (LI) initialization. In each stage, the rotation constraint is first used for a linear solution in the representation state space to obtain an initial value of the rotation extrinsic; the rotation and translation constraints are then used for nonlinear optimization in the real state space, which effectively resolves the bundling effect of the initialization states in a coarse-to-fine way. The proposed initialization algorithm also considers a fuller set of state quantities and constrains the computation, so that the mVIL system initializes robustly and quickly in different scenarios and with different installation positions. The effectiveness, efficiency, and robustness of mVIL-INIT are thoroughly assessed on both public and self-collected datasets.
Filter-based multimodal SLAM algorithms lack robustness under strongly nonlinear conditions, while optimization-based deep tight coupling is computationally intensive and scales poorly. In response, a locally oriented mixed-coupling multimodal SLAM algorithm that combines loose and tight coupling, mVIL-OAM, is proposed. The front end of the algorithm uses interpolated Visual-Inertial Odometry (VIO) predictions to obtain LiDAR Odometry (LO) in a loosely coupled manner and uses VIO to remove LiDAR point cloud distortion. The undistorted point cloud is then associated with visual feature points on the unit sphere to provide their depth. At the back end, a tightly coupled dual sliding window optimization model is proposed, which introduces a LiDAR depth constraint, a scan-to-scan constraint (for VIO status monitoring), and a scan-to-map constraint (for local mapping) into the main visual-inertial sliding window in the form of a companion sliding window, enhancing the robustness and accuracy of the front end. The front end and back end of mVIL-OAM are both more robust and more accurate than single-modal or multimodal SLAM systems with similar architectures. The system combines the advantages of the three sensors and can localize and map in visually degraded and LiDAR-degraded scenes.
Existing LiDAR mapping algorithms suffer from altitude drift and have difficulty detecting loop closures. In response, a global multimodal SLAM algorithm, mVIL-SAM, is proposed on the basis of mVIL-OAM, with multi-layer, globally consistent, real-time mapping and localization capability. The algorithm merges multi-frame mVIL-OAM results into keyframes and performs loop-closure detection in three ways (trajectory, altitude, and environment descriptor), which achieves stable loop closure in scenes with altitude changes. mVIL-SAM updates the back-end poses by incremental smoothing of a pose-only factor graph that contains an initial factor, an altitude prior factor, between factors, and loop factors, where the altitude prior comes from the front-end VIO. The global map is constructed using ikdTree and Octotree data structures for indoor and outdoor scenes, respectively. mVIL-SAM has a three-level architecture of front end, middle end, and back end that achieves consistent global localization and mapping, and it has advantages over current open-source global multimodal SLAM algorithms in scenes with height changes.
Finally, mobile robot platforms were built in-house to evaluate the practical performance of the proposed SLAM algorithms for online mobile robot navigation. Online localization experiments for flying robot navigation were conducted in a coal mine simulation laboratory, and the results show that mVIL-OAM provides local centimeter-level online localization for navigation. Online mapping experiments for wheeled robot navigation were conducted in a campus environment, and the results show that mVIL-SAM provides global decimeter-level consistent mapping. |
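The abstract states that the mVIL-OAM front end interpolates VIO poses to predict LiDAR odometry and to remove point cloud distortion. The interpolation scheme is not specified there, so the following is only a minimal sketch under an assumed scheme: linear interpolation of position and spherical linear interpolation (slerp) of orientation between the two VIO poses that bracket a LiDAR timestamp. The names `slerp` and `interpolate_pose` and the pose layout `(timestamp, position, quaternion)` are hypothetical.

```python
import math

def slerp(q0, q1, u):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:  # flip one quaternion so the shorter arc is followed
        q1, dot = tuple(-c for c in q1), -dot
    if dot > 0.9995:  # nearly identical rotations: use normalised lerp
        q = tuple(a + u * (b - a) for a, b in zip(q0, q1))
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - u) * theta) / math.sin(theta)
        s1 = math.sin(u * theta) / math.sin(theta)
        q = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)

def interpolate_pose(pose0, pose1, t):
    """Interpolate a pose at time t, with t between the two pose timestamps.

    Each pose is (timestamp, position xyz, quaternion wxyz).
    """
    t0, p0, q0 = pose0
    t1, p1, q1 = pose1
    u = (t - t0) / (t1 - t0)
    p = tuple(a + u * (b - a) for a, b in zip(p0, p1))
    return t, p, slerp(q0, q1, u)

# Two VIO poses 0.1 s apart: 1 m forward motion and a 90 degree yaw.
half = math.sqrt(0.5)
vio_a = (0.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0))
vio_b = (0.1, (1.0, 0.0, 0.0), (half, 0.0, 0.0, half))

# Predicted pose at a LiDAR timestamp halfway between the two VIO poses.
t, p, q = interpolate_pose(vio_a, vio_b, 0.05)
yaw_deg = math.degrees(2.0 * math.atan2(q[3], q[0]))
print(p, round(yaw_deg, 3))
```

Applying the same interpolation at each point's own capture time, and transforming the point into a common frame, is one common way to deskew a spinning-LiDAR scan; whether the thesis does exactly this is not stated in the abstract.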
Chinese Library Classification (CLC) number: | TP242.6 |
Open-access date: | 2024-03-21 |