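Thesis title (Chinese): | 基于VSLAM系统的室内语义建图算法研究 |
Name: | |
Student ID: | 22207223055 |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 085400 |
Discipline name: | Engineering - Electronic Information |
Student type: | Master's |
Degree level: | Master of Engineering |
Degree year: | 2025 |
Degree-granting institution: | Xi'an University of Science and Technology |
School/Department: | |
Major: | |
Research direction: | Visual SLAM |
First supervisor name: | |
First supervisor affiliation: | |
Thesis submission date: | 2025-06-14 |
Thesis defense date: | 2025-06-06 |
Thesis title (foreign language): | Research on Indoor Semantic Mapping Algorithm Based on VSLAM System |
Keywords (Chinese): | |
Keywords (foreign language): | Visual SLAM ; Visual Odometry ; Dense Mapping ; Mobile Robots ; Point Cloud Semantic Segmentation |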
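Thesis abstract (Chinese): |
With the development of artificial intelligence, mobile robot research is shifting from function-oriented design toward intelligent interaction. This trend places higher demands on mobile robots' environmental cognition and calls for a more intelligent framework for environmental perception and understanding, which makes research on Simultaneous Localization and Mapping (SLAM) in unknown environments especially important. At present, mobile robots' insufficient semantic-level understanding of the environment severely restricts their use in unknown environments. This thesis therefore takes improving the accuracy and semantic level of environmental information as its core goal, addresses the shortcomings of existing mapping methods and semantic perception techniques, and aims to build semantic maps that are richer in information and more precise in representation. The work focuses on three aspects: (1) When ORB-SLAM3 builds a dense point cloud map, the point cloud contains a large amount of redundant information, and the feature integrity of the point cloud is hard to preserve during denoising; both problems significantly reduce mapping accuracy. An optimized MAGSAC++ algorithm is therefore used in the ORB-SLAM3 front end to reject mismatches and improve pose estimation accuracy, and a segmented point cloud filtering algorithm based on boundary feature extraction is then used to optimize the dense map. This map optimization method first applies boundary feature extraction to preserve point cloud contours, combines it with a fast guided filtering algorithm to remove distant noise and sparse outliers, and introduces a multi-scale normal differential denoising algorithm that achieves adaptive segmentation by analyzing normal vector differences at different scales, so that key geometric features are retained while noise is removed effectively. Taking the fr1_room sequence of the TUM dataset as an example, experiments show that, compared with the original ORB-SLAM3 dense mapping algorithm, the proposed algorithm reduces the number of points by 33% and memory usage by 24.9%, preserving the geometric features of the point cloud while markedly reducing redundancy and achieving high-accuracy, low-memory map construction. (2) Current semantic SLAM relies on segmenting 2D images and mapping the result into 3D space, which leaves the semantic information of objects incomplete. To address this, the thesis uses the kernel point convolutional neural network KPConvX to segment the 3D point cloud directly; this network, however, suffers from over-segmentation. A spatial constraint regularizer is therefore introduced to identify coarse object regions and constrain local spatial consistency, together with a multi-scale dual-attention mechanism that propagates point cloud information and strengthens the extraction of key features. Experiments on the ScanNetV2 and S3DIS datasets show that the improved network raises mIoU on both benchmarks and effectively alleviates over-segmentation, verifying the effectiveness and robustness of the proposed algorithm. (3) Finally, semantic mapping experiments are carried out in indoor scenes, fusing multi-level semantic information into the visual SLAM system. A complete semantic mapping framework is built on a feature-association-based semantic fusion mechanism and a dynamic update strategy. The semantic system is validated on a public dataset (TUM) and in a real environment; the experiments show that the thesis achieves the construction of 3D point cloud maps with rich semantic annotations, providing reliable map support for navigation and semantic perception in indoor scenes. |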
Thesis abstract (foreign language): |
With the rapid advancement of artificial intelligence, research in mobile robotics is progressing toward more natural interaction and higher levels of intelligence. This places greater demands on mobile robots' environmental perception capabilities and calls for a more intelligent framework for environmental sensing and comprehension. Consequently, research on Simultaneous Localization and Mapping (SLAM) in unknown environments has become critically important. The current lack of semantic-level environmental understanding in mobile robots severely limits their applications in unknown environments. Therefore, this study focuses on enhancing the accuracy and semantic depth of environmental information, addresses the shortcomings of existing mapping methods and semantic perception technologies, and aims to construct more informative and more precise semantic maps. The study is conducted along the following three key dimensions: (1) During the construction of dense point cloud maps, the ORB-SLAM3 algorithm suffers from excessive redundant data in the point cloud and from the difficulty of preserving feature integrity during denoising, both of which significantly reduce mapping accuracy. In the front end of ORB-SLAM3, an optimized MAGSAC++ algorithm is adopted to eliminate mismatches and improve pose estimation accuracy. A segmented point cloud filtering algorithm based on boundary feature extraction is then employed to optimize the dense map. This method first uses boundary feature extraction to maintain point cloud contours and combines it with a fast guided filtering algorithm to remove distant noise and sparse outliers. In addition, a multi-scale normal differential denoising algorithm is introduced, which achieves adaptive segmentation by analyzing normal vector differences at different scales, thereby preserving key geometric features while effectively eliminating noise. Using the fr1_room sequence from the TUM dataset as an example, experimental results show that, compared with ORB-SLAM3's original dense mapping algorithm, the proposed method reduces the number of points by 33% and memory usage by 24.9%. It significantly reduces redundancy while maintaining geometric features, enabling high-precision, low-memory map construction. (2) Current semantic SLAM systems rely on projecting 2D image-based semantic segmentation into 3D space, which leaves the semantic information of objects incomplete. To address this, the thesis employs the kernel point convolutional neural network KPConvX to perform semantic segmentation directly on 3D point clouds. However, this network suffers from over-segmentation. To mitigate this, a spatial constraint regularizer is introduced to identify coarse object regions and enforce local spatial consistency, while a multi-scale dual-attention mechanism is incorporated to propagate point cloud information and enhance the extraction of critical features. Experiments on the ScanNetV2 and S3DIS datasets demonstrate that the improved network achieves higher mIoU on both benchmarks and effectively alleviates over-segmentation, validating the efficacy and robustness of the proposed algorithm. (3) Finally, semantic mapping experiments are conducted in indoor environments by integrating multi-level semantic information into the visual SLAM system. The thesis constructs a complete semantic mapping framework based on a feature-association-based semantic fusion mechanism and a dynamic update strategy. The effectiveness of the semantic system is validated using both a public dataset (TUM) and a real indoor environment. Experimental results demonstrate that the proposed method successfully constructs 3D point cloud maps with rich semantic annotations, providing reliable map support for indoor navigation and semantic perception tasks. |
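The abstract reports mIoU gains on ScanNetV2 and S3DIS without restating the metric. As a minimal sketch, assuming one integer class label per point and the common convention of skipping classes that appear in neither the labels nor the predictions, mIoU could be computed as below. The function name `mean_iou` and the toy labels are illustrative and are not taken from the thesis.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union across classes.

    y_true, y_pred: equal-length sequences of integer class ids in
    [0, num_classes). Classes absent from both sequences are skipped
    (an assumption; benchmarks may define this differently).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    intersection = [0] * num_classes  # points where label == prediction == c
    true_count = [0] * num_classes    # points labelled c
    pred_count = [0] * num_classes    # points predicted as c

    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 2, 2]       # hypothetical per-point ground truth
    predictions = [0, 1, 1, 1, 2, 0]  # hypothetical per-point network output
    print(mean_iou(labels, predictions, num_classes=3))
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. Because each class contributes equally regardless of how many points it has, small objects weigh as much as walls or floors, which is why the metric is sensitive to the over-segmentation the abstract describes.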
Chinese Library Classification number: | TP391.4 |
Release date: | 2025-06-16 |