Thesis Information

Chinese title:

基于注意力机制的三维激光点云分类方法研究 (Research on 3D Laser Point Cloud Classification Methods Based on the Attention Mechanism)

Name:

陈领 (Chen Ling)

Student ID:

 19210210084    

Confidentiality level:

Public

Language:

chi (Chinese)

Discipline code:

 085215    

Discipline name:

Engineering - Engineering - Surveying and Mapping Engineering

Student type:

Master's

Degree level:

Master of Engineering

Degree year:

 2022    

Degree-granting institution:

西安科技大学 (Xi'an University of Science and Technology)

School:

测绘科学与技术学院 (College of Surveying and Mapping Science and Technology)

Major:

Surveying and Mapping Engineering

Research direction:

3D laser point cloud data processing

First supervisor:

黄远程 (Huang Yuancheng)

First supervisor's institution:

西安科技大学 (Xi'an University of Science and Technology)

Submission date:

 2022-06-26    

Defense date:

 2022-06-09    

English title:

 Research on laser point cloud classification for attention mechanism    

Chinese keywords:

点云分类 (point cloud classification); 多尺度特征 (multi-scale feature); 交叉注意力 (cross attention); 局部空间注意力 (local spatial attention); 深度学习 (deep learning)

English keywords:

Point cloud classification; Multi-scale feature; Cross attention; Local spatial position attention; Deep learning

Chinese abstract:

With the development and popularization of lidar sensors, the cost of acquiring 3D lidar point cloud data has fallen sharply, greatly promoting academic research on, and industrial applications of, lidar point cloud data. Point cloud classification, one of the important applications of lidar point cloud data, is widely used in power line inspection, 3D reconstruction, forestry monitoring, and other fields. 3D laser point cloud data are scattered in distribution, with no strict adjacency relationship between point pairs; how to organize the points and form spatial relationships is the foundation of point cloud classification. 3D laser point clouds are a direct digital representation of the real 3D world, in which targets are diverse, scales differ greatly between different targets, and targets of the same class vary in shape. In addition, compared with remote sensing satellite images, point density is a key property of point clouds, but it is often unevenly distributed, which seriously degrades classification accuracy.

Addressing these characteristics of 3D laser point cloud data, this thesis carries out in-depth research on two aspects, point cloud local feature extraction and network structure, and proposes the following two solutions to improve point cloud classification accuracy:

(1) A local spatial position attention and multi-scale feature network for ALS point cloud classification (AMMSF-Net) is proposed. The uneven spatial distribution and the varying scales of ground objects in airborne laser point cloud data pose great challenges for fine-grained point cloud classification. AMMSF-Net builds a local spatial position attention layer to learn local neighborhood context features; an attention skip-connection mechanism dynamically fuses the corresponding encoder and decoder features and effectively preserves detail information; and the multi-scale feature fusion in the decoder concatenates features of different scales and feeds them into a multilayer perceptron and a conditional Markov layer to obtain the final semantic probability map, which correlates feature maps across different scales and levels and strengthens the representation of targets of different sizes. Experimental results on Vaihingen3D and DFC3D show that, compared with other methods, AMMSF-Net effectively improves the ability to distinguish ground-object categories in point clouds.
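This record does not publish the implementation of the local spatial position attention layer; the following NumPy sketch shows one plausible form, in which relative neighbor positions are projected into feature space and per-channel attention weights are normalized over the neighborhood. All function names, parameter names, and the exact formulation are illustrative assumptions, not the thesis's definition.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def local_spatial_position_attention(neigh_feat, rel_pos, w_pos, w_att):
    """One plausible local spatial position attention layer.

    neigh_feat: (N, k, C) features of the k neighbors of each point
    rel_pos:    (N, k, 3) neighbor coordinates relative to the center point
    w_pos:      (3, C) projects relative positions into feature space
    w_att:      (C, C) produces per-channel attention logits
    Returns (N, C): attention-aggregated local features.
    """
    pos_enc = rel_pos @ w_pos                 # (N, k, C) positional encoding
    scores = (neigh_feat + pos_enc) @ w_att   # (N, k, C) attention logits
    att = softmax(scores, axis=1)             # normalize over the k neighbors
    return np.sum(att * (neigh_feat + pos_enc), axis=1)
```

When all neighbors carry the same feature and zero relative offset, the attention weights are uniform and the layer reduces to that shared feature, which is a quick sanity check on the normalization.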

(2) A point cloud classification network with cross-attention feature enhancement and pyramid-decoding feature adaptive fusion (CAPDAF-Net) is proposed, addressing the problem that taking only local neighborhood geometric information as the classification feature cannot effectively improve point cloud classification accuracy. CAPDAF-Net enhances local neighborhood features through cross attention to mine the context information of local neighborhoods, and completes the multi-scale feature fusion of the decoder's pyramid structure through adaptive fusion. Experimental results on the Toronto-3D and CSPC datasets show that CAPDAF-Net helps improve the accuracy of point cloud classification.
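The adaptive fusion formula is not given in this record; a minimal sketch under my own assumptions (one learnable scalar gate per decoder scale, softmax-normalized, with all scales already upsampled to a common per-point resolution) could look like:

```python
import numpy as np

def adaptive_scale_fusion(scale_feats, gate_logits):
    """Adaptively fuse per-point features from several decoder scales.

    scale_feats: list of S arrays, each (N, C), assumed already upsampled
                 to the same point resolution
    gate_logits: (S,) learnable scalars, one per scale
    Returns (N, C): weighted sum with softmax-normalized scale weights.
    """
    g = np.asarray(gate_logits, dtype=float)
    w = np.exp(g - g.max())
    w /= w.sum()                               # (S,) weights sum to 1
    stacked = np.stack(scale_feats, axis=0)    # (S, N, C)
    return np.tensordot(w, stacked, axes=1)    # (N, C)
```

With equal gate logits this reduces to a plain average of the scales; training the gates lets the network emphasize whichever scale is most informative.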

English abstract:

With the development and popularization of lidar sensors, the cost of acquiring 3D lidar point cloud data has been greatly reduced, which has strongly promoted academic research on and industrial applications of lidar point cloud data. As one of the important applications of lidar point cloud data, point cloud classification is widely used in power line inspection, 3D reconstruction, forestry detection, and so on. However, the distribution of 3D laser point cloud data is scattered, and there is no strict adjacency relationship between point pairs; how to organize point clouds and form their spatial relationships is foundational for point cloud classification. In addition, compared with remote sensing images, the density of a point cloud is often unevenly distributed, which seriously affects the accuracy of point cloud classification.
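The abstract notes that organizing scattered points into neighborhoods is the foundation of classification. As an illustration only (the function name and the brute-force strategy are my own, not taken from the thesis), a k-nearest-neighbor grouping over raw xyz coordinates can be sketched as:

```python
import numpy as np

def knn_group(points: np.ndarray, k: int) -> np.ndarray:
    """Return, for each point, the indices of its k nearest neighbors.

    points: (N, 3) array of xyz coordinates.
    Brute-force O(N^2) search; a real pipeline would use a KD-tree.
    """
    diff = points[:, None, :] - points[None, :, :]   # (N, N, 3) pairwise offsets
    d2 = np.sum(diff ** 2, axis=-1)                  # squared distances
    return np.argsort(d2, axis=1)[:, :k]             # k smallest per row (self included)
```

Each row of the result defines a local neighborhood over which per-point features can then be aggregated.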

Considering the characteristics of 3D laser point cloud data, this thesis conducts research on point cloud local feature extraction and network structure, and proposes the following two solutions to improve the accuracy of point cloud classification:

(1) A local spatial position attention and multi-scale feature network for ALS point cloud classification (AMMSF-Net) is proposed. The uneven spatial distribution and the scale variations between different categories make fine-grained classification of point cloud data challenging. In AMMSF-Net, a local spatial position attention layer learns local contextual features, and an attention skip connection dynamically fuses the corresponding features of the encoder and decoder, retaining detail features and contextual information. The multi-scale feature fusion module in the decoder obtains the final semantic probability map by concatenating the features at different scales and feeding them into an MLP (multilayer perceptron) and a CML (conditional Markov layer), which correlates the feature maps between different scales and levels and enhances the representation of targets at different scales. Experimental results on Vaihingen3D and DFC3D show that AMMSF-Net distinguishes ground objects in point clouds more effectively than other methods.
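As a hedged sketch of the decoder head described above (concatenating multi-scale per-point features and passing them through an MLP to produce class probabilities), the following omits the CML smoothing step, and all shapes and names are my own assumptions rather than the thesis's configuration:

```python
import numpy as np

def multiscale_head(feats, w1, b1, w2, b2):
    """Concatenate multi-scale per-point features, then a 2-layer MLP.

    feats: list of (N, Ci) per-point feature maps from different decoder scales
    w1: (sum(Ci), H), b1: (H,)   hidden layer
    w2: (H, num_classes), b2: (num_classes,)
    Returns (N, num_classes) class probabilities (softmax over classes).
    """
    x = np.concatenate(feats, axis=1)            # channel-wise concatenation
    h = np.maximum(x @ w1 + b1, 0.0)             # ReLU hidden layer
    logits = h @ w2 + b2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)      # semantic probability map
```

In the full network a conditional-Markov-style layer would further smooth these per-point probabilities using neighborhood structure.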

(2) A cross-attention and pyramid-decoding feature adaptive fusion network for laser point cloud classification (CAPDAF-Net) is proposed, addressing the problem that taking only local neighborhood geometric information as the classification feature cannot effectively improve point cloud classification accuracy. CAPDAF-Net enhances local neighborhood features through cross attention, mines the context information of local neighborhoods, and adaptively fuses the multi-scale features of the decoder's pyramid structure. Results on Toronto-3D and CSPC show that CAPDAF-Net helps improve the accuracy of point cloud classification.
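The record does not specify how the cross attention is formed; one common formulation, sketched here under the assumption of scaled dot-product attention between each point's feature (query) and its neighborhood features (keys/values), is:

```python
import numpy as np

def cross_attention(query, keys, values, scale=None):
    """Enhance each point's feature by attending over its local neighborhood.

    query:  (N, C)    one query feature per point
    keys:   (N, k, C) neighborhood key features
    values: (N, k, C) neighborhood value features
    Returns (N, C): attention-weighted aggregation of the values.
    """
    C = query.shape[-1]
    scale = scale or 1.0 / np.sqrt(C)
    # scaled dot-product scores between each point and its k neighbors, (N, k)
    scores = np.einsum('nc,nkc->nk', query, keys) * scale
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    att = np.exp(scores)
    att /= att.sum(axis=1, keepdims=True)         # softmax over neighbors
    return np.einsum('nk,nkc->nc', att, values)
```

If all keys in a neighborhood are identical, the attention is uniform and the output is simply the mean of the values, which makes the softmax normalization easy to verify.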

References:

[1]杨必胜, 董震. 点云智能研究进展与趋势 [J]. 测绘学报, 2019, 48(12): 1575-1585.

[2]张继贤, 林祥国, 梁欣廉. 点云信息提取研究进展和展望 [J]. 测绘学报, 2017, 46(10): 1460-1469.

[3]景庄伟, 管海燕, 臧玉府, 等. 基于深度学习的点云语义分割研究综述 [J]. 计算机科学与探索, 2021, 15(01): 1-26.

[4]Wu H, Zhang X, Shi W, et al. An Accurate And Robust Region-Growing Algorithm For Plane Segmentation of TLS Point Clouds Using a Multiscale Tensor Voting Method[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2019, 12(10): 4160-4168.

[5]Luo N, Jiang Y, Wang Q. Supervoxel-Based Region Growing Segmentation for Point Cloud Data[J]. International Journal of Pattern Recognition and Artificial Intelligence, 2021, 35(03): 2154007.

[6]Xu Z, Zhang Z, Zhong R, et al. Content-Sensitive Multilevel Point Cluster Construction for ALS Point Cloud Classification[J]. Remote Sensing, 2019, 11(3): 342.

[7]Zhang Z, Zhang L, Tong X, et al. A Multilevel Point-Cluster-Based Discriminative Feature for ALS Point Cloud Classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(6): 3309-3321.

[8]Xu B, Jiang W, Shan J, et al. Investigation on the Weighted RANSAC Approaches for Building Roof Plane Segmentation from LiDAR Point Clouds [J]. Remote Sensing, 2016, 8(1):5.

[9]Li L, Yang F, Zhu H, et al. An Improved RANSAC for 3D Point Cloud Plane Segmentation Based on Normal Distribution Transformation Cells [J]. Remote Sensing, 2017, 9(5).

[10]康志忠, 王薇薇, 李珍. 多源数据融合的三维点云特征面分割和拟合一体化方法 [J]. 武汉大学学报(信息科学版), 2013, 038(011): 1317-1321.

[11]何明, 李勇, 方秀琴, 等. 一种随机抽样一致性算法的建筑物平面点云提取方法 [J]. 遥感信息, 2018, 33(03):104-107.

[12]丁鸽, 燕立爽, 彭健, 等. 基于RANSAC算法的隧道点云横断面提取 [J]. 测绘通报, 2021, (09):120-123.

[13]Yang L, Li Y, Li X, et al. Efficient Plane Extraction Using Normal Estimation and RANSAC from 3D Point Cloud[J]. Computer Standards & Interfaces, 2022, 82: 103608.

[14]Su H, Maji S, Kalogerakis E, et al. Multi-view Convolutional Neural Networks for 3d Shape Recognition[C]//Proceedings of the IEEE International Conference on Computer Vision. 2015: 945-953.

[15]Feng Y, Zhang Z, Zhao X, et al. Gvcnn: Group-view Convolutional Neural Networks For 3d Shape Recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 264-272.

[16]Boulch A, Guerry J, Le Saux B, et al. SnapNet: 3D Point Cloud Semantic Labeling with 2D Deep Segmentation Networks[J]. Computers & Graphics, 2018, 71: 189-198.

[17]Guerry J, Boulch A, Le Saux B, et al. Snapnet-r: Consistent 3D Multi-View Semantic Labeling for Robotics[C]//Proceedings of the IEEE International Conference on Computer Vision Workshops. 2017: 669-678.

[18]Maturana D, Scherer S. Voxnet: A 3d Convolutional Neural Network for Real-Time Object Recognition[C]//2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2015: 922-928.

[19]Wu Z, Song S, Khosla A, et al. 3D Shapenets: A Deep Representation For Volumetric Shapes[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1912-1920.

[20]Riegler G, Osman Ulusoy A, Geiger A. Octnet: Learning Deep 3d Representations at High Resolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 3577-3586.

[21]Klokov R, Lempitsky V. Escape from cells: Deep Kd-Networks for the Recognition of 3d Point Cloud Models[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 863-872.

[22]Qi C R, Su H, Mo K, et al. Pointnet: Deep Learning on Point Sets for 3d Classification and Segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 652-660.

[23]Qi C R, Yi L, Su H, et al. Pointnet++: Deep Hierarchical Feature Learning on Point Sets in A Metric Space[J]. Advances in Neural Information Processing Systems, 2017:5099-5108.

[24]Zhao H, Jiang L, Fu C W, et al. Pointweb: Enhancing Local Neighborhood Features for Point Cloud Processing[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 5565-5573.

[25]Jiang M, Wu Y, Zhao T, et al. PointSIFT: A Sift-Like Network Module for 3d Point Cloud Semantic Segmentation[C]//IGARSS 2019-2019 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2019: 5065-5068.

[26]Wen C, Yang L, Li X, et al. Directionally Constrained Fully Convolutional Neural Network For Airborne Lidar Point Cloud Classification [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 162: 50-62.

[27]Li X, Wang L, Wang M, et al. DANCE-NET: Density-aware convolution networks with context encoding for airborne LiDAR point cloud classification[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 166: 128-139.

[28]Hu Q, Yang B, Xie L, et al. Randla-net: Efficient semantic segmentation of large-scale point clouds[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 11108-11117.

[29]Wang L, Huang Y, Hou Y, et al. Graph attention convolution for point cloud semantic segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 10296-10305.

[30]Chen C, Fragonara L Z, Tsourdos A. GAPNet: Graph Attention based Point Neural Network for Exploiting Local Feature of Point Cloud [J].arXiv preprint arXiv:1905.08705, 2019.

[31]Yang J, Zhang Q, Ni B, et al. Modeling Point Clouds With Self-Attention and Gumbel Subset Sampling [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 3323-3332.

[32]Xie S, Liu S, Chen Z, et al. Attentional ShapeContextNet for Point Cloud Recognition[C] //Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4606-4615.

[33]Chen L, Chen W, Xu Z, et al. DAPnet: A Double Self-Attention Convolutional Network for Point Cloud Semantic Labeling[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 9680-9691.

[34]Guo M-H, Cai J-X, Liu Z-N, et al. PCT: Point Cloud Transformer [J]. Computational Visual Media, 2021, 7(2): 187-199.

[35]Lecun Y, Bottou L. Gradient-based Learning Applied to Document Recognition [J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.

[36]Krizhevsky A, Sutskever I, Hinton G E. ImageNet Classification with Deep Convolutional Neural Networks [J]. Commun ACM, 2017, 60(6): 84–90.

[37]Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J]. arXiv preprint arXiv:1409.1556, 2014.

[38]Szegedy C, Liu W, Jia Y, et al. Going Deeper with Convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1-9.

[39]He K, Zhang X, Ren S, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on Imagenet Classification[C]//Proceedings of the IEEE International Conference on Computer Vision. 2015: 1026-1034.

[40]Long J, Shelhamer E, Darrell T. Fully Convolutional Networks for Semantic Segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.

[41]Badrinarayanan V, Kendall A, Cipolla R. Segnet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495.

[42]Ronneberger O, Fischer P, Brox T. U-net: Convolutional Networks for Biomedical Image Segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2015: 234-241.

[43]Zhou Z, Siddiquee M, Tajbakhsh N, et al. UNet++: Redesigning Skip Connections to Exploit Multiscale Features in Image Segmentation [J]. IEEE Transactions on Medical Imaging, 2020, 39(6): 1856-1867.

[44]Zhao H, Shi J, Qi X, et al. Pyramid Scene Parsing Network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 2881-2890.

[45]Chen L C, Papandreou G, Schroff F, et al. Rethinking Atrous Convolution for Semantic Image Segmentation[J]. arXiv preprint arXiv:1706.05587, 2017.

[46]Wang J, Sun K, Cheng T, et al. Deep High-Resolution Representation Learning for Visual Recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 43(10): 3349-3364.

[47]Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, California, USA: Curran Associates Inc., 2017: 6000-6010.

[48]Chen L Z, Li X Y, Fan D P, et al. LSANet: Feature Learning On Point Sets by Local Spatial Aware Layer [C]//VISIGRAPP (4: VISAPP). 2022: 168-179.

[49]李道纪, 郭海涛, 卢俊, 等. 遥感影像地物分类多注意力融和U型网络法 [J]. 测绘学报, 2020, 49(08): 1051-1064.

[50]Wang L, Huang Y, Shan J, et al. MSNet: Multi-Scale Convolutional Network for Point Cloud Classification[J]. Remote Sensing, 2018, 10(4): 612.

[51]Niemeyer J, Rottensteiner F, Soergel U. Contextual Classification of Lidar Data And Building Object Detection in Urban Areas[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2014, 87: 152-165.

[52]Bosch M, Foster K, Christie G, et al. Semantic Stereo for Incidental Satellite Images[C]//2019 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2019: 1524-1532.

[53]Wen C, Li X, Yao X, et al. Airborne LiDAR point cloud classification with global-local graph attention convolution neural network [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2021, 173: 181-194.

[54]Yousefhussien M, Kelbe D J, Ientilucci E J, et al. A Multi-Scale Fully Convolutional Network for Semantic Labeling of 3D Point Clouds[J]. ISPRS Journal Of Photogrammetry and Remote Sensing, 2018, 143: 191-204.

[55]Yang Z, Tan B, Pei H, et al. Segmentation and Multi-Scale Convolutional Neural Network-Based Classification of Airborne Laser Scanner Data[J]. Sensors, 2018, 18(10): 3347.

[56]Zhao R B, Pang M Y, Wang J D. Classifying Airborne Lidar Point Clouds via Deep Features Learned by a Multi-Scale Convolutional Neural Network [J]. International Journal of Geographical Information Science, 2018, 32(5): 960-979.

[57]Li Y, Bu R, Sun M, et al. PointCNN: Convolution on X-Transformed Points[J]. Advances in Neural Information Processing Systems, 2018, 31.

[58]Thomas H, Qi C R, Deschaud J E, et al. Kpconv: Flexible and Deformable Convolution for Point Clouds[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 6411-6420.

[59]Tan W, Qin N, Ma L, et al. Toronto-3D: A Large-Scale Mobile Lidar Dataset for Semantic Segmentation of Urban Roadways[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2020: 202-203.

[60]Tong G, Li Y, Chen D, et al. CSPC-dataset: New Lidar Point Cloud Dataset and Benchmark for Large-scale Scene Semantic Segmentation[J]. IEEE Access, 2020, 8: 87695-87718.

[61]Wang Y, Sun Y, Liu Z, et al. Dynamic Graph CNN for Learning on Point Clouds[J]. ACM Transactions on Graphics (tog), 2019, 38(5): 1-12.

[62]Ma L, Li Y, Li J, et al. Multi-scale Point-Wise Convolutional Neural Networks for 3D Object Segmentation from Lidar Point Clouds in Large-scale Environments[J]. IEEE Transactions on Intelligent Transportation Systems, 2019, 22(2): 821-836.

[63]Li Y, Ma L, Zhong Z, et al. TGNet: Geometric Graph CNN on 3-D Point Cloud Segmentation [J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(5): 3588-3600.

[64]Huang J, You S. Point Cloud Labeling Using 3d Convolutional Neural Network[C]//2016 23rd International Conference on Pattern Recognition (ICPR). IEEE, 2016: 2670-2675.

[65]Hackel T, Savinov N, Ladicky L, et al. Semantic3D.net: A new Large-scale Point Cloud Classification Benchmark [J]. Photogrammetric Engineering & Remote Sensing, 2018, 84(5): 297-308.

CLC number:

 TP751    

Open access date:

 2022-06-27    

