Thesis title (Chinese): |
Research on Semantic Segmentation Algorithms for High-Resolution Remote Sensing Images Based on Feature Fusion
|
Name: |
Zhou Xinwei
|
Student ID: |
21207223090
|
Confidentiality level: |
Public
|
Thesis language: |
chi
|
Discipline code: |
085400
|
Discipline name: |
Engineering - Electronic Information
|
Student type: |
Master's student
|
Degree level: |
Master of Engineering
|
Degree year: |
2024
|
Degree-granting institution: |
Xi'an University of Science and Technology
|
School/Department: |
College of Communication and Information Engineering
|
Major: |
Electronic Information
|
Research direction: |
Remote sensing image analysis and interpretation
|
First supervisor: |
Song Wanying
|
First supervisor's institution: |
Xi'an University of Science and Technology
|
Thesis submission date: |
2024-06-14
|
Thesis defense date: |
2024-05-31
|
Thesis title (English): |
Research on Semantic Segmentation Algorithms for High-Resolution Remote Sensing Images Based on Feature Fusion
|
Keywords (Chinese): |
semantic segmentation of high-resolution remote sensing images ; global contextual features ; local fine-grained features ; wavelet self-attention ; feature fusion
|
Keywords (English): |
High-resolution remote sensing image segmentation ; global contextual features ; local fine-grained features ; wavelet self-attention ; feature fusion
|
Abstract (Chinese): |
︿
Semantic segmentation of high-resolution remote sensing images is an indispensable step in remote sensing image analysis and interpretation. Its goal is to partition an image into a number of mutually non-overlapping regions according to features such as gray level and texture, thereby revealing the structure and essence of the image. The research has wide applications in land-cover mapping, urban planning, disaster monitoring, and global-change studies. Taking high-resolution remote sensing images as the research object and deep learning as the theoretical basis, this thesis addresses the high intra-class variance and low inter-class variance of such images, as well as the failure of existing models to effectively fuse multi-scale deep features, and studies semantic segmentation algorithms based on the fusion of global and local features. The main research work is summarized as follows:
1) To address the difficulty of recognizing small-object features caused by the high intra-class variance and low inter-class variance of high-resolution remote sensing images, this thesis proposes a local fine-grained feature extraction algorithm (LFMNet). The algorithm jointly analyzes the positional information and semantic features of the feature maps, refining them to produce fine local features. LFMNet adopts an encoder-decoder architecture, with ResNet performing preliminary feature extraction in the encoder. The spatial and semantic information of the ResNet features is then encoded and compared with global pooling information, successfully capturing features that are otherwise hard to identify against the complex backgrounds of high-resolution remote sensing images. This improves LFMNet's accuracy in recognizing small objects and delineating boundaries, strengthening the model's feature capture and recognition capabilities.
2) To address the failure of existing models to exploit multi-scale global context information for feature representation, this thesis proposes a global feature extraction algorithm (CAMNet). CAMNet contains a covariance attention module that uses covariance matrices to provide features at different scales for the different stages of ResNet; these features are then encoded with graph convolutions to capture the global context of the image. The covariance matrix adaptively captures linear relationships between the local and non-local context of the feature map; the non-local context helps the model understand the relationships between different regions of the image, and the global context can be derived by computing the covariance matrix of the entire feature map. Moreover, the covariance matrix captures dependencies between pixels: analyzing its eigenvectors and eigenvalues identifies mutually correlated pixels, and this correlation information can be used to capture semantic features such as boundaries, texture, and color changes.
3) Since simply maximizing or merging class probability maps cannot guarantee a comprehensive and effective semantic description, after obtaining the global and local features, and considering the differences and complementarity between them in high-resolution remote sensing images with complex backgrounds, this thesis designs a wavelet self-attention mechanism to fully exploit the complex details and texture information carried by the high-frequency components of the image. This innovation not only promotes the organic fusion of global and local features, but also fosters the synergy between high- and low-frequency information and ensures the fusion of information across scales, thereby optimizing the comprehensive utilization of image content.
﹀
|
Abstract (English): |
︿
Semantic segmentation of high-resolution remote sensing images is an indispensable task in remote sensing image processing and analysis, supporting fields such as natural disaster early warning, urban planning, and environmental monitoring. Taking high-resolution remote sensing images as the research object and deep learning as the basic theory, this thesis addresses the high intra-class variance and low inter-class variance of such images and the failure of existing models to make good use of multi-scale deep features, and studies semantic segmentation algorithms based on the fusion of global and local features. The main research work of this thesis is summarized as follows:
(1) To solve the difficulty of identifying small-object features caused by the high intra-class variance and low inter-class variance of high-resolution remote sensing images, this thesis proposes a local fine-grained feature extraction algorithm (LFMNet). The model uses an encoder-decoder structure, with ResNet performing preliminary feature extraction in the encoder. On top of the ResNet features, a local feature extraction module refines the feature map and produces fine local features. By encoding the spatial and semantic information in the feature maps and comparing it with global pooling information, the module captures complex features that are otherwise hard to identify against the complex backgrounds of high-resolution remote sensing images. This improves accuracy when identifying small targets and delineating boundaries, thereby enhancing the feature capture and recognition capabilities of the model.
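The abstract does not give implementation details for the local feature extraction module; the following is a minimal NumPy sketch of only the compare-with-global-pooling idea described above. The function name, the deviation measure, and the sigmoid gating are all assumptions made here for illustration, not the thesis's actual design:

```python
import numpy as np

def local_feature_refinement(fmap):
    """Hypothetical sketch: re-weight a (C, H, W) backbone feature map by
    comparing each spatial position against a global pooling descriptor."""
    # Global context: channel-wise global average pooling, shape (C, 1, 1)
    g = fmap.mean(axis=(1, 2), keepdims=True)
    # Positions that deviate from the global statistics (small objects,
    # boundaries) get larger attention weights.
    diff = np.abs(fmap - g).sum(axis=0)                  # (H, W) deviation map
    attn = 1.0 / (1.0 + np.exp(-(diff - diff.mean())))   # sigmoid gate in (0, 1)
    return fmap * attn                                    # re-weighted features

np.random.seed(0)
x = np.random.rand(8, 16, 16)   # toy feature map: 8 channels, 16x16
y = local_feature_refinement(x)
print(y.shape)  # (8, 16, 16)
```

Because the gate lies strictly in (0, 1), the sketch attenuates rather than amplifies responses; a learned module would of course use trainable parameters instead of fixed statistics.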
(2) To address the inability of existing models to exploit multi-scale global context information for feature representation, this thesis proposes a global feature extraction algorithm (CAMNet) and designs a covariance attention module, which uses covariance matrices to extract features at different scales from the different ResNet stages; these features are subsequently encoded through graph convolutions, which helps capture consistent global context information. The covariance matrix adaptively captures the linear relationships between both the local and the non-local context of the feature map; global context information is derived by computing the covariance matrix of the entire feature map, and the non-local context helps the model understand the relationships between different regions of the image. In addition, the covariance matrix models the dependencies between pixels: by analyzing its eigenvalues and eigenvectors, it is possible to determine which pixels in the image are correlated with each other, and this correlation information can be used to capture important semantic features such as boundaries, texture, and color changes.
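As a rough illustration of the covariance idea (the abstract's graph-convolution encoding step is omitted here), one can compute a channel covariance matrix over a flattened feature map, inspect its eigen-decomposition, and use its normalized rows as attention weights. This is a sketch under those assumptions, not CAMNet's actual module:

```python
import numpy as np

def covariance_attention(fmap):
    """Hypothetical sketch: channel covariance of a (C, H, W) feature map,
    its eigenvalues, and covariance-derived attention re-weighting."""
    C, H, W = fmap.shape
    X = fmap.reshape(C, H * W)
    Xc = X - X.mean(axis=1, keepdims=True)        # center each channel
    cov = Xc @ Xc.T / (H * W - 1)                 # (C, C) covariance matrix
    # Eigen-decomposition exposes correlated feature directions
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Row-wise softmax of the covariance as a simple attention map
    A = np.exp(cov - cov.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)
    out = (A @ X).reshape(C, H, W)                # context-mixed features
    return cov, eigvals, out

np.random.seed(1)
f = np.random.rand(8, 16, 16)
cov, eigvals, out = covariance_attention(f)
print(cov.shape, out.shape)  # (8, 8) (8, 16, 16)
```

The covariance matrix is symmetric positive semidefinite, so its eigenvalues are non-negative; large eigenvalues correspond to strongly correlated channel directions, which is the property the abstract exploits for boundary and texture cues.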
(3) Since simply maximizing or merging class probability maps cannot ensure a comprehensive semantic description, after obtaining the global and local features, and taking into account their differences and complementarity, this thesis designs a wavelet self-attention mechanism to exploit the inherent value of the complex details and texture information carried by the high-frequency components of the image. This innovation promotes the fusion of global and local features, leveraging the synergistic interplay between high- and low-frequency information. Importantly, this approach ensures the fusion of information at different scales, thereby optimizing the comprehensive utilization of image content.
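To make the high-frequency/low-frequency intuition concrete, the following NumPy sketch splits a map into one-level Haar wavelet sub-bands and uses the high-frequency energy of the local branch to gate a fusion of local and global features. Both functions and the gating scheme are illustrative assumptions; the thesis's wavelet self-attention is a learned mechanism, not this fixed rule:

```python
import numpy as np

def haar_split(img):
    """One-level 2D Haar transform of an even-sized (H, W) map:
    returns the low-frequency LL band and the LH/HL/HH detail bands."""
    p00, p10 = img[0::2, 0::2], img[1::2, 0::2]
    p01, p11 = img[0::2, 1::2], img[1::2, 1::2]
    ll = (p00 + p10 + p01 + p11) / 4
    lh = (p00 - p10 + p01 - p11) / 4
    hl = (p00 + p10 - p01 - p11) / 4
    hh = (p00 - p10 - p01 + p11) / 4
    return ll, lh, hl, hh

def wavelet_fusion(glob_map, loc_map):
    """Fuse two (H, W) maps: where the local map carries strong
    high-frequency detail, favor it; elsewhere favor the global map."""
    _, lh, hl, hh = haar_split(loc_map)
    hf_energy = np.sqrt(lh**2 + hl**2 + hh**2)            # (H/2, W/2)
    w = np.repeat(np.repeat(hf_energy, 2, axis=0), 2, axis=1)
    w = 1.0 / (1.0 + np.exp(-(w - w.mean())))             # sigmoid gate in (0, 1)
    return w * loc_map + (1 - w) * glob_map

np.random.seed(2)
glob_map = np.random.rand(16, 16)
loc_map = np.random.rand(16, 16)
fused = wavelet_fusion(glob_map, loc_map)
print(fused.shape)  # (16, 16)
```

Because the gate is a convex combination, the fused map always lies between the two inputs; when the branches agree it reduces to either input, which is the complementarity property the abstract appeals to.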
﹀
|
CLC number: |
TP75
|
Release date: |
2024-06-14
|