Thesis information


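Title (Chinese original):	基于改进U型编解码的高分遥感影像建筑物分割方法研究
Author:	Wang Haoming (王浩明)
Student ID:	20206223069
Confidentiality level:	Public
Thesis language:	chi
Discipline code:	085400
Discipline name:	Engineering - Electronic Information
Student type:	Master's
Degree level:	Master of Engineering
Degree year:	2023
Institution:	Xi'an University of Science and Technology (西安科技大学)
School:	College of Electrical and Control Engineering
Major:	Control Engineering
Research direction:	Image processing
First supervisor:	Wang Zheng (王征)
First supervisor's institution:	Xi'an University of Science and Technology (西安科技大学)
Thesis submission date:	2023-12-20
Thesis defense date:	2023-12-11
Title (English):	Research on building segmentation method of high-resolution remote sensing image based on improved U-shaped codec structure
Keywords (Chinese original):	高分遥感影像 ; 语义分割 ; 轻量化模型 ; 注意力机制 ; 建筑物提取
Keywords (English):	High-resolution remote sensing image ; Semantic segmentation ; Lightweight model ; Attention mechanism ; Building extraction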
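Abstract (translated from the Chinese):	With the rapid development of remote sensing satellite technology, satellite image resolution keeps improving, and the resulting high-resolution remote sensing images contain richer detail. Buildings are among the most important ground objects in remote sensing imagery, yet extracting and updating building information by hand is time-consuming and labor-intensive. In recent years researchers have proposed many intelligent methods based on deep learning, but most deep learning models suffer from high complexity and poor learning ability. Building on an analysis of existing building extraction methods for remote sensing imagery, this thesis therefore proposes a building extraction method based on an improved U-shaped encoder-decoder. The main research content is as follows. (1) To address the small size and limited image quality of high-resolution remote sensing datasets, the datasets are preprocessed. Remote sensing datasets from China and abroad are selected as the research objects. MobilenetV3 replaces the residual network in the generator of the original SRGAN, and the GAM attention mechanism is added; the resulting network performs super-resolution reconstruction to improve image quality. Random augmentation is then used to enlarge the datasets so that the training and validation images are diverse. (2) To address the complex structure and high computational cost of convolutional neural networks, an encoder-decoder convolutional neural network is used as the underlying framework. In the encoder, a feature extraction structure called MobileT Network is proposed, composed of depthwise convolution (DConV) and inverted bottleneck convolution (TDConV), together with a lightweight building semantic segmentation algorithm based on MobileT Network. The encoder extracts building features through depthwise and inverted bottleneck convolutions, which reduces the model's burden in both operation count and parameter count. Experiments show that the proposed model has 22% of the parameters of the Unet model. (3) To address low segmentation accuracy and missed buildings in remote sensing image segmentation, the SP_MobileT_Unet building semantic segmentation model is built on the MobileT Network feature extractor. First, the SimAM attention mechanism is introduced; without adding parameters, it helps the encoder use positional information to locate regions of interest more precisely, which increases the attention paid to building pixels. Second, a PPM pyramid pooling module is added between the encoder and the decoder; it pools the encoder's feature maps at several scales and fuses semantic information from different levels into a feature map rich in information, which improves the decoder's ability to restore the feature map. Finally, the Focal loss function balances the loss weights of the building and background classes so that learning is biased toward the building class. To validate the proposed SP_MobileT_Unet model for building segmentation in high-resolution remote sensing imagery, four models (DeepLab V3, PPA-Net, Unet, and CA-BASNet) are selected for comparative experiments. The proposed model achieves classification accuracies of 90.46% and 96.90% on the two datasets. Evaluated with semantic segmentation metrics, it obtains 96.9%, 96.6%, and 96.7% on PA, Recall, and F1 Score respectively, indicating that the model is feasible. In summary, this thesis builds the SP_MobileT_Unet model from the U-shaped encoder-decoder structure of Unet and the lightweight MobileT Network feature extraction framework to segment and extract buildings from high-resolution remote sensing imagery. The proposed model offers a useful reference for building extraction from remote sensing imagery and provides theoretical support for urban planning, urban and rural construction, and high-precision mapping.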
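Point (2) of the abstract attributes the smaller model to depthwise and inverted bottleneck convolutions. As a rough illustration of why this family of layers saves parameters, the sketch below compares the weight count of a standard convolution with that of a generic depthwise separable convolution. It is not the thesis's MobileT Network block, whose exact layout the abstract does not give; the function names and the channel sizes (64 in, 128 out, 3 x 3 kernel) are assumptions chosen for illustration, and the result is unrelated to the reported 22% figure.

```python
def conv_params(c_in, c_out, k=3):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out = 64, 128
    std = conv_params(c_in, c_out)
    sep = depthwise_separable_params(c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For these assumed sizes the separable layer needs 8768 weights against 73728 for the standard layer, roughly 12% of the original count.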
Abstract (English):	With the rapid development of remote sensing satellite technology, the resolution of satellite images is constantly improving, and the high-resolution remote sensing images obtained contain more detailed information. Buildings are among the most important ground objects in remote sensing images, and manual extraction and updating of building information is time-consuming and laborious. In recent years, researchers have proposed many intelligent methods based on deep learning, but most deep learning models have the disadvantages of high complexity and poor learning ability. Therefore, on the basis of analyzing existing building extraction methods for remote sensing images, this thesis proposes a building extraction method based on depthwise separable convolution. The main contributions are as follows: (1) In view of the small size and low quality of high-resolution remote sensing image datasets, the necessary preprocessing is carried out on the datasets. Remote sensing image datasets from China and abroad are selected as the research objects of this thesis. MobilenetV3 is used to replace the residual network of the original SRGAN generator, and the GAM attention mechanism is added, to carry out super-resolution reconstruction and improve image quality. Finally, random augmentation is used to enlarge the datasets and ensure the diversity of the training and validation images. (2) In view of the complex structure and high computational cost of convolutional neural networks, an encoder-decoder convolutional neural network is used as the underlying framework of the model. In the encoder, the MobileT Network feature extraction structure is proposed, which is composed of depthwise convolution (DConV) and inverted bottleneck convolution (TDConV), and a lightweight building semantic segmentation algorithm based on MobileT Network is proposed. The encoder of this algorithm extracts building features through depthwise convolution and inverted bottleneck convolution, reducing the computational burden of the model in terms of both the number of operations and the number of parameters. Experiments verify that the parameter count of the proposed model is 22% of that of the Unet model. (3) In view of the low segmentation accuracy and missed buildings in remote sensing image segmentation, the SP_MobileT_Unet building semantic segmentation model is constructed on the basis of the MobileT Network feature extraction network. Firstly, by introducing the SimAM attention mechanism, the model helps the encoder locate regions of interest more accurately through location information without adding parameters, so as to increase the attention paid to building pixels. Secondly, a pyramid pooling module is added between the encoder and the decoder; the feature map generated by the encoder is pooled at different scales, and semantic information from different levels is fused to obtain a feature map rich in information, so as to improve the decoder's ability to restore the feature map. Finally, the Focal loss function is used to balance the loss weights of the building and background classes in the samples, so that the learning result of the model is biased toward buildings. To verify the proposed SP_MobileT_Unet building segmentation model for high-resolution remote sensing images, the DeepLab V3, PPA-Net, Unet, and CA-BASNet models are selected for comparative experiments. The classification accuracy of the proposed model is 90.46% and 96.90% on the two datasets, respectively. Finally, the model is evaluated with semantic segmentation metrics: the results for PA, Recall, and F1 Score are 0.969, 0.966, and 0.967, respectively, which indicates that the proposed model is feasible. In this thesis, the SP_MobileT_Unet model is built on the U-shaped encoder-decoder structure of Unet and the lightweight feature extraction framework of MobileT Network to segment and extract buildings in high-resolution remote sensing images. The proposed model has reference value for building extraction from remote sensing images and provides theoretical support for urban planning, urban and rural construction, and high-precision mapping.
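The abstract reports PA, Recall, and F1 Score without spelling out how they are computed. A minimal sketch of the usual definitions follows, assuming a binary building-versus-background labeling in which 1 marks a building pixel. The function name `segmentation_metrics` and the toy masks are hypothetical, and the thesis may compute its metrics differently (for example, averaged per class or per image).

```python
def segmentation_metrics(pred, truth):
    """Pixel accuracy, recall, and F1 for flat sequences of 0/1 labels.

    Assumes 1 = building and 0 = background.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and the same length")

    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # building found
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false alarm
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # missed building
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)  # background kept

    pa = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"PA": pa, "Recall": recall, "F1": f1}


if __name__ == "__main__":
    # Toy 8-pixel masks, made up for illustration only.
    pred = [1, 1, 0, 0, 1, 0, 0, 0]
    truth = [1, 0, 0, 0, 1, 1, 0, 0]
    print(segmentation_metrics(pred, truth))
```

On the toy masks there are 2 true positives, 1 false positive, 1 false negative, and 4 true negatives, so PA is 0.75 and both Recall and F1 equal 2/3.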
Chinese Library Classification number:	TP391.4
Open access date:	2023-12-20
