Thesis information

Thesis title (Chinese):

 基于卷积神经网络的遥感图像语义分割研究    

Name:

 Ma Yijun (马亦骏)

Student ID:

 19208207026    

Confidentiality level:

 Public

Thesis language:

 chi    

Discipline code:

 085211    

Discipline name:

 Engineering - Engineering - Computer Technology

Student type:

 Master's

Degree level:

 Master of Engineering

Degree year:

 2022    

Institution:

 Xi'an University of Science and Technology

School/Department:

 College of Computer Science and Technology

Major:

 Computer Technology

Research direction:

 Image Processing

First supervisor name:

 She Xiangyang (厍向阳)

First supervisor affiliation:

 Xi'an University of Science and Technology

Thesis submission date:

 2022-06-21    

Thesis defense date:

 2022-06-07    

Thesis title (English):

 Research on Semantic Segmentation of Remote Sensing Image Based on Convolutional Neural Network    

Keywords (Chinese):

 遥感图像 ; 语义分割 ; 全尺度跳层连接 ; 全局特征 ; 局部特征    

Keywords (English):

 Remote Sensing Image ; Semantic Segmentation ; Full-Scale Skip Connection ; Global Feature ; Local Feature    

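Abstract (translated from the Chinese):

With the rapid development of remote sensing technology in China, the detection capability of remote sensing satellites keeps improving, and traditional remote sensing image segmentation methods can no longer meet the demand for accurate, real-time automatic extraction of key information from remote sensing images. In recent years, the wide application of deep learning in computer vision has brought considerable progress to the semantic segmentation of natural images. However, because remote sensing images contain many target objects and complex background information, applying existing natural-image semantic segmentation methods directly to remote sensing images does not give good results. This thesis therefore studies semantic segmentation models for remote sensing images, combining natural-image semantic segmentation methods with the characteristics of remote sensing images. The main work is as follows:

(1) To address edge confusion caused by clustered targets in remote sensing images, poor segmentation of small-scale objects, and insufficient capture of global information during semantic segmentation, a remote sensing image semantic segmentation algorithm, DU-net, is proposed, based on mixed attention and a full-scale skip connection network. The algorithm takes U-net3+ as its base network, uses the full-scale skip connection network for feature extraction, discards the deep supervision of the original algorithm, and links the features to the attention mechanism to produce the final segmentation. Experimental results show that DU-net clearly improves on classical algorithms across several metrics, improves the quality of edge segmentation, and segments small-scale targets more accurately.

(2) To address the slow training, deep networks, and large parameter counts of most current remote sensing image semantic segmentation models, a lightweight remote sensing image semantic segmentation algorithm, EFLG-Net, is proposed, based on the interaction of global and local features. The algorithm uses EfficientNetB0 as the feature extraction network, introduces a global feature path, and establishes a connection between global and local features. It improves the MBConv convolution module of the original algorithm into a new module, FU-MBConv, optimizes the network structure and parameters, and then connects to the global feature path through deconvolution operations to produce the final segmentation. Experimental results show that EFLG-Net clearly improves on model parameter size, training speed, and model accuracy.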
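As a rough illustration of the full-scale skip connections that item (1) builds on, the sketch below lists, for one decoder node of a five-level UNet 3+-style network, which feature maps feed it and how each is resampled to the node's resolution. It follows the published UNet 3+ design (shallower encoder maps are max-pooled, deeper maps are upsampled, and all are concatenated). It is not the thesis's DU-net code; the function name `full_scale_sources` and the level numbering are assumptions made for this example.

```python
def full_scale_sources(decoder_level, num_levels=5):
    """Plan the inputs of one decoder node in a UNet 3+-style network.

    Hypothetical helper: levels are numbered 1 (full resolution) to
    num_levels (coarsest). Returns (source, operation, scale_factor)
    tuples, one per level, so every scale contributes to the node.
    """
    if not 1 <= decoder_level < num_levels:
        raise ValueError("decoder_level must be in 1..num_levels-1")
    plan = []
    for level in range(1, num_levels + 1):
        if level < decoder_level:
            # Shallower encoder map: shrink it with max pooling.
            plan.append((f"encoder{level}", "maxpool", 2 ** (decoder_level - level)))
        elif level == decoder_level:
            # Same-scale encoder map: plain skip connection.
            plan.append((f"encoder{level}", "identity", 1))
        else:
            # Deeper map: enlarge it. The coarsest level has no decoder
            # node of its own, so the encoder output is used directly.
            name = f"encoder{level}" if level == num_levels else f"decoder{level}"
            plan.append((name, "upsample", 2 ** (level - decoder_level)))
    return plan


for source, op, factor in full_scale_sources(3):
    print(f"{source}: {op} x{factor}")
```

In UNet 3+ each resampled map is reduced to the same number of channels before concatenation, so a node at any level sees both fine detail from shallow layers and coarse context from deep ones.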

Abstract (English):

With the rapid development of remote sensing technology in China, the detection capability of remote sensing satellites has grown, and traditional remote sensing image segmentation methods can no longer meet the need for accurate, real-time automatic extraction of key information from remote sensing images. Semantic segmentation methods for natural images have advanced significantly in recent years, thanks to the widespread use of deep learning in computer vision. However, because remote sensing images contain many small and medium-scale objects against complex backgrounds, the boundaries between objects are easily obscured, and existing natural-image semantic segmentation methods do not perform well when applied directly to remote sensing images. This thesis therefore investigates semantic segmentation models for remote sensing images, taking into account both the characteristics of remote sensing images and the semantic segmentation methods developed for natural images. The main research work is as follows:

(1) To address edge confusion caused by clustered objects in remote sensing images, unclear segmentation of small-scale objects, and insufficient capture of global information during semantic segmentation, a remote sensing image semantic segmentation algorithm, DU-net, is proposed, based on mixed attention and a full-scale skip connection network. The algorithm uses U-net3+ as the base network and the full-scale skip connection network as the feature extraction network. The deep supervision of the original algorithm is dropped, and the features are linked to the attention mechanism to produce the final segmentation. Experimental results show that DU-net improves significantly on classical algorithms across several metrics, and that it improves both edge segmentation quality and the accuracy of small-scale target segmentation.

(2) To address problems found in most current semantic segmentation models for remote sensing images, such as slow training, many network layers, and large parameter counts, a lightweight remote sensing image semantic segmentation algorithm, EFLG-Net, is proposed, based on the interaction between global and local features. The algorithm uses EfficientNetB0 as the feature extraction network, introduces a global feature path, and establishes a connection between global and local features. It improves the MBConv convolution module of the original algorithm into a new module, FU-MBConv, optimizes the network structure and parameters, and then connects to the global feature path through deconvolution operations to produce the final segmentation. Experimental results show that EFLG-Net improves on model parameter size, training speed, and model accuracy.
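EFLG-Net in item (2) uses EfficientNetB0 as its feature extractor. The MBConv blocks in EfficientNet are built around depthwise separable convolutions, one common way to cut parameter counts. The sketch below compares weight counts for a standard convolution and a depthwise separable one of the same shape, ignoring biases and normalization layers. It illustrates the general technique only: the layer sizes are arbitrary assumptions, and the thesis's FU-MBConv module is not reproduced because the abstract does not describe its structure.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise conv followed by a 1x1 pointwise conv (no bias)."""
    depthwise = kernel * kernel * c_in   # one kernel x kernel filter per input channel
    pointwise = c_in * c_out             # 1x1 conv that mixes channels
    return depthwise + pointwise


# Illustrative layer shape (an assumption, not taken from the thesis).
kernel, c_in, c_out = 3, 32, 64
std = standard_conv_params(kernel, c_in, c_out)
sep = depthwise_separable_params(kernel, c_in, c_out)
print(std, sep, round(std / sep, 2))  # 18432 2336 7.89
```

For this shape the separable form needs roughly one eighth of the weights, which is the kind of saving lightweight backbones rely on.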

References:

[1] 李丹, 吴保生, 陈博伟, 薛源, 张翼. 基于卫星遥感的水体信息提取研究进展与展望[J]. 清华大学学报(自然科学版), 2020, 60(02): 147-161.

[2] 刘海秋, 任恒奎, 牛鑫鑫, 夏萍. 基于Sentinel-2遥感影像的巢湖蓝藻水华提取方法研究[J]. 生态环境学报, 2021, 30(01): 146-155.

[3] 殷博灵, 余阳, 苏玲,等. 基于高分一号卫星遥感数据提取城市建设用地方法研究[J]. 地球科学前沿(汉斯), 2019, 9(5): 7.

[4] 李德仁, 丁霖, 邵振峰. 面向实时应用的遥感服务技术[J]. 遥感学报, 2021, 25(01): 15-24.

[5] 李增元, 陈尔学. 中国林业遥感发展历程[J]. 遥感学报, 2021, 25(01): 292-301.

[6] 邝辉宇, 吴俊君. 基于深度学习的图像语义分割技术研究综述[J]. 计算机工程与应用, 2019, 55(19): 12-21+42.

[7] 薛洪飞. 基于深度学习的遥感图像分类方法研究[D]. 哈尔滨: 哈尔滨工程大学, 2019.

[8] 王宇浩. 基于深度学习的遥感图像语义分割问题研究[D]. 北京: 北京科技大学, 2020.

[9] 李道纪, 郭海涛, 卢俊, 赵传, 林雨准, 余东行. 遥感影像地物分类多注意力融和U型网络法[J]. 测绘学报, 2020, 49(08): 1051-1064.

[10] 张刚. 基于深度学习的遥感图像语义分割关键技术研究[D]. 北京: 中国科学院大学,2020.

[11] Otsu N. A Threshold Selection Method from Gray-Level Histograms[J]. IEEE Transactions on Systems, Man, and Cybernetics, 1979, 9(1): 62-66.

[12] Abutaleb A S. Automatic Thresholding of Gray-Level Pictures Using Two-Dimensional Entropy[J]. Computer Vision Graphics & Image Processing, 1989, 47(1): 22-32.

[13] Huang D Y, et al. Optimal Multi-Level Thresholding Using a Two-Stage Otsu Optimization Approach[J]. Pattern Recognition Letters, 2009, 30(3): 275-284.

[14] Chassery J M, Garbay C. An Iterative Segmentation Method Based on a Contextual Color and Shape Criterion[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1984, 6(6): 794-800.

[15] Horowitz S L, Pavlidis T. Picture Segmentation by a Tree Traversal Algorithm[J]. Journal of the ACM, 1976, 23(2): 368-388.

[16] 魏德强. 高分辨率遥感影像建筑物提取技术研究[D]. 郑州: 中国人民解放军信息工程大学, 2013.

[17] Rumelhart D E, McClelland J L. Parallel Distributed Processing: Explorations in the Microstructure of Cognition[J]. Science, 1986, 236: 992-997.

[18] Rumelhart D E, Hinton G E, Williams R J. Learning Representations by Back-Propagating Errors[J]. Nature, 1986, 323: 533-536.

[19] LeCun Y, Boser B, Denker J S, et al. Backpropagation Applied to Handwritten Zip Code Recognition[J]. Neural Computation, 1989, 1(4): 541-551.

[20] Krizhevsky A, Sutskever I, Hinton G E. ImageNet Classification with Deep Convolutional Neural Networks[J]. Advances in Neural Information Processing Systems, 2012, 25(2): 1097-1105.

[21] Szegedy C, Liu W, Jia Y, et al. Going Deeper with Convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, Boston, 2015: 1-9.

[22] Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J]. arXiv preprint arXiv: 1409.1556, 2014.

[23] He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.

[24] Shelhamer E, Long J, Darrell T. Fully Convolutional Networks for Semantic Segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 640-651.

[25] Badrinarayanan V, Kendall A, Cipolla R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495.

[26] Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2015: 234-241.

[27] Yu F, Koltun V. Multi-Scale Context Aggregation by Dilated Convolutions[J]. arXiv preprint arXiv: 1511.07122, 2015.

[28] Chen L C, Papandreou G, Kokkinos I, et al. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs[J]. Computer Science, 2014(4): 357-361.

[29] Chen L C, Papandreou G, Kokkinos I, et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.

[30] Chen L C, Papandreou G, Schroff F, et al. Rethinking Atrous Convolution for Semantic Image Segmentation[J]. arXiv preprint arXiv: 1706.05587, 2017.

[31] Chen L C, Zhu Y, Papandreou G, et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation[C]//Proceedings of the European conference on computer vision (ECCV). 2018: 801-818.

[32] Zhao H, Shi J, Qi X, et al. Pyramid Scene Parsing Network[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 2881-2890.

[33] Peng C, Zhang X, Yu G, et al. Large Kernel Matters-Improve Semantic Segmentation by Global Convolutional Network[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 4353-4361.

[34] 张浩然, 赵江洪, 张晓光. 利用U-net网络的高分遥感影像建筑提取方法[J]. 遥感信息, 2020, 35(03): 143-150.

[35] Yao Hongtai, et al. An Object-Based Markov Random Field with Partition-Global Alternately Updated for Semantic Segmentation of High Spatial Resolution Remote Sensing Image[J]. Remote Sensing, 2021, 14(1): 127.

[36] 于坤, 王贺封, 焦月正, 李武乾. 基于语义分割的遥感影像建筑物提取[J]. 测绘与空间地理信息, 2021, 44(10): 50-54.

[37] 王鑫, 张昊宇, 凌诚. 基于U-Net优化的SAR遥感图像语义分割[J]. 计算机科学, 2021, 48(S2): 376-381.

[38] 王明常, 朱春宇, 陈学业, 王凤艳, 李婷婷, 张海明, 韩有文. 基于FPN Res-Unet的高分辨率遥感影像建筑物变化检测[J]. 吉林大学学报(地球科学版), 2021, 51(01): 296-306. DOI:10.13278/j.cnki.jjuese.20190321.

[39] Howard A G, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications[J]. arXiv preprint arXiv: 1704.04861, 2017.

[40] Zhang X, Zhou X, Lin M, Sun J. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices[J]. arXiv preprint arXiv: 1707.01083, 2017.

[41] Shah P, El-Sharkawy M. R-MnasNet: Reduced MnasNet for Computer Vision[C]//2020 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS). IEEE, 2020: 1-5.

[42] 姚燕, 胡立坤, 郭军. 基于改进DeepLabv3+网络的轻量级语义分割算法[J]. 激光与光电子学进展, 2022, 59(04): 200-207.

[43] 徐世杰, 杜煜, 鹿鑫, 吴思凡. 基于ENet的轻量级语义分割算法研究[J]. 计算机工程与科学, 2021, 43(08): 1454-1460.

[44] Huang Z, Wang X, Huang L, Huang C, Wei Y, Liu W. CCNet: Criss-Cross Attention for Semantic Segmentation[J]. arXiv preprint arXiv: 1811.11721, 2018.

[45] 许启贤, 黄健, 李凡. 基于多任务学习的高光谱图像语义分割算法[J]. 中国科技论文, 2022, 17(03): 240-245+259.

[46] 司浚豪, 邵峰晶, 隋毅. 基于深度学习的遥感图像水边线提取方法与应用[J]. 海洋环境科学, 2022, 41(02): 309-315. DOI:10.13634/j.cnki.mes.2022.02.013.

[47] Huang H, Lin L, Tong R, et al. UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation[C]// ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020: 1055-1059.

[48] Tan M, Le Q. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks[C]//International conference on machine learning. PMLR, 2019: 6105-6114.

[49] Qiu Z, Yao T, Ngo C W, et al. Learning Spatio-Temporal Representation with Local and Global Diffusion[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 12056-12065.

Chinese Library Classification number:

 TP391    

Open access date:

 2022-07-04    
