Thesis information

Thesis title (Chinese):

 基于深度学习的遥感图像目标检测算法研究

Name:

 Tian Jin (田锦)

Student ID:

 21207223059

Confidentiality level:

 Public

Thesis language:

 chi

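Discipline code:

 085400

Discipline name:

 Engineering - Electronic Information

Student type:

 Master's

Degree level:

 Master of Engineering

Degree year:

 2024

Institution:

 Xi'an University of Science and Technology

School:

 College of Communication and Information Engineering

Major:

 Electronic Information

Research direction:

 Computer vision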

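First supervisor's name:

 Wu Fengbo (武风波)

First supervisor's institution:

 Xi'an University of Science and Technology

Thesis submission date:

 2024-06-13

Thesis defense date:

 2024-06-01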

Thesis title (foreign language):

 Research on Object Detection Algorithm for Remote Sensing Images Based on Deep Learning    

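Keywords (translated from the Chinese):

 Optical remote sensing images ; Target detection ; Deep learning ; YOLOv8 ; Attention mechanism ; Lightweight design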

Keywords (foreign language):

 Optical remote sensing image ; Target detection ; Deep learning ; YOLOv8 ; Attention mechanism ; Lightweight    

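Abstract (translated from the Chinese):

With the wide application of remote sensing images in military reconnaissance, environmental monitoring, urban planning, and other fields, target detection in optical remote sensing images has become an active research topic. Targets in remote sensing images have complex backgrounds, are densely arranged, and vary greatly in size. In addition, deep learning models are structurally complex and hard to deploy on resource-constrained devices. Both factors make target detection in remote sensing images challenging. This thesis takes the YOLOv8 algorithm as the base network and optimizes it to address these problems. The specific work is as follows:

(1) To handle the complex backgrounds and large scale variation of targets in remote sensing images, a remote sensing image target detection method based on multi-scale feature extraction is proposed. First, a multi-scale feature extraction structure is designed to rebuild the C2f module in the network, strengthening the network's ability to extract information at different scales from the feature maps. Second, an MSCA attention mechanism is added before the detection head to suppress interference from complex background noise and increase the network's attention to useful information. Finally, the KFIoU loss function is introduced to better fit the regression of rotated bounding boxes and further improve the regression accuracy of the predicted boxes. Experimental results show that the improved algorithm raises accuracy on the DIOR-R dataset by 2.1% compared with the original algorithm.

(2) To handle the limited resources of remote sensing image detection devices, the improved algorithm above is further made lightweight. First, a lightweight detection head based on shared convolution and group normalization is designed, reducing the large parameter count that results from the original head using three separate branches to predict target position, category, and angle. Second, Ghost convolution is used to improve the Backbone and Neck of the network, further lowering the model's memory requirements and making more efficient use of resources. Finally, a skip connection structure is added to the feature fusion module so that the network captures richer semantic information, which mitigates the accuracy loss caused by the reduced parameter count. Experimental results show that the lightweight model reduces the parameter count by 41.3% and the computational cost by 39.4% while maintaining high detection performance.

The deep learning-based remote sensing image target detection algorithm proposed in this thesis makes full use of deep learning's strength in mining deep image features and improves the automated processing of remote sensing images. It provides more reliable technical support for the wide application of remote sensing images and also offers a useful reference for target detection in other complex scenes.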

Abstract (foreign language):

With the widespread application of remote sensing images in military reconnaissance, environmental monitoring, urban planning, and other fields, target detection based on optical remote sensing images has become a current research hotspot. Targets in remote sensing images have complex backgrounds, are densely arranged, and vary greatly in size. Moreover, deep learning models have complex structures and are difficult to deploy on resource-constrained devices, which poses challenges for target detection in remote sensing images. In this thesis, the YOLOv8 algorithm is chosen as the base network and optimized to address these issues. The specific work is as follows:

(1) To address the complex background information and large scale variations of targets in remote sensing images, a remote sensing image target detection method based on multi-scale feature extraction is proposed. First, a multi-scale feature extraction structure is designed to reconstruct the C2f module in the network, enhancing the network's ability to extract information at different scales from the feature maps. Second, an MSCA attention mechanism is added before the detection head of the network to suppress the interference of complex background noise in the image and enhance the network's focus on effective information. Finally, the KFIoU loss function is introduced to better fit the regression of rotated bounding boxes and further improve the regression accuracy of the predicted boxes. Experimental results show that the improved algorithm achieves a 2.1% increase in accuracy over the original algorithm on the DIOR-R dataset.
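The overlap term behind a KFIoU-style loss can be illustrated with a small sketch. It follows the general idea of the cited KFIoU work (model each rotated box as a 2-D Gaussian, fuse the two Gaussians, compare areas) and is not the thesis's implementation. The function names are hypothetical, box centers are ignored (they are normally handled by a separate center loss), and boxes are assumed to be given as (width, height, angle in radians) with positive sizes.

```python
import math

def box_to_cov(w, h, theta):
    """Covariance (a, b, c) of [[a, b], [b, c]] = R diag(w^2/4, h^2/4) R^T."""
    d1, d2 = w * w / 4.0, h * h / 4.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    a = d1 * cos_t ** 2 + d2 * sin_t ** 2
    b = (d1 - d2) * cos_t * sin_t
    c = d1 * sin_t ** 2 + d2 * cos_t ** 2
    return (a, b, c)

def inv_sym(cov):
    """Inverse of a symmetric positive-definite 2x2 matrix stored as (a, b, c)."""
    a, b, c = cov
    det = a * c - b * b
    return (c / det, -b / det, a / det)

def area(cov):
    """Area of the box that the Gaussian represents: 4 * sqrt(det)."""
    a, b, c = cov
    return 4.0 * math.sqrt(max(a * c - b * b, 0.0))

def overlap_cov(cov1, cov2):
    """Fused covariance (S1^-1 + S2^-1)^-1, algebraically equal to the
    Kalman-filter update S1 - S1 (S1 + S2)^-1 S1."""
    i1, i2 = inv_sym(cov1), inv_sym(cov2)
    return inv_sym(tuple(x + y for x, y in zip(i1, i2)))

def kfiou(box1, box2):
    """KFIoU-style overlap of two boxes given as (w, h, theta); centers ignored."""
    c1, c2 = box_to_cov(*box1), box_to_cov(*box2)
    v1, v2, v3 = area(c1), area(c2), area(overlap_cov(c1, c2))
    return v3 / (v1 + v2 - v3)

if __name__ == "__main__":
    print(kfiou((4, 2, 0.3), (4, 2, 0.3)))          # identical boxes
    print(kfiou((4, 2, 0.0), (4, 2, math.pi / 2)))  # same box rotated by 90 degrees
```

In this formulation two identical boxes score 1/3 rather than 1, because the fused Gaussian has half the covariance of either input. A loss built on this value therefore needs rescaling or a shifted target.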

(2) To address the resource constraints of remote sensing image detection devices, the improved algorithm above is further made lightweight. First, a lightweight detection head based on shared convolution and group normalization is designed to reduce the large parameter count caused by the original detection head using three separate branches to predict the position, category, and angle of targets. Second, Ghost convolution is used to improve the Backbone and Neck parts of the network, further reducing the model's memory requirements and achieving more efficient resource utilization. Finally, a skip connection structure is added to the network's feature fusion module to capture richer semantic information, thereby alleviating the loss of accuracy caused by the reduced parameter count. Experimental results show that the improved lightweight model achieves a 41.3% reduction in parameters and a 39.4% reduction in computational cost while maintaining high detection performance.
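To give a rough sense of why Ghost convolution saves parameters, the sketch below compares the weight count of one standard convolution layer with a Ghost module as described in the cited GhostNet paper: a primary convolution produces c_out / s intrinsic feature maps, and cheap d x d depthwise operations generate the remaining ones. The layer sizes, the ratio s = 2, and the kernel size d = 3 are illustrative assumptions, biases are ignored, and the figures are unrelated to the whole-model reductions reported in the thesis.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (no bias)."""
    return c_in * c_out * k * k

def ghost_conv_params(c_in, c_out, k, s=2, d=3):
    """Weights in a Ghost module: primary conv plus cheap depthwise ops.

    s is the ratio of output maps to intrinsic maps; d is the kernel size
    of the cheap depthwise operation. Both defaults are assumptions.
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s")
    intrinsic = c_out // s
    primary = conv_params(c_in, intrinsic, k)   # produces the intrinsic maps
    cheap = intrinsic * (s - 1) * d * d         # one d x d filter per ghost map
    return primary + cheap

if __name__ == "__main__":
    standard = conv_params(64, 128, 3)
    ghost = ghost_conv_params(64, 128, 3)
    print(standard, ghost)  # 73728 37440
    print(f"reduction: {1 - ghost / standard:.1%}")
```

For this single hypothetical layer the weight count drops by roughly half, close to the theoretical factor of s. A full network saves less, since not every layer is replaced.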

The deep learning-based target detection algorithm proposed in this thesis makes full use of deep learning's strength in mining deep image features and improves the automated processing of remote sensing images. It provides more reliable technical support for the widespread application of remote sensing images and serves as a reference for target detection tasks in other complex scenarios.

Chinese Library Classification number:

 TP391.4

Open access date:

 2024-06-13
