View Thesis Information

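Thesis title (Chinese):	基于深度学习的小样本目标检测算法研究
Author:	Ruan Xiaoman (阮小曼)
Student ID:	21207223102
Confidentiality level:	Public
Thesis language:	chi
Discipline code:	085400
Discipline:	Engineering - Electronic Information
Student type:	Master's
Degree level:	Master of Engineering
Degree year:	2024
Institution:	Xi'an University of Science and Technology
School:	College of Communication and Information Engineering
Major:	Electronic Information
Research direction:	Deep learning
First supervisor:	Zhu Daixian (朱代先)
First supervisor's affiliation:	Xi'an University of Science and Technology
Thesis submission date:	2024-06-13
Thesis defense date:	2024-06-04
Thesis title (English):	Few-shot Object Detection Algorithm based on Deep Learning
Keywords (Chinese):	目标检测 (object detection) ; 小样本目标检测 (few-shot object detection) ; 多尺度特征融合 (multi-scale feature fusion) ; 注意力机制 (attention mechanism)
Keywords (English):	Object detection ; Few-shot object detection ; Multi-scale feature fusion ; Attention mechanism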
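Abstract (translated from Chinese):	In recent years, deep learning-based object detection, which aims to accurately recognize and localize specific targets in images, has achieved remarkable results. Such methods usually depend on large numbers of labeled training samples for every object category to ensure detection performance. In practical applications, however, sufficient annotated data is often hard to obtain. To address this, researchers have proposed deep learning-based few-shot object detection, which aims to classify and precisely localize objects in images using limited annotated data. It compensates for the shortcomings of current object detection algorithms and has considerable research value. Targeting the low recognition accuracy and imprecise localization of few-shot object detection algorithms, this thesis proposes the ARP-FSOD algorithm on the basis of FSCE. The main research contents are as follows:
(1) To address insufficient feature extraction with few samples and the low quality of the candidate boxes produced by the region proposal network (RPN), AR-FSOD, a few-shot object detection method based on multiple attention mechanisms, is proposed. First, an improved feature extraction network, SE-Res2Net, is proposed. It extracts fine-grained multi-scale image features through channel grouping to obtain receptive fields of several granularities, and introduces a channel optimization module to improve channel correlation. Second, a hybrid attention module is introduced between the feature extraction network and the RPN to refine the feature representations of the support set and the query set. The refined features are fed into the attention RPN to generate more accurate candidate boxes, improving localization ability and detection accuracy. Experiments on the Pascal VOC dataset show that, compared with FSCE, the method obtains candidate boxes that are more relevant to the target category and improves the final detection accuracy.
(2) To address under-utilized information and classification confusion in few-shot object detection, ARP-FSOD, a few-shot object detection method based on a weighted class prototype branch, is proposed, building on the preceding work. First, an improved dual-path cross-layer feature pyramid network, PRFPN, is proposed. It introduces a cross-layer feature fusion mechanism so that deep and shallow features are fused more effectively. Second, a weighted class prototype branch is introduced. Following the idea of class prototype metrics, it applies feature weighting to the class prototypes, and a weighted prototype contrastive loss pulls the features of same-class samples into tighter clusters, improving classification accuracy. Experiments on the Pascal VOC and MSCOCO datasets show that these improvements correct the class confusion problem: compared with FSCE, the method improves detection accuracy by 2.77% on average on Pascal VOC and by 1.8% on average on MSCOCO.
(3) ARP-FSOD is applied to endangered species detection, where the scarcity of samples leads to poor detection performance. The generalization ability and practical value of ARP-FSOD are verified on EAD, a self-built endangered species dataset. The results show that the algorithm not only improves detection accuracy but also effectively reduces missed detections.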
Abstract (English):	In recent years, deep learning-based object detection methods, which aim to accurately identify and localize specific targets in images, have achieved significant results. Such methods usually rely on large-scale labeled training samples for each object category to ensure detection performance. In practical application scenarios, however, it is usually difficult to obtain sufficient labeled data. To address this problem, researchers have proposed few-shot object detection based on deep learning, which aims to effectively classify and accurately localize objects in images using limited labeled data. It makes up for the shortcomings of current object detection algorithms and is of great research value. To address the low recognition accuracy and imprecise localization of few-shot object detection algorithms, this thesis proposes the ARP-FSOD algorithm on the basis of the FSCE algorithm. The main research contents are as follows:
(1) To address insufficient feature extraction when only a small number of samples is available and the low quality of the candidate boxes produced by the region proposal network (RPN), AR-FSOD, a few-shot object detection method based on multiple attention mechanisms, is proposed. First, an improved feature extraction network, SE-Res2Net, is proposed. It extracts fine-grained multi-scale image features by channel grouping to obtain multiple receptive fields of different granularities, and a channel optimization module is introduced to improve channel correlation. Second, a hybrid attention module is introduced between the feature extraction network and the RPN to optimize the feature representations of the support set and the query set. The optimized features are fed into the attention RPN to generate more accurate candidate boxes, improving localization ability and detection accuracy. Experimental results on the Pascal VOC dataset show that, compared with FSCE, the proposed method obtains candidate boxes that are more relevant to the target category and improves the final detection accuracy.
(2) To address the under-utilization of information and the classification confusion in few-shot object detection, ARP-FSOD, a few-shot object detection method based on a weighted category prototype branch, is proposed, building on the preceding work. First, an improved dual-path cross-layer feature pyramid network, PRFPN, is proposed. It introduces a cross-layer feature fusion mechanism to fuse deep features with shallow features more effectively. Second, a weighted category prototype branch is introduced. Following the idea of category prototype metrics, it weights the features of the category prototypes, and a weighted prototype contrastive loss function clusters the features of same-class samples more tightly, improving classification accuracy. Experimental results on the Pascal VOC and MSCOCO datasets show that these improvements correct the category confusion problem. Compared with FSCE, the proposed method improves detection accuracy by 2.77% on average on the Pascal VOC dataset and by 1.8% on average on the MSCOCO dataset.
(3) The ARP-FSOD algorithm is applied to endangered species detection, where the scarcity of samples leads to poor detection results. The generalization ability and application value of ARP-FSOD are verified on the self-constructed endangered species dataset EAD. The results show that the proposed algorithm not only improves detection accuracy but also effectively reduces missed detections.
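Note (illustration, not from the thesis): the abstract does not give the formula for the weighted prototype contrastive loss, so the following is only a minimal sketch of the general idea it describes, namely building class prototypes as weighted means of features and pulling each feature toward its own class prototype. The function names, the per-sample `weights`, the cosine similarity, and the temperature `tau` are all assumptions made for this example and may differ from the loss actually used in ARP-FSOD.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def weighted_prototypes(features, labels, weights):
    """Hypothetical helper: one prototype per class, computed as the weighted
    mean of the L2-normalized features of that class. Weights must be positive."""
    sums, totals = {}, {}
    for f, y, w in zip(features, labels, weights):
        f = l2_normalize(f)
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, x in enumerate(f):
            acc[i] += w * x
        totals[y] = totals.get(y, 0.0) + w
    return {y: l2_normalize([x / totals[y] for x in acc]) for y, acc in sums.items()}

def prototype_contrastive_loss(features, labels, prototypes, tau=0.2):
    """Assumed InfoNCE-style loss: softmax cross-entropy over the cosine
    similarities between each feature and every class prototype."""
    total = 0.0
    for f, y in zip(features, labels):
        f = l2_normalize(f)
        logits = {c: sum(a * b for a, b in zip(f, p)) / tau
                  for c, p in prototypes.items()}
        m = max(logits.values())  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += log_denom - logits[y]
    return total / len(features)

# Toy usage with made-up 2-D features for two classes.
feats = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
labels = ["cat", "cat", "dog", "dog"]
weights = [1.0, 0.5, 1.0, 0.5]  # hypothetical per-sample weights
protos = weighted_prototypes(feats, labels, weights)
loss = prototype_contrastive_loss(feats, labels, protos)
```

In this toy setup the loss is small when features sit close to their own class prototype and grows when the labels are swapped, which is the clustering behaviour the abstract attributes to the prototype branch.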
Chinese Library Classification (CLC) number:	TP391.41
Release date:	2024-06-14
