Thesis Title (Chinese): | Research on Image Annotation Algorithm Based on Convolutional Neural Network |
Name: | |
Student ID: | 18208088014 |
Confidentiality Level: | Public |
Thesis Language: | Chinese |
Discipline Code: | 083500 |
Discipline: | Engineering - Software Engineering |
Student Type: | Master's |
Degree Level: | Master of Engineering |
Degree Year: | 2021 |
Degree-Granting Institution: | Xi'an University of Science and Technology |
Department: | |
Major: | |
Research Direction: | Artificial Intelligence and Information Processing |
First Supervisor: | |
First Supervisor's Institution: | |
Thesis Submission Date: | 2021-06-21 |
Thesis Defense Date: | 2021-06-03 |
Thesis Title (English): | Research on Image Annotation Algorithm Based on Convolutional Neural Network |
Keywords (Chinese): | |
Keywords (English): | Automatic Image Annotation; Deep Learning; Convolutional Neural Network; Feature Fusion; Generative Adversarial Networks |
Abstract (Chinese): |
With the spread of smartphones, home computers and other digital devices and the development of communication technology, images and other visual data can be found everywhere on Internet sharing platforms. To manage and use these data effectively, researchers proposed image retrieval technology. Owing to technical limitations and user habits, search engines provide keyword-based image retrieval, which requires images to be annotated with keywords in advance; the time and labor cost of purely manual annotation is prohibitive, so automatic image annotation technology has developed rapidly. Traditional automatic image annotation algorithms suffer from complex models, poor generalization and low annotation accuracy. This thesis therefore proposes two automatic image annotation algorithms based on convolutional neural networks. The main work is as follows:

(1) To address the low annotation accuracy of small-scale objects in images and the class imbalance of annotations, an image annotation method that fuses multi-scale features with cost-sensitive learning is proposed. The method adjusts the VGG16 network structure and adds a feature fusion module, which consists of multi-scale feature extraction and feature fusion: the extraction part obtains multi-scale features from the convolutional features, and the fusion part fuses them adaptively during training. On the basis of the multi-label loss function, a cost-sensitive multi-label loss function is also proposed. Experiments show that this algorithm improves annotation performance on low-frequency labels while preserving performance on high-frequency labels.

(2) To address insufficient training samples and class imbalance in image annotation datasets, an image annotation method based on dual convolutional neural networks is designed. First, an image augmentation method based on generative adversarial networks is proposed and combined with traditional augmentation methods to alleviate the shortage of training samples. Second, the convolutional neural network structure is improved by introducing deformable convolution and filter pooling to strengthen the annotation of objects at different scales. Finally, the data are split into the full dataset and a low-frequency-label subset, two convolutional neural network models are trained independently on them, and an annotation-result fusion module combines the two models' outputs; the model trained on the low-frequency subset is better suited to low-frequency labels, which reduces the impact of class imbalance on them. Experiments show that the dual-model algorithm improves image annotation accuracy. |
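The abstract describes contribution (1) only in prose. The following is a minimal PyTorch sketch of the general idea it outlines: multi-scale features taken from several VGG16 stages, fused with learnable weights, and trained with a frequency-weighted ("cost-sensitive") multi-label loss. The stage split, the fusion scheme and the inverse-frequency weighting are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class MultiScaleVGG(nn.Module):
    """VGG16 backbone with features taken from three stages and fused adaptively."""

    def __init__(self, num_labels: int):
        super().__init__()
        vgg = models.vgg16(weights=None).features
        self.stage1 = vgg[:17]    # through conv3_3 + pool, 256 channels
        self.stage2 = vgg[17:24]  # through conv4_3 + pool, 512 channels
        self.stage3 = vgg[24:]    # through conv5_3 + pool, 512 channels
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Learnable scalar weights let the network decide how much each scale contributes.
        self.fusion_weights = nn.Parameter(torch.ones(3))
        self.proj = nn.ModuleList([nn.Linear(c, 512) for c in (256, 512, 512)])
        self.classifier = nn.Linear(512, num_labels)

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        feats = [self.pool(f).flatten(1) for f in (f1, f2, f3)]
        w = torch.softmax(self.fusion_weights, dim=0)
        fused = sum(w[i] * self.proj[i](feats[i]) for i in range(3))
        return self.classifier(fused)           # raw logits, one per label


def cost_sensitive_multilabel_loss(logits, targets, label_freq):
    """Weighted multi-label BCE: positives of rare labels are penalised more heavily."""
    pos_weight = 1.0 / label_freq.clamp(min=1e-6)   # assumed weighting: inverse label frequency
    pos_weight = pos_weight / pos_weight.mean()     # normalise so weights average to 1
    return nn.functional.binary_cross_entropy_with_logits(
        logits, targets, pos_weight=pos_weight)
```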
Abstract (English): |
With the popularity of smartphones, home computers and other digital devices and the development of communication technology, visual data such as images can be seen everywhere on Internet sharing platforms. In order to manage and use them effectively, researchers put forward image retrieval technology. Due to technical limitations and user habits, search engines provide keyword-based image retrieval. This retrieval method requires images to be annotated with keywords in advance, but the time and labor cost of purely manual annotation is prohibitive, so automatic image annotation technology has developed rapidly. Because traditional automatic image annotation algorithms suffer from complex models, poor generalization performance and low annotation accuracy, this paper proposes two automatic image annotation algorithms based on convolutional neural networks. The main work is as follows:

(1) Aiming at the low annotation accuracy of small-scale objects in images and the imbalance of annotation categories, an image annotation method fusing multi-scale features and cost-sensitive learning is proposed. This method adjusts the network structure of VGG16 and adds a feature fusion module, which is divided into a multi-scale feature extraction module and a feature fusion module. The multi-scale feature extraction module extracts multi-scale features from the convolutional features, and the feature fusion module fuses them adaptively during network learning. Based on the multi-label loss function, a cost-sensitive multi-label loss function is proposed. Experimental results show that the proposed algorithm improves the annotation performance on low-frequency labels while maintaining the annotation performance on high-frequency labels.

(2) To solve the problems of insufficient training samples and unbalanced annotation categories in image annotation datasets, an image annotation method based on dual convolutional neural networks is designed. First, an image augmentation method based on generative adversarial networks is proposed and combined with traditional augmentation methods to address the shortage of training samples. Second, the convolutional neural network structure is improved by introducing deformable convolution and filter pooling to enhance the ability to annotate objects of different scales. Finally, the dataset is divided into the full dataset and a low-frequency-label subset, two convolutional neural network models are trained independently on them, and an annotation-result fusion module is designed to fuse the outputs of the two models; the model trained on the low-frequency subset is better suited to low-frequency labels, which reduces the impact of class imbalance on them. Experiments show that the image annotation algorithm based on the dual convolutional neural network model improves the accuracy of image annotation. |
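For contribution (2), the sketch below illustrates the dual-model idea: a deformable-convolution building block (using torchvision.ops.DeformConv2d), and a fusion step that combines the sigmoid scores of one model trained on the full dataset with those of a second model trained on the low-frequency-label subset. The per-label maximum fusion rule and all names here are assumptions for illustration, not the thesis's implementation.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DeformBlock(nn.Module):
    """Deformable-convolution block: sampling offsets are predicted from the input itself."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Two offsets (x, y) per position of the 3x3 kernel -> 18 offset channels.
        self.offset = nn.Conv2d(in_ch, 2 * 3 * 3, kernel_size=3, padding=1)
        self.deform = DeformConv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        return self.deform(x, self.offset(x))


@torch.no_grad()
def fuse_annotations(model_full, model_rare, images, rare_label_ids, top_k=5):
    """Fuse the predictions of two independently trained annotators.

    model_full     -- CNN trained on the whole training set
    model_rare     -- CNN trained on the low-frequency-label subset
    rare_label_ids -- indices of the low-frequency labels (known from the data split)
    """
    scores_full = torch.sigmoid(model_full(images))
    scores_rare = torch.sigmoid(model_rare(images))
    fused = scores_full.clone()
    # On the rare labels, trust whichever model is more confident.
    fused[:, rare_label_ids] = torch.maximum(
        scores_full[:, rare_label_ids], scores_rare[:, rare_label_ids])
    return fused.topk(top_k, dim=1).indices     # predicted label ids per image
```

The per-label maximum is only one plausible fusion choice; a weighted average or a learned fusion layer would fit the same overall data flow.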
CLC Number: | TP301.6 |
Open Access Date: | 2021-06-21 |