Thesis title (Chinese): | 基于深度学习的图像修复算法研究 |
Name: | |
Student ID: | 20208223053 |
Confidentiality level: | Public |
Thesis language: | Chinese (chi) |
Discipline code: | 085400 |
Discipline: | Engineering - Electronic Information |
Student type: | Master's |
Degree: | Master of Engineering |
Degree year: | 2023 |
Institution: | Xi'an University of Science and Technology |
School/Department: | |
Major: | |
Research direction: | Digital Image Processing |
First supervisor: | |
First supervisor's institution: | |
Submission date: | 2023-06-15 |
Defense date: | 2023-06-05 |
Thesis title (English): | Research on Image Inpainting Method Based on Deep Learning |
Keywords (Chinese): | 图像修复; 门控卷积; 自注意力机制; 双阶段修复网络; Transformer |
Keywords (English): | Image Inpainting; Gated Convolution; Self-attention Mechanism; Two-stage Inpainting Network; Transformer |
Abstract (Chinese): |
图像是目前人类用于传递和获取信息频率最高的媒介之一,图像信息的破损会影响对信息的获取以及后续处理。破损图像修复是计算机视觉中的一个重要研究领域,它的目标是修复破损图像中的缺损信息。近年来,随着深度学习的兴起,基于深度学习的图像修复方法取得了显著效果,但是在一些情况下,修复图像仍然存在着模糊失真、语义信息不匹配和训练缓慢等问题。为了解决以上问题,本文的研究内容如下:

(1) 针对目前轻量化图像修复方法存在的语义信息缺失、修复内容不匹配和训练缓慢的问题,本文提出一种基于门控卷积和自注意力机制的金字塔图像修复方法。首先,该方法以U型网络为基础,融合门控卷积,改变特征的提取策略,减少对冗余信息的计算,提高模型的计算效率。其次,设计自注意力机制模块和注意力转移模块,更有效地引导高阶语义特征与图像信息之间的转换过程,减少网络中长距离导致的信息损耗。最后,设计并增加内容损失、感知损失和金字塔损失,增强网络的学习速度和能力,生成和真实图像相近的数据分布。实验结果表明基于门控卷积和自注意力机制的金字塔图像修复方法修复结果语义更完整、内容更匹配,模型训练速度更快。

(2) 针对应用于高质量图像修复的方法中存在的修复后图像语义信息和边缘一致性较差、图像清晰度低、细节丢失和模型训练缓慢的问题,本文提出一种基于自适应Transformer的高质量图像修复方法。首先,设计双阶段生成器网络,第一阶段是结合自注意力机制的卷积神经网络,第二阶段采用基于Transformer的生成器模型。通过使用双阶段网络,增强模型的修复能力。其次,设计使用自适应多头自注意力机制,增加对核心特征区域的关注度,加快模型的前向传播速度。最后,融合内容损失、感知损失和金字塔损失作为生成器网络的损失函数,提高模型的训练速度和学习精度。实验结果表明使用基于自适应Transformer的高质量图像修复方法增加了修复图像的清晰度和纹理细节,语义信息和边缘一致性更高,模型的训练速度更快。 |
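Note: contribution (1) builds on gated convolution as its basic building block. As a point of reference only, the sketch below shows the standard gated-convolution formulation (a feature branch modulated by a learned sigmoid gate, in the style of Yu et al.'s free-form inpainting work); the layer shapes and activation choice are illustrative assumptions, not details taken from the thesis.

```python
import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    """Gated convolution: a feature branch modulated by a learned soft gate.

    Minimal sketch of the standard formulation; the thesis's exact variant
    may differ in activation and normalization choices.
    """
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.feature = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)
        self.gate = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x):
        # The gate in [0, 1] decides, per pixel and channel, how much of the
        # feature response passes through, so responses from masked or
        # redundant regions can be suppressed.
        return self.act(self.feature(x)) * torch.sigmoid(self.gate(x))
```

In contribution (1), layers of this kind would replace the plain convolutions of the U-shaped encoder-decoder; because the gate is learned end to end, the mask update is dynamic rather than rule-based as in partial convolution.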
Abstract (English): |
Images are among the media people use most frequently to convey and acquire information, and damage to an image degrades both the information it carries and any subsequent processing. Image inpainting is an important research area in computer vision whose goal is to restore the missing content of damaged images. In recent years, deep learning-based inpainting methods have achieved remarkable results; in some cases, however, the restored images still suffer from blurring, distortion, semantic mismatch, and slow training. To address these issues, this thesis makes the following contributions:

(1) To address the semantic information loss, content mismatch, and slow training of current lightweight image inpainting methods, this thesis proposes a pyramid image inpainting method based on gated convolution and a self-attention mechanism. First, the method builds on a U-shaped network and incorporates gated convolution, changing the feature extraction strategy to reduce computation on redundant information and improve the model's computational efficiency. Second, a self-attention module and an attention transfer module are designed to guide the transformation between high-level semantic features and image information more effectively, reducing the information loss incurred over long-range paths in the network. Finally, content, perceptual, and pyramid losses are designed and added to raise the network's learning speed and capacity, so that the generated data distribution approaches that of real images. Experimental results show that the proposed method yields semantically more complete and better-matched inpainting results and trains faster.

(2) To address the poor semantic and edge consistency, low image clarity, detail loss, and slow training of existing high-quality image inpainting methods, this thesis proposes a high-quality image inpainting method based on an adaptive Transformer. First, a two-stage generator network is designed: the first stage is a convolutional neural network combined with a self-attention mechanism, and the second stage is a Transformer-based generator; together, the two stages strengthen the model's inpainting capability. Second, an adaptive multi-head self-attention mechanism is employed to increase the focus on core feature regions and accelerate forward propagation. Finally, the content, perceptual, and pyramid losses are combined as the generator's loss function, improving training speed and learning accuracy. Experimental results show that the method increases the clarity and texture detail of inpainted images, achieves better semantic and edge consistency, and trains faster. |
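Note: both contributions train the generator with the same three-part objective (content, perceptual, and pyramid loss). The sketch below shows one common way to realize such a combination; the loss weights, the VGG16 feature cut-off, and the use of L1 distances are assumptions made for illustration, not values from the thesis.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

# Hypothetical loss weights; the thesis's actual values are not given here.
W_CONTENT, W_PERC, W_PYR = 1.0, 0.1, 0.5

# Frozen VGG16 feature extractor for the perceptual loss (up to relu3_3).
_vgg = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features[:16].eval()
for p in _vgg.parameters():
    p.requires_grad_(False)

def content_loss(pred, target):
    # Pixel-wise L1 distance between the inpainted result and ground truth.
    return F.l1_loss(pred, target)

def perceptual_loss(pred, target):
    # L1 distance measured in a pretrained VGG16 feature space.
    return F.l1_loss(_vgg(pred), _vgg(target))

def pyramid_loss(coarse_outputs, target):
    # Multi-scale L1: each coarse output is compared against the ground
    # truth downsampled to the same resolution.
    total = 0.0
    for out in coarse_outputs:
        t = F.interpolate(target, size=out.shape[-2:], mode="bilinear",
                          align_corners=False)
        total = total + F.l1_loss(out, t)
    return total / len(coarse_outputs)

def generator_loss(outputs, target):
    # outputs: generator predictions from coarse to fine;
    # outputs[-1] is the full-resolution result.
    return (W_CONTENT * content_loss(outputs[-1], target)
            + W_PERC * perceptual_loss(outputs[-1], target)
            + W_PYR * pyramid_loss(outputs[:-1], target))
```

In such a setup the pyramid term supervises the coarse intermediate outputs, while the content and perceptual terms constrain the final result at the pixel and feature level, respectively.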
CLC number: | TP391 |
Open-access date: | 2023-06-15 |