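Chinese thesis title: | 基于深度神经网络的图像修复算法研究 |
Name: | |
Student ID: | 21208088026 |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 083500 |
Discipline name: | Engineering - Software Engineering |
Student type: | Master's |
Degree level: | Master of Engineering |
Degree year: | 2024 |
Training institution: | Xi'an University of Science and Technology (西安科技大学) |
School/Department: | |
Major: | |
Research direction: | Image processing |
First supervisor name: | |
First supervisor affiliation: | |
Thesis submission date: | 2024-06-17 |
Thesis defense date: | 2024-05-30 |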
Foreign-language thesis title: | Research on Image Inpainting Algorithm Based on Deep Neural Network |
Chinese keywords: | |
Foreign-language keywords: | Image Inpainting ; Adaptive Consistent Attention ; Fast Fourier Convolution ; Progressive Generative Adversarial Network ; Multi-level Feature Aggregation Network |
Chinese abstract: |
图像作为人类获取、传递以及保存信息的重要媒介之一,在众多领域被广泛应用。然而,由于存储和传输等因素影响,图像常常会出现破损,从而影响对其中信息的获取及后续研究。因此,图像修复已成为图像处理领域中备受关注的分支。图像修复是根据图像的已知信息来对缺失信息进行恢复,填充图像缺失像素,以实现整体语义结构一致性和视觉效果的真实性。目前基于深度学习的图像修复方法已经取得了不错的成绩,但是对于一些缺失面积较大的图像和背景结构复杂的高分辨率图像修复效果仍然存在语义真实性和一致性较差的问题。基于此,论文分别从有效提取特征信息和获取高感受野两个方面展开研究,主要包括以下内容: (1)针对目前不规则掩码的修复结果缺乏边缘一致性和语义正确性的问题,论文设计了一种渐进式生成对抗网络,该网络由边缘向中心进行递归修复,将修复好的特征作为下一次特征生成的条件,逐渐加强对中心内容的约束。为了获取远处信息以及考虑到直接在渐进修复网络中使用注意力可能出现不同递归下特征映射之间不一致的问题,论文设计了一种自适应一致性注意力模块,该模块自适应结合不同递归得到的分数从而捕获更多的特征信息。SN-PatchGAN判别器直接计算输出图上每个点的铰链损失,广泛关注到不同位置以及不同语义。实验结果表明,论文的方法在不规则掩码图像修复任务中表现良好,修复结果在边缘一致性、语义正确性、图像整体结构等方面都有更佳的表现,相比其他算法,分别在PSNR和SSIM指标上平均改进了2.21dB和0.147。 (2)针对高分辨率图像和具有复杂结构的图像存在修复结果不合理或者模糊的问题,论文设计了一种多级特征聚合网络,该网络从不同扩张率的卷积中提取特征,使得网络可以获得更多特征信息,从而恢复更合理的图像缺失内容。在生成网络中设计使用快速傅里叶卷积,使生成器在早期层考虑全局信息,获取全局感受野,从而有利于高分辨率图像修复任务。为了更好的训练生成网络,使用自导回归损失加大对缺失区域的惩罚力度,增强语义细节。实验结果表明,论文设计的方法在几何结构复杂和高分辨率图像修复任务中表现良好,可以提供更合理和清晰的修复结果。与其他几种经典的图像修复方法相比,分别在PSNR、SSIM和LPIPS指标上平均改进了2.69dB、0.042和0.022。 综上所述,论文设计的方法在CelebA、Places2等多个公共数据集上表现良好,渐进式生成对抗网络针对大面积缺失图像,逐步丰富缺失区域信息,修复结果语义明确,结构合理。多级特征聚合网络利用不同扩张率卷积感受野范围的差距,构建具有更大感受野的网络,对高分辨率复杂图像泛化效果良好。在后续研究中,论文将从平衡网络复杂度和视频修复等方面进行更广泛、更深入的尝试。 |
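The edge-to-centre recursion described in contribution (1) of the abstract can be illustrated with a toy mask schedule. The sketch below is not the thesis's network and generates no pixel content. Under the assumptions of a binary mask (1 marks a missing pixel) and 4-connected neighbourhoods, it only shows which missing pixels border known content at each recursion, so that the missing region shrinks toward its centre. The function names `boundary_ring` and `progressive_schedule` are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 1 = missing pixel, 0 = known pixel


def boundary_ring(mask: Mask) -> List[Tuple[int, int]]:
    """Return missing pixels that touch at least one known pixel (4-connectivity)."""
    h, w = len(mask), len(mask[0])
    ring = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # Positions outside the image are not treated as known content.
                if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] == 0:
                    ring.append((y, x))
                    break
    return ring


def progressive_schedule(mask: Mask) -> List[List[Tuple[int, int]]]:
    """List the pixels marked as filled at each recursion, outermost ring first.

    If the mask contains no known pixel at all, no ring exists and the
    returned schedule is empty.
    """
    current = [row[:] for row in mask]  # work on a copy, leave the input intact
    steps = []
    while True:
        ring = boundary_ring(current)
        if not ring:
            break
        for y, x in ring:
            current[y][x] = 0  # treat the pixel as known in later recursions
        steps.append(ring)
    return steps


if __name__ == "__main__":
    # 5x5 image with a 3x3 missing block in the middle.
    mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)]
            for y in range(5)]
    for i, ring in enumerate(progressive_schedule(mask), start=1):
        print(f"recursion {i}: {len(ring)} pixel(s)")
```

In a real progressive network each ring would be filled with generated features that then condition the next recursion; here the schedule alone is shown.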
Foreign-language abstract: |
Images are one of the principal media through which people acquire, transmit, and preserve information, and they are widely used in many fields. Storage and transmission, however, often damage images, which hinders both the retrieval of the information they carry and subsequent research. Image inpainting has therefore become an actively studied branch of image processing. Inpainting recovers missing information from the known content of an image, filling in the missing pixels so that the result is consistent in its overall semantic structure and visually realistic. Deep-learning-based inpainting methods have achieved good results, but for images with large missing areas and for high-resolution images with complex background structures, the results still suffer from poor semantic plausibility and consistency. This thesis therefore studies two aspects, the effective extraction of feature information and the acquisition of a large receptive field, and makes the following contributions:
(1) To address the lack of edge consistency and semantic correctness in inpainting results for irregular masks, the thesis designs a progressive generative adversarial network. The network inpaints recursively from the edges of the missing region toward its centre, uses the features restored in one recursion as the condition for generating the next, and thereby gradually strengthens the constraints on the central content. Attention is needed to capture distant information, but applying it directly in a progressive inpainting network can make the feature maps of different recursions inconsistent. The thesis therefore designs an adaptive consistent attention module that adaptively combines the attention scores obtained in different recursions to capture more feature information. The SN-PatchGAN discriminator computes the hinge loss directly at every point of its output map, so that it attends broadly to different locations and different semantics. Experimental results show that the method performs well on irregular-mask inpainting, with better edge consistency, semantic correctness, and overall image structure; compared with other algorithms, it improves PSNR and SSIM by an average of 2.21 dB and 0.147, respectively.
(2) To address unreasonable or blurred inpainting results for high-resolution images and images with complex structures, the thesis designs a multi-level feature aggregation network. The network extracts features from convolutions with different dilation rates, which gives it access to more feature information and lets it recover more plausible missing content. Fast Fourier Convolution is used in the generator so that its early layers already take global information into account and obtain a global receptive field, which benefits high-resolution inpainting. To train the generator more effectively, a self-guided regression loss increases the penalty on missing regions and enhances semantic detail. Experimental results show that the method performs well on geometrically complex and high-resolution images and produces more plausible and sharper results. Compared with several classical inpainting methods, it improves the PSNR, SSIM, and LPIPS metrics by an average of 2.69 dB, 0.042, and 0.022, respectively.
In summary, the methods designed in this thesis perform well on several public datasets, including CelebA and Places2. The progressive generative adversarial network targets images with large missing areas: it gradually enriches the information in the missing region and produces results that are semantically clear and structurally sound. The multi-level feature aggregation network exploits the differences in receptive-field size among convolutions with different dilation rates to build a network with a larger receptive field, and it generalizes well to complex high-resolution images. Future work will explore balancing network complexity and extending the approach to video inpainting. |
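The abstract's claim that convolutions with different dilation rates differ in receptive-field size can be checked with simple arithmetic. The sketch below assumes stride-1 convolutions, for which a layer with kernel size k and dilation d widens the receptive field by d * (k - 1). The kernel sizes and dilation rates in the example are illustrative assumptions, not values taken from the thesis, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def effective_kernel(kernel: int, dilation: int) -> int:
    """Span of one dilated kernel along one axis: d * (k - 1) + 1."""
    return dilation * (kernel - 1) + 1


def receptive_field(layers: Iterable[Tuple[int, int]]) -> int:
    """Receptive field along one axis of stacked stride-1 convolutions.

    `layers` is a sequence of (kernel_size, dilation) pairs.
    """
    rf = 1  # a single pixel sees only itself
    for kernel, dilation in layers:
        rf += dilation * (kernel - 1)
    return rf


if __name__ == "__main__":
    # Four plain 3x3 layers versus four 3x3 layers with growing dilation
    # (assumed rates 1, 2, 4, 8; the thesis's actual rates are not stated).
    plain = [(3, 1)] * 4
    dilated = [(3, 1), (3, 2), (3, 4), (3, 8)]
    print("plain stack:", receptive_field(plain))
    print("dilated stack:", receptive_field(dilated))
    # Parallel branches with different rates each see a different span,
    # which is what a multi-level aggregation step would combine.
    for d in (1, 2, 4, 8):
        print(f"single 3x3 branch, dilation {d}: span {effective_kernel(3, d)}")
```

With the same number of parameters per layer, the dilated stack covers a much wider span than the plain one, which is the motivation the abstract gives for aggregating branches with different dilation rates.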
Chinese Library Classification number: | TP391 |
Open access date: | 2024-06-17 |