Thesis Information

Chinese title:

基于深度神经网络的图像修复算法研究 (Research on Image Inpainting Algorithms Based on Deep Neural Networks)

Author:

胡柳青 (Hu Liuqing)

Student ID:

21208088026

Confidentiality level:

Public

Language:

Chinese (chi)

Discipline code:

083500

Discipline:

Engineering - Software Engineering

Student type:

Master's candidate

Degree:

Master of Engineering

Degree year:

2024

Degree-granting institution:

西安科技大学 (Xi'an University of Science and Technology)

School:

College of Computer Science and Technology

Major:

Software Engineering

Research direction:

Image processing

First supervisor:

李洪安 (Li Hong'an)

First supervisor's institution:

西安科技大学 (Xi'an University of Science and Technology)

Submission date:

2024-06-17

Defense date:

2024-05-30

English title:

Research on Image Inpainting Algorithm Based on Deep Neural Network

Chinese keywords:

image inpainting; adaptive consistent attention; Fourier convolution; progressive generative adversarial network; multi-level feature aggregation network

English keywords:

Image Inpainting; Adaptive Consistent Attention; Fast Fourier Convolution; Progressive Generative Adversarial Network; Multi-level Feature Aggregation Network

Chinese abstract:

As one of the most important media through which humans acquire, transmit, and preserve information, images are widely used in many fields. However, owing to factors such as storage and transmission, images are often damaged, which hinders the extraction of the information they carry and subsequent research. Image inpainting has therefore become a branch of image processing that receives considerable attention. Image inpainting recovers missing information from the known content of an image, filling in the missing pixels to achieve overall semantic-structural consistency and visual realism. Deep-learning-based inpainting methods have achieved good results, but for images with large missing areas and for high-resolution images with complex background structures, the restored content still suffers from poor semantic realism and consistency. This thesis therefore studies two aspects, effective feature extraction and obtaining a large receptive field, with the following main contributions:

(1) To address the lack of edge consistency and semantic correctness in current inpainting of irregularly masked images, the thesis designs a progressive generative adversarial network that restores an image recursively from the boundary toward the center, taking the features restored in each recursion as the condition for generating the next and gradually strengthening the constraints on the central content. To capture distant information, and because applying attention directly in a progressive inpainting network may produce inconsistent feature maps across recursions, the thesis designs an adaptive consistent attention module that adaptively combines the attention scores obtained in different recursions to capture more feature information. An SN-PatchGAN discriminator computes the hinge loss directly at every point of its output map, attending broadly to different locations and different semantics. Experiments show that the method performs well on irregular-mask inpainting: the results are better in edge consistency, semantic correctness, and overall image structure, improving PSNR and SSIM over other algorithms by an average of 2.21 dB and 0.147, respectively.

(2) To address implausible or blurry results on high-resolution images and images with complex structures, the thesis designs a multi-level feature aggregation network that extracts features from convolutions with different dilation rates, allowing the network to gather more feature information and recover more plausible missing content. Fast Fourier convolutions are used in the generator so that even its early layers consider global information and obtain a global receptive field, which benefits high-resolution inpainting. To train the generator better, a self-guided regression loss increases the penalty on missing regions and enhances semantic detail. Experiments show that the method performs well on geometrically complex and high-resolution images and produces more plausible, sharper results; compared with several classical inpainting methods, it improves PSNR, SSIM, and LPIPS by an average of 2.69 dB, 0.042, and 0.022, respectively.

In summary, the proposed methods perform well on several public datasets, including CelebA and Places2. The progressive generative adversarial network targets images with large missing areas, gradually enriching the information in the missing region; its results are semantically clear and structurally sound. The multi-level feature aggregation network exploits the differing receptive fields of convolutions with different dilation rates to build a network with a larger receptive field, and generalizes well to high-resolution complex images. Future work will explore balancing network complexity against quality and extending the approach to video inpainting.

English abstract:

As one of the important media through which human beings acquire, transmit, and preserve information, images are widely used in many fields. However, due to storage and transmission factors, images are often damaged, which affects the acquisition of the information they carry and subsequent research. Image inpainting has therefore become a branch of image processing that attracts much attention. Image inpainting recovers the missing information of an image from its known information and fills in the missing pixels to achieve consistency of the overall semantic structure and authenticity of the visual effect. Image inpainting methods based on deep learning have achieved good results, but for images with large missing areas and high-resolution images with complex background structures, the inpainting results still suffer from poor semantic authenticity and consistency. Based on this, the thesis carries out research on two aspects, effectively extracting feature information and obtaining a large receptive field, and mainly includes the following contents:

(1) Aiming at the lack of edge consistency and semantic correctness in current inpainting results for irregular masks, the thesis designs a progressive generative adversarial network, which carries out recursive inpainting from the edges to the center, takes the restored features as the condition for the next feature generation, and gradually strengthens the constraints on the center content. To capture distant information, and considering that using attention directly in a progressive inpainting network may cause inconsistency between feature maps under different recursions, the thesis designs an adaptive consistent attention module, which adaptively combines the scores obtained from different recursions to capture more feature information. The SN-PatchGAN discriminator directly computes the hinge loss at each point of its output map, so that attention is paid extensively to different locations as well as different semantics. Experimental results show that the method performs well on the irregular-mask image inpainting task; the inpainting results are better in edge consistency, semantic correctness, and overall image structure, and compared with other algorithms, the method improves the PSNR and SSIM metrics by an average of 2.21 dB and 0.147, respectively.
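As a concrete illustration of the per-point hinge loss described above, the following is a minimal sketch, assuming PyTorch; the tensor shapes and the function name are illustrative, not taken from the thesis.

```python
import torch

def sn_patchgan_hinge_loss(d_real, d_fake):
    """Hinge loss evaluated at every point of the discriminator's
    output score maps (shape (N, C, H, W)) and then averaged,
    in the style of SN-PatchGAN."""
    # Discriminator objective: push real scores above +1, fake scores below -1.
    loss_d = torch.mean(torch.relu(1.0 - d_real)) + \
             torch.mean(torch.relu(1.0 + d_fake))
    # Generator objective: raise the scores of generated patches.
    loss_g = -torch.mean(d_fake)
    return loss_d, loss_g
```

Because the loss is averaged over every spatial position of the score map, each location acts as a local critic, which is what lets the discriminator attend to different positions and different semantics.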

(2) Aiming at the problem of unreasonable or blurred inpainting results for high-resolution images and images with complex structures, the thesis designs a multi-level feature aggregation network, which extracts features from convolutions with different dilation rates, enabling the network to obtain more feature information and recover more reasonable missing content. Fast Fourier convolution is used in the generative network so that the generator considers global information in its early layers and acquires a global receptive field, which facilitates high-resolution image inpainting. For better training of the generative network, a self-guided regression loss increases the penalty on missing regions and enhances semantic detail. Experiments show that the method performs well on geometrically complex and high-resolution image inpainting tasks, providing more reasonable and clearer results. Compared with several classical image inpainting methods, average improvements of 2.69 dB, 0.042, and 0.022 are achieved on the PSNR, SSIM, and LPIPS metrics, respectively.
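The global receptive field of a fast Fourier convolution comes from applying a pointwise convolution in the frequency domain, where every coefficient depends on all input pixels. A minimal sketch of that spectral branch, assuming PyTorch (the class name and channel layout are illustrative, not the thesis's implementation):

```python
import torch
import torch.nn as nn

class SpectralTransform(nn.Module):
    """Sketch of the spectral branch of a fast Fourier convolution:
    FFT -> pointwise convolution on real/imaginary channels -> inverse FFT.
    Each output pixel depends on every input pixel, so a single layer
    already has an image-wide receptive field."""

    def __init__(self, channels):
        super().__init__()
        # Real and imaginary parts are stacked along the channel axis.
        self.conv = nn.Conv2d(channels * 2, channels * 2, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        freq = torch.fft.rfft2(x, norm="ortho")        # complex, (N, C, H, W//2+1)
        f = torch.cat([freq.real, freq.imag], dim=1)   # (N, 2C, H, W//2+1)
        f = torch.relu(self.conv(f))
        real, imag = torch.chunk(f, 2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
```

In a full FFC layer this spectral branch runs alongside an ordinary local convolution branch, and their outputs are combined; the sketch shows only the part responsible for the global receptive field.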

In summary, the methods designed in the thesis perform well on several public datasets such as CelebA and Places2. The progressive generative adversarial network targets images with large missing areas, gradually enriching the information of the missing regions; its inpainting results are semantically clear and structurally sound. The multi-level feature aggregation network uses the differences among the receptive fields of convolutions with different dilation rates to construct a network with a larger receptive field, and generalizes well to high-resolution complex images. In subsequent research, the thesis will make broader and deeper attempts in balancing network complexity and in video inpainting.
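The aggregation idea mentioned above, building a larger effective receptive field from parallel convolutions with different dilation rates, can be sketched as follows (assuming PyTorch; the module name and the dilation rates are illustrative):

```python
import torch
import torch.nn as nn

class MultiDilationBlock(nn.Module):
    """Parallel 3x3 convolutions whose dilation rates give them
    progressively larger receptive fields; their outputs are
    concatenated and fused by a 1x1 convolution."""

    def __init__(self, channels, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(channels, channels, kernel_size=3,
                       padding=r, dilation=r)
             for r in rates]
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, kernel_size=1)

    def forward(self, x):
        # padding == dilation keeps every branch at the input's spatial size.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```

With padding equal to the dilation rate, a 3x3 kernel preserves spatial size, so branches with very different receptive fields can be concatenated directly.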


CLC number:

TP391

Release date:

2024-06-17
