Thesis Information

Chinese Title:

 Research on Image Inpainting Technology Based on Generative Adversarial Network

Name:

 Liu Chang

Student ID:

 18207205041

Confidentiality Level:

 Public

Thesis Language:

 Chinese

Discipline Code:

 085208

Discipline Name:

 Engineering - Engineering - Electronic and Communication Engineering

Student Type:

 Master's

Degree Level:

 Master of Engineering

Degree Year:

 2021

Degree-Granting Institution:

 Xi'an University of Science and Technology

School/Department:

 College of Communication and Information Engineering

Major:

 Electronic and Communication Engineering

Research Direction:

 Digital Image Processing

First Supervisor:

 Zhang Weihu

First Supervisor's Institution:

 Xi'an University of Science and Technology

Second Supervisor:

 Bai Zongwen

Submission Date:

 2021-06-21

Defense Date:

 2021-06-04

English Title:

 Research on Image Inpainting Technology Based on Generative Adversarial Network

Chinese Keywords:

 Image Inpainting; Generative Adversarial Network; Long Short-Term Memory Network; Attention Mechanism

English Keywords:

 Image Inpainting; Generative Adversarial Network; Long Short-Term Memory Network; Attention Mechanism

Chinese Abstract:

Image inpainting is the technique of filling missing or occluded regions of an image with plausible pixel values. Traditional inpainting methods struggle with images whose damaged regions have complex structure or strong semantic content. A generative adversarial network, through adversarial learning and mutual optimization between its generator and discriminator, can produce samples realistic enough to pass for genuine ones; this property makes GANs well suited to image inpainting, so research on GAN-based inpainting techniques is of considerable significance.

To address the information lost during encoder downsampling, this thesis replaces ordinary convolution with dilated convolution, obtaining a larger receptive field and reducing information loss. To stabilize training, spectral normalization is applied so that the discriminator satisfies Lipschitz continuity. To remedy the poor restoration detail caused by insufficient semantic information in current methods, the encoder is improved: it learns region similarity from high-level semantic feature maps and, through an attention transfer network, transfers the learned attention to low-level feature maps to guide low-level restoration, achieving feature repair at multiple levels. To improve on existing one-shot inpainting, whose heavy workload leads to incoherent, low-definition results, this thesis adopts a step-by-step scheme: the overall task is treated as the sum of several subtasks, each responsible for one part and built on the previous one, with the subtasks finally connected by a long short-term memory network to complete the whole task. The improved method is validated experimentally on the CelebA and ImageNet datasets.
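The receptive-field gain from dilated convolution can be shown with a short calculation. The sketch below is illustrative, not code from the thesis; it uses the standard rule that a stride-1 convolution layer adds (kernel_size − 1) × dilation pixels to the receptive field.

```python
def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs; each layer
    widens the receptive field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf

# Three ordinary 3x3 convolutions (dilation 1): receptive field 7.
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7
# Same depth with dilations 1, 2, 4: receptive field 15.
print(receptive_field([(3, 1), (3, 2), (3, 4)]))  # 15
```

With the same number of layers and parameters, the dilated stack more than doubles the receptive field, which is why it preserves more context than ordinary convolution during encoding.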

The experimental results show that the improved inpainting method achieves good restoration quality. Compared with the Context Encoder and Contextual Attention methods, it obtains the best results under both subjective visual evaluation and three objective metrics (peak signal-to-noise ratio, structural similarity, and mean absolute error). The findings enrich research on image inpainting, provide a theoretical reference for its development, and have application value in face de-occlusion, cultural-relic restoration, biomedical imaging, and related areas.

English Abstract:

Image inpainting is a technique for filling the missing or occluded regions of an image with plausible pixel values. Traditional inpainting methods struggle to repair damaged images with complex structure and strong semantic content. A generative adversarial network can produce samples realistic enough to pass for genuine ones through adversarial learning and mutual optimization between the generator and the discriminator, which makes it well suited to image inpainting; research on GAN-based image inpainting is therefore of great significance.

To reduce the information lost during downsampling, dilated convolution is used in place of ordinary convolution, giving a larger receptive field. To stabilize model training, spectral normalization is applied so that the discriminator satisfies Lipschitz continuity. To address the poor restoration detail caused by insufficient semantic information in current methods, an improved encoder is proposed: it learns region similarity from high-level semantic feature maps and transfers the learned attention to low-level feature maps through an attention transfer network, guiding low-level restoration and achieving feature repair at different levels. To avoid the incoherent, low-definition results produced by one-shot inpainting and its heavy workload, a step-by-step scheme is adopted: the overall task is treated as the sum of several subtasks, each responsible for one part and built on the previous one, with the subtasks connected by a long short-term memory network to complete the whole task. The improved models are evaluated on the CelebA and ImageNet datasets.
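Spectral normalization constrains the discriminator by dividing each weight matrix by its largest singular value, estimated cheaply by power iteration. The following is a minimal NumPy sketch of that idea, not the thesis's implementation (which would operate on network layers during training):

```python
import numpy as np

def spectral_normalize(w, n_iter=50):
    """Divide w by a power-iteration estimate of its largest
    singular value, so the resulting matrix has spectral norm ~1
    (the map x -> w_sn @ x becomes approximately 1-Lipschitz)."""
    u = np.random.default_rng(0).standard_normal(w.shape[0])
    for _ in range(n_iter):
        v = w.T @ u
        v /= np.linalg.norm(v)
        u = w @ v
        u /= np.linalg.norm(u)
    sigma = u @ w @ v  # estimated largest singular value
    return w / sigma

w = np.random.default_rng(1).standard_normal((8, 8))
w_sn = spectral_normalize(w)
print(np.linalg.norm(w_sn, 2))  # close to 1.0
```

In practice (e.g. in GAN discriminators), one power-iteration step per training update suffices because the weights change slowly; the vectors u and v are carried over between updates.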

The experimental results show that the improved method achieves good inpainting quality. Compared with the Context Encoder and Contextual Attention methods, it obtains the best results under both subjective visual evaluation and three objective metrics (peak signal-to-noise ratio, structural similarity, and mean absolute error). These findings enrich research on image inpainting, provide a theoretical reference for its development, and have application value in face de-occlusion, cultural-relic restoration, biomedical imaging, and related areas.
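Two of the three objective metrics are straightforward to compute; the sketch below shows PSNR and MAE for 8-bit images (structural similarity is more involved and omitted here). This is an illustrative implementation of the standard formulas, not the thesis's evaluation code:

```python
import numpy as np

def mae(a, b):
    """Mean absolute error between two images."""
    return np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64)))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

clean = np.full((4, 4), 100, dtype=np.uint8)
noisy = clean.copy()
noisy[0, 0] = 116                    # one pixel off by 16 -> MSE = 256/16 = 16
print(mae(clean, noisy))             # 1.0
print(round(psnr(clean, noisy), 2))  # 10*log10(255**2/16) ~ 36.09
```

Higher PSNR and lower MAE indicate a restored image closer to the ground truth, which is how the comparison against the baseline methods is scored.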


CLC Number:

 TP391

Release Date:

 2021-06-22
