Thesis Information

Chinese Title:

Research on Image Super-Resolution Reconstruction Method Based on Generative Adversarial Network

Name:

 王雕    

Student ID:

 21208223080    

Confidentiality Level:

Public

Thesis Language:

Chinese (chi)

Discipline Code:

 085400    

Discipline Name:

Engineering - Electronic Information

Student Type:

Master's

Degree Level:

Master of Engineering

Degree Year:

 2024    

Institution:

 西安科技大学    

Department:

College of Computer Science and Technology

Major:

Software Engineering

Research Direction:

Digital Image Processing

First Supervisor:

 李洪安    

First Supervisor's Institution:

 西安科技大学    

Submission Date:

 2024-06-13    

Defense Date:

 2024-05-30    

English Title:

 Research on Image Super-Resolution Reconstruction Method Based on Generative Adversarial Network    

Chinese Keywords:

Super-resolution reconstruction; Generative adversarial network; Multi-scale perception; Attention mechanism; Model lightweighting

English Keywords:

Super-resolution reconstruction; Generative adversarial network; Multi-scale perception; Attention mechanism; Model lightweighting

Chinese Abstract:

Image super-resolution reconstruction is an image processing technique that improves the detail and clarity of an image by increasing its spatial resolution. With the development of artificial intelligence, the emergence of convolutional neural networks made super-resolution reconstruction a research hotspot and raised the clarity and quality of reconstructed images. Although deep-learning-based super-resolution reconstruction has achieved remarkable success, it still suffers from artifacts, blurred texture details, large parameter counts, and long running times. Building on generative adversarial networks, this thesis optimizes the model by improving the network structure and loss function, designing new modules, and augmenting the image data, thereby improving reconstruction quality and making the model more lightweight. The main research content of this thesis is as follows:

(1) To address the over-smoothing and blurred texture details that occur in super-resolution reconstruction, an image super-resolution reconstruction method based on a multi-scale perceptual generative adversarial network is proposed. First, multi-branch paths are used to extract multi-scale perceptual features, with each branch focusing on capturing information at a specific scale and semantic level. At the same time, a channel-spatial attention block is designed to strengthen the model's attention to the input, capture important channel and spatial features more effectively, and improve reconstruction performance. Then, enhanced residual dense blocks are incorporated so that information propagates more directly through the network, alleviating gradient vanishing and enhancing texture details. Finally, the perceptual, adversarial, and pixel losses are combined as a weighted sum to prevent over-smoothing, optimize model performance, and improve robustness. Experimental results show that the method performs well on several benchmark datasets; its objective evaluation metrics are clearly better than those of other methods, and it recovers super-resolution images with sharper details and greater realism. Compared with other algorithms, it achieves average improvements of 1.357 dB, 0.043, and 0.025 in PSNR, SSIM, and LPIPS, respectively.
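
As a minimal PyTorch sketch of the weighted combination of pixel, perceptual, and adversarial losses described in (1): the loss weights, the L1 distance, and the choice of VGG-19 feature layer are illustrative assumptions, not the thesis's actual settings.

```python
# Minimal sketch of a weighted SR loss: pixel + perceptual + adversarial.
# Weights and the VGG-19 feature cut-off are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision.models as models


class WeightedSRLoss(nn.Module):
    def __init__(self, w_pixel=1.0, w_perceptual=0.1, w_adv=0.005):
        super().__init__()
        self.w_pixel, self.w_perceptual, self.w_adv = w_pixel, w_perceptual, w_adv
        # Frozen VGG-19 features act as the perceptual feature extractor.
        vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features[:36].eval()
        for p in vgg.parameters():
            p.requires_grad = False
        self.vgg = vgg
        self.l1 = nn.L1Loss()
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, sr, hr, disc_logits_on_sr):
        pixel = self.l1(sr, hr)                           # pixel-wise fidelity
        perceptual = self.l1(self.vgg(sr), self.vgg(hr))  # feature-space similarity
        # Adversarial term: push the discriminator to label SR outputs as real.
        adv = self.bce(disc_logits_on_sr, torch.ones_like(disc_logits_on_sr))
        return self.w_pixel * pixel + self.w_perceptual * perceptual + self.w_adv * adv
```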

(2) To address the large parameter counts and long running times of super-resolution reconstruction models, a lightweight image super-resolution reconstruction method based on a depthwise separable generative adversarial network is proposed. First, the whole network uses depthwise separable convolutions, capturing spatial features in the depthwise convolution and channel features in the pointwise convolution while effectively reducing the number of parameters. At the same time, a feature separation distillation block is designed: by passing feature information between layers, it drives the model to learn richer, higher-level feature representations, and part of the feature information is passed to shallow residual blocks for step-by-step refinement, making the network lighter. Then, lightweight coordinate attention is introduced to focus on features at specific positions in the input, with shared attention-weight computation parameters to reduce computational complexity. Finally, an SN-PatchGAN discriminator network is designed to capture local image information and judge each local patch, improving the perception of image structure and thereby guiding the generator to produce higher-quality images. Experimental results show that the method reconstructs super-resolution images of higher quality and better perceptual appeal while using fewer parameters and less running time. Compared with other algorithms, it achieves average improvements of 0.982 dB, 0.051, and 0.037 in PSNR, SSIM, and LPIPS, respectively, while reducing the parameter count by an average of 135K and the running time by an average of 0.17 s.
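
As a minimal sketch of the depthwise separable convolution that the lightweight network is built on: a depthwise convolution captures spatial features per channel and a pointwise (1x1) convolution mixes channels, which is where the parameter savings come from. Kernel size, activation, and channel widths are illustrative assumptions.

```python
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        # groups=in_ch -> one spatial filter per input channel (depthwise).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        # 1x1 convolution recombines channels (pointwise).
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        return self.act(self.pointwise(self.depthwise(x)))


# Parameter count versus a standard convolution of the same shape:
std = nn.Conv2d(64, 64, 3, padding=1)
dsc = DepthwiseSeparableConv(64, 64)
print(sum(p.numel() for p in std.parameters()))  # 36928
print(sum(p.numel() for p in dsc.parameters()))  # 4800
```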

English Abstract:

Image super-resolution reconstruction is an image processing technique that improves the detail and clarity of an image by increasing its spatial resolution. With the development of artificial intelligence, the emergence of convolutional neural networks has made super-resolution reconstruction a research hotspot and has improved the clarity and quality of reconstructed images. Although deep-learning-based image super-resolution reconstruction has achieved remarkable success, it still suffers from artifacts, blurred texture details, large numbers of parameters, and long running times. Building on generative adversarial networks, this thesis optimizes the model by improving the network structure and loss function, designing novel modules, and augmenting the image data, so as to improve the reconstruction of images and make the model more lightweight. The main research content of this thesis is as follows.

(1) Aiming at the problems of over-smoothed images and blurred texture details in super-resolution reconstruction, a multi-scale perceptual generative adversarial network is proposed for image super-resolution reconstruction. Firstly, multi-branch paths are used to extract multi-scale perceptual features, so that each branch focuses on capturing information at a specific scale and semantic level. At the same time, a channel-spatial attention block is designed to enhance the model's ability to focus on the input data, capture important channel and spatial features more effectively, and improve the reconstruction performance of the model. Then, enhanced residual dense blocks are incorporated so that information passes more directly through the network, mitigating gradient vanishing and enhancing image texture details. Finally, the perceptual, adversarial, and pixel losses are combined as a weighted sum to prevent image over-smoothing, optimize model performance, and improve robustness. Experimental results show that the method performs well on several benchmark datasets, with objective evaluation metrics significantly better than those of other methods, and it recovers super-resolution images with clearer details and greater realism. Compared with other algorithms, it achieves average improvements of 1.357 dB, 0.043, and 0.025 in PSNR, SSIM, and LPIPS, respectively.
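
A channel-plus-spatial attention block in the spirit of the one described above can be sketched as follows; this is an illustrative PyTorch layout (the reduction ratio and the 7x7 spatial kernel are assumptions), not the exact block used in the thesis.

```python
import torch
import torch.nn as nn


class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel attention: squeeze spatial dims, learn per-channel weights.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: learn a per-pixel weight map from pooled channels.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)                      # reweight channels
        pooled = torch.cat([x.mean(dim=1, keepdim=True),  # channel-wise average
                            x.amax(dim=1, keepdim=True)], # channel-wise maximum
                           dim=1)
        return x * self.spatial_gate(pooled)              # reweight spatial positions
```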

(2) Aiming at the problems of the large number of parameters and long running time of super-resolution reconstruction models, a lightweight image super-resolution reconstruction method with a depthwise separable generative adversarial network is proposed. Firstly, the whole network uses depthwise separable convolutions, capturing spatial features in the depthwise convolution and channel features in the pointwise convolution while effectively reducing the number of parameters. At the same time, a feature separation distillation block is designed to induce the model to learn richer and higher-level feature representations by transferring feature information between layers, and part of the feature information is passed to shallow residual blocks for gradual refinement, thus making the network more lightweight. Then, lightweight coordinate attention is introduced to focus on features at specific positions in the input, and the attention-weight computation parameters are shared to reduce computational complexity. Finally, an SN-PatchGAN discriminator network is designed to capture local image information and judge each local patch, improving the perception of image structure and thereby guiding the generator to produce higher-quality images. Experimental results show that the method reconstructs super-resolution images of higher quality and better perceptual realism while using fewer parameters and shorter running time. Compared with other algorithms, it achieves average improvements of 0.982 dB, 0.051, and 0.037 in PSNR, SSIM, and LPIPS, respectively, with an average reduction of 135K in the number of parameters and an average reduction of 0.17 s in running time.
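
The SN-PatchGAN discriminator idea, spectrally normalized convolutions that output a grid of per-patch real/fake logits rather than a single score, can be sketched as follows; the depth, channel widths, and kernel sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm


def sn_conv(in_ch, out_ch, stride):
    # Spectral normalization stabilizes discriminator training.
    return nn.Sequential(
        spectral_norm(nn.Conv2d(in_ch, out_ch, 4, stride=stride, padding=1)),
        nn.LeakyReLU(0.2, inplace=True),
    )


class SNPatchDiscriminator(nn.Module):
    def __init__(self, in_ch=3, base=64):
        super().__init__()
        self.body = nn.Sequential(
            sn_conv(in_ch, base, 2),
            sn_conv(base, base * 2, 2),
            sn_conv(base * 2, base * 4, 2),
            # Final layer maps features to one logit per spatial patch.
            spectral_norm(nn.Conv2d(base * 4, 1, 4, stride=1, padding=1)),
        )

    def forward(self, x):
        return self.body(x)  # (N, 1, h, w): each element judges a local patch


logits = SNPatchDiscriminator()(torch.randn(1, 3, 128, 128))
print(logits.shape)  # torch.Size([1, 1, 15, 15])
```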

CLC Number:

 TP391    

Open-Access Date:

 2024-06-14    
