Thesis Information

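Chinese title:	图信息嵌入的多尺度图像彩色化研究 (Research on Multi-scale Image Colorization Using Graph Information Embedding)
Name:	Zheng Qiaoxue (郑峭雪)
Student ID:	20208088025
Confidentiality level:	Public
Thesis language:	chi
Discipline code:	083500
Discipline:	Engineering - Software Engineering
Student type:	Master's
Degree:	Master of Engineering
Degree year:	2023
Institution:	Xi'an University of Science and Technology (西安科技大学)
College:	College of Computer Science and Technology
Major:	Software Engineering
Research direction:	Artificial Intelligence and Information Processing
First supervisor:	Li Hong'an (李洪安)
First supervisor's affiliation:	Xi'an University of Science and Technology (西安科技大学)
Submission date:	2023-06-13
Defense date:	2023-06-05
Foreign-language title:	Research of Multi-scale Image Colorization using Graph Information Embedding
Chinese keywords:	图像彩色化 (image colorization) ; 深度学习 (deep learning) ; 损失函数 (loss function) ; 多尺度特征 (multi-scale feature) ; 图神经网络 (graph neural network)
Foreign-language keywords:	image colorization ; deep learning ; loss function ; multi-scale feature ; graph neural network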
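Chinese abstract (translated):	Image colorization assigns a new color value to each pixel of an image and has broad application prospects in image restoration, medical diagnosis, industrial production, and film and television production. Deep learning-based colorization methods have greatly improved colorization quality and reduced the need for human intervention, but on complex images existing algorithms still suffer from color overflow, poor color richness, inconsistent tones, and limited controllability. This thesis therefore studies the colorization quality and controllability of complex images. Its main research content and contributions are as follows: (1) To address color overflow and tonal inconsistency in complex images, a multi-scale feature perception method for image colorization is proposed. The method uses an end-to-end U-Net as the generator and PatchGAN as the discriminator, and generates color images through adversarial training. To colorize complex images better, a multi-scale feature representation module is added to the U-Net skip connections, which improves multi-scale feature extraction. In addition, a perceptual network is introduced to compute the perceptual similarity between the features of the generated color image and the real image at different scales, which improves the tonal consistency of the result. Experimental results show that the algorithm effectively improves poor colorization quality. Compared with other algorithms, it improves PSNR, SSIM, LPIPS, and FID by 1.766 dB, 4.405%, 0.027, and 14.429 on average, respectively. (2) To address poor color richness and controllability, an image colorization method with local graph information embedding is proposed. The method splits colorization into two stages: a graph neural network first diffuses the color points supplied by the user, and a colorization network then raises the overall quality of the image. To give the user more control over the result, a relational graph is built over the 8-neighborhood of each pixel, and the graph neural network diffuses the color marks over it to produce a preliminary colorized image. In addition, adaptive global color feature control and a self-attention mechanism are used to improve both the consistency between the output image and the input color points and the overall colorization quality. Experimental results show that the algorithm improves colorization quality and controllability and supports recoloring of an image. Compared with other algorithms, it improves PSNR, SSIM, LPIPS, and FID by 1.273 dB, 2.733%, 0.015, and 6.119 on average, respectively. In summary, the proposed colorization methods improve colorization quality and controllability and have application prospects in fields such as film and television production and artistic creation. Future work could pursue further improvements in areas such as model lightweighting and few-shot learning.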
Foreign-language abstract:	Image colorization reassigns a color value to each pixel of an image and has a wide range of applications in image restoration, medical diagnosis, industrial production, and film production. Deep learning-based image colorization methods have significantly improved colorization quality and reduced the amount of human intervention, but on complex images current algorithms still suffer from color overflow, poor color richness, inconsistent tones, and limited controllability. This thesis therefore focuses on the colorization quality and controllability of complex images. Its main research content and innovations are as follows: (1) To address color overflow and tonal inconsistency in complex images, this thesis proposes a multi-scale feature perception method for image colorization. The method uses an end-to-end U-Net as the generator and PatchGAN as the discriminator, and generates color images by adversarial training. To colorize complex images better, a multi-scale feature representation block is added to the U-Net skip connections, which improves multi-scale feature extraction. Furthermore, a perceptual network is introduced to calculate the perceptual similarity between the features of the generated color image and the real image at different scales, which improves the tonal consistency of the result. The experimental results show that the method effectively improves poor colorization quality. Compared with other methods, the average improvements in PSNR, SSIM, LPIPS, and FID were 1.766 dB, 4.405%, 0.027, and 14.429, respectively. (2) To address poor color richness and controllability, this thesis proposes an image colorization method using local graph information embedding. The method divides colorization into two stages: it first diffuses the color points input by the user through a graph neural network, and then uses a colorization network to enhance the overall quality of the image. To increase user control over the result, a relational graph is constructed over the 8-neighborhood of each pixel, and the graph neural network diffuses the color marks over it to obtain a preliminary colorized image. In addition, adaptive global color feature control and self-attention are used to improve both the consistency of the output with the input color points and the overall colorization quality. The experimental results show that the method improves colorization quality and controllability and supports recoloring of an image. Compared with other methods, the average improvements in PSNR, SSIM, LPIPS, and FID were 1.273 dB, 2.733%, 0.015, and 6.119, respectively. In summary, the image colorization methods proposed in this thesis improve colorization quality and controllability and have application prospects in fields such as film production and artistic creation. Future work could make further improvements in areas such as model lightweighting and few-shot learning.
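The first stage described in the abstract (a relational graph over each pixel's 8-neighborhood, over which user color points are diffused) can be illustrated with a minimal sketch. This is not the thesis's method: the graph neural network is replaced here by plain neighbor averaging, and the names `build_8_neighbor_graph` and `diffuse_hints`, as well as the two-channel color tuples, are hypothetical choices made only for illustration.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]          # (row, col)
Color = Tuple[float, float]      # assumed two chrominance channels


def build_8_neighbor_graph(height: int, width: int) -> Dict[Pixel, List[Pixel]]:
    """Connect every pixel to its (up to) 8 adjacent pixels inside the image."""
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]
    graph: Dict[Pixel, List[Pixel]] = {}
    for r in range(height):
        for c in range(width):
            graph[(r, c)] = [(r + dr, c + dc) for dr, dc in offsets
                             if 0 <= r + dr < height and 0 <= c + dc < width]
    return graph


def diffuse_hints(graph: Dict[Pixel, List[Pixel]],
                  hints: Dict[Pixel, Color],
                  steps: int) -> Dict[Pixel, Color]:
    """Spread user color points along graph edges by neighbor averaging.

    Stand-in for the learned GNN diffusion: each uncolored or previously
    diffused pixel takes the mean color of its already-colored neighbors;
    the user's hint pixels are never overwritten.
    """
    colors = dict(hints)
    for _ in range(steps):
        updated = dict(colors)
        for node, neighbors in graph.items():
            if node in hints:
                continue  # keep user-provided colors fixed
            known = [colors[n] for n in neighbors if n in colors]
            if known:
                updated[node] = (sum(a for a, _ in known) / len(known),
                                 sum(b for _, b in known) / len(known))
        colors = updated
    return colors


if __name__ == "__main__":
    g = build_8_neighbor_graph(3, 3)
    result = diffuse_hints(g, {(0, 0): (10.0, -10.0)}, steps=2)
    print(len(result), "of", len(g), "pixels colored")
```

In the thesis the diffusion is learned by a graph neural network and followed by a second colorization network; the averaging rule above only shows how color information can travel along 8-neighborhood edges from a few user-supplied points.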
Chinese Library Classification (CLC) number:	TP391.4
Open-access date:	2023-06-13
