Thesis Information

Title (Chinese): 基于深度学习的红外与可见光图像融合算法研究

Author: 王慧敏 (Wang Huimin)

Student ID: 21207040028

Confidentiality: Public

Language: Chinese

Discipline code: 081002

Discipline: Engineering - Information and Communication Engineering - Signal and Information Processing

Student type: Master's

Degree: Master of Engineering

Degree year: 2024

Institution: 西安科技大学 (Xi'an University of Science and Technology)

School: 通信与信息工程学院 (College of Communication and Information Engineering)

Major: Information and Communication Engineering

Research area: Image processing

Primary supervisor: 王书朋 (Wang Shupeng)

Supervisor's institution: 西安科技大学 (Xi'an University of Science and Technology)

Submission date: 2024-06-11

Defense date: 2024-05-29

Title (English): Research on Deep Learning-based Infrared and Visible Image Fusion Algorithm

Keywords (Chinese): 图像融合; 红外图像; 可见光图像; 深度学习; 注意力机制; 密集连接

Keywords (English): Image fusion; Infrared image; Visible image; Deep learning; Attention mechanism; Dense connection

Abstract (Chinese):

Infrared and visible image fusion aims to extract and integrate the effective information in images captured by different sensors of the same scene, producing a single image with richer and more comprehensive scene information; this aids human interpretation of the scene and facilitates subsequent tasks such as target detection and recognition. Traditional fusion methods rely on specific transform models to extract image features, their fusion strategies must be designed by hand, and the resulting models generalize poorly. Deep-learning-based fusion methods instead exploit the strong feature extraction capability of convolutional kernels to obtain image features and guide the network toward fusion through loss function design, overcoming the limitations of traditional methods to a certain extent. This thesis therefore focuses on deep-learning-based infrared and visible image fusion; its main contributions are as follows:

To address the loss of texture detail and blurred edges in fused images, this thesis proposes an infrared and visible image fusion algorithm based on multi-scale information extraction. First, a pyramid squeeze attention mechanism is introduced into the feature extraction network; it effectively captures and exploits the spatial information of feature maps at different scales while establishing long-range channel dependencies. Second, a gradient compensation module is designed: skip connections cascade the gradient information of shallow features into the last layer of the feature extraction network, which effectively enriches the texture detail of the fused image. Finally, the network is trained under pixel-intensity loss and gradient loss constraints. Experiments on public datasets show that, compared with other state-of-the-art fusion methods, the proposed algorithm produces fused images with richer texture detail.
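As a rough illustration of the training objective just described, the following PyTorch sketch combines a pixel-intensity term with a Sobel-based gradient term. The L1 distance, the element-wise max aggregation of the two sources, and the weight `alpha` are illustrative assumptions, not the thesis's exact formulation.

```python
# Hedged sketch of an intensity + gradient fusion loss (assumed form, not the
# thesis's exact definition). Inputs are (B, 1, H, W) grayscale tensors.
import torch
import torch.nn.functional as F

def sobel_gradient(img: torch.Tensor) -> torch.Tensor:
    """Approximate per-pixel gradient magnitude with Sobel kernels."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    return F.conv2d(img, kx, padding=1).abs() + F.conv2d(img, ky, padding=1).abs()

def fusion_loss(fused, ir, vis, alpha=10.0):
    # Intensity term: keep fused pixels close to the brighter source pixel.
    loss_int = F.l1_loss(fused, torch.max(ir, vis))
    # Gradient term: keep fused gradients close to the sharper source texture.
    loss_grad = F.l1_loss(sobel_gradient(fused),
                          torch.max(sobel_gradient(ir), sobel_gradient(vis)))
    return loss_int + alpha * loss_grad
```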

To address weakened infrared targets and insufficient contrast in fused images, this thesis proposes an infrared and visible image fusion algorithm based on an improved generative adversarial network. First, dense connections are adopted in the generator network, strengthening the reuse of shallow features and effectively preventing the loss of feature information. Second, the infrared or visible image is fed into the network's intermediate layers through skip connections, increasing the thermal radiation information and texture detail in the fused image. Then, a CBAM attention module refines the features extracted by each convolutional block, strengthening the network's feature encoding ability. Finally, a primary-secondary scheme is used to design the content loss that constrains the generator, so that it extracts complementary information from the source images more fully. Extensive experiments show that on the TNO, RoadScene, and MSRS datasets the proposed algorithm is clearly superior to other state-of-the-art fusion algorithms in subjective visual evaluation, and it achieves the best values on the objective metrics EN, SD, SF, and Qabf.
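A minimal PyTorch sketch of the generator ideas above (dense connections, plus feeding the raw source images into intermediate layers through skip connections) might look as follows; the depth, channel widths, and activation choices are illustrative assumptions rather than the thesis's actual architecture.

```python
import torch
import torch.nn as nn

class DenseGenerator(nn.Module):
    """Illustrative generator: each layer sees all earlier feature maps plus
    the raw source images (dense connections + source skip connections)."""
    def __init__(self, growth: int = 32):
        super().__init__()
        self.conv1 = nn.Conv2d(2, growth, 3, padding=1)
        self.conv2 = nn.Conv2d(growth + 2, growth, 3, padding=1)
        self.conv3 = nn.Conv2d(2 * growth + 2, growth, 3, padding=1)
        self.out = nn.Conv2d(3 * growth + 2, 1, 1)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, ir, vis):
        src = torch.cat([ir, vis], dim=1)  # raw sources, re-injected below
        f1 = self.act(self.conv1(src))
        f2 = self.act(self.conv2(torch.cat([f1, src], dim=1)))
        f3 = self.act(self.conv3(torch.cat([f1, f2, src], dim=1)))
        return torch.tanh(self.out(torch.cat([f1, f2, f3, src], dim=1)))
```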

Abstract (English):

Infrared and visible image fusion technology focuses on extracting valuable information from images acquired by multiple sensors of the same scene and combining them into a single image with richer and more complementary scene information, enhancing human interpretation of the scene and facilitating subsequent tasks such as target detection and recognition. Conventional fusion methods rely on specific transform models to extract image features, their fusion strategies must be designed manually, and the overall generalization ability of such models is limited. Deep-learning-based fusion methods utilize the powerful feature extraction capability of convolutional kernels to obtain image features and guide the network to complete fusion through loss function design, which overcomes the limitations of traditional methods to a certain extent. Therefore, deep-learning-based infrared and visible image fusion is the focus of this thesis, and its main contributions are as follows:

To overcome the problems of texture detail loss and edge blurring in fused images, an infrared and visible image fusion method based on multi-scale information extraction is presented in this thesis. First, a pyramid squeeze attention mechanism is introduced into the feature extraction network; it effectively captures and exploits the spatial information of feature maps at different scales while establishing long-range channel dependencies. Second, a gradient compensation module is designed that uses skip connections to cascade the gradient information of shallow features into the last layer of the feature extraction network, effectively enhancing the texture detail of the fused image. Finally, the network is trained under pixel-intensity loss and gradient loss constraints. Evaluation results on public datasets indicate that, compared with other state-of-the-art fusion methods, the fused images produced by the proposed algorithm contain richer texture details.
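For readers unfamiliar with pyramid squeeze attention (EPSANet), a minimal sketch follows. The published block uses grouped multi-scale convolutions with an SE-style weight per branch; the kernel sizes, branch count, and 1x1-conv squeeze here are simplified illustrative choices, not the configuration used in this thesis.

```python
import torch
import torch.nn as nn

class PSA(nn.Module):
    """Simplified pyramid squeeze attention: split channels into multi-scale
    conv branches, weight each branch by channel attention, and re-normalize
    the weights across branches with a softmax."""
    def __init__(self, channels: int, kernels=(3, 5, 7, 9)):
        super().__init__()
        assert channels % len(kernels) == 0
        self.split = channels // len(kernels)
        self.branches = nn.ModuleList(
            nn.Conv2d(self.split, self.split, k, padding=k // 2) for k in kernels)
        self.se = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(1),
                          nn.Conv2d(self.split, self.split, 1), nn.Sigmoid())
            for _ in kernels)

    def forward(self, x):
        chunks = torch.split(x, self.split, dim=1)
        feats = [conv(c) for conv, c in zip(self.branches, chunks)]  # multi-scale
        attn = torch.stack([se(f) for se, f in zip(self.se, feats)], dim=1)
        attn = torch.softmax(attn, dim=1)  # compete across scales
        out = torch.stack(feats, dim=1) * attn
        return out.flatten(1, 2)  # back to (B, C, H, W)
```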

To address the problem that infrared targets in fused images are weakened and contrast is insufficient, an infrared and visible image fusion algorithm based on an improved generative adversarial network is proposed in this thesis. First, dense connections are employed in the generator network to enhance the reuse of shallow features and effectively avoid the loss of feature information. Second, the infrared or visible image is fed into the middle layers of the network through skip connections, which increases the thermal radiation information and texture detail in the fused image. Then, the CBAM attention mechanism is introduced to refine the features extracted by each convolutional block, strengthening the network's feature encoding ability. Finally, a primary-secondary scheme is used to design the content loss that constrains the generator network, so that it can extract more adequate information from the source images in a complementary way. Extensive experimental results show that the subjective visual quality of the proposed method on the TNO, RoadScene, and MSRS datasets is clearly better than that of other state-of-the-art fusion algorithms, and that it achieves the best values on the objective evaluation metrics EN, SD, SF, and Qabf.
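A compact sketch of the CBAM module mentioned above (channel attention followed by spatial attention, after Woo et al., ECCV 2018) is given below. The reduction ratio of 16 and the 7x7 spatial kernel follow the original paper's defaults; how the module is wired into the generator here is an assumption.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention then spatial attention, as in Woo et al. (ECCV 2018)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Shared MLP applied to avg- and max-pooled channel descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        # 7x7 conv over the channel-wise avg and max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)   # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))          # spatial attention
```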


CLC number: TP391.4

Open access date: 2024-06-12
