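Thesis title (Chinese): | 基于Transformer的图像增强方法研究 |
Name: | |
Student ID: | 20208223069 |
Confidentiality level: | Confidential (open after 1 year) |
Thesis language: | chi |
Discipline code: | 085400 |
Discipline name: | Engineering - Electronic Information |
Student type: | Master's |
Degree level: | Master of Engineering |
Degree year: | 2023 |
Institution: | Xi'an University of Science and Technology |
School/Department: | |
Major: | |
Research direction: | Visualization technology and applications |
First supervisor name: | |
First supervisor affiliation: | |
Thesis submission date: | 2023-06-13 |
Thesis defense date: | 2023-06-05 |
Thesis title (foreign language): | Study on Image Enhancement Method based on Transformer |
Keywords (Chinese): | 图像增强 ; 低光照图像 ; Transformer ; 生成对抗网络 ; 曲线调整 |
Keywords (foreign language): | Image enhancement ; low-light image ; Transformer ; generative adversarial network ; curve adjustment |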
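Thesis abstract (Chinese, translated): |
Images are an important way for people to obtain information, but limits on lighting conditions and device performance can leave captured images too dark, lacking in detail, or distorted in color, so image enhancement is necessary. This thesis studies two sub-problems of image enhancement: image retouching and low-light image enhancement. Existing image enhancement methods suffer from complex network design, difficulty handling unevenly illuminated images, and limited training data. To address these problems, two image enhancement methods based on the Transformer architecture are proposed, one fully supervised and one semi-supervised.

The main work and contributions of this thesis are as follows:

(1) To address the poor real-time performance caused by complex enhancement network design, a fully supervised Transformer image enhancement method based on curve adjustment is proposed (Transformer Photo Enhancer, TPE). TPE uses a Transformer as the backbone of the encoder and combines it with a two-stage curve adjustment function. First, the encoder obtains adjustment parameters through a lightweight self-attention mechanism, mapping the image to adjustment parameters; this allows enhancement at arbitrary resolution and improves real-time performance on very high-resolution images. The effect of Transformer hyperparameter settings on the enhancement results is also analyzed, and a lightweight model is designed on that basis to improve efficiency. Second, the two-stage curve adjustment strategy strengthens the adjustment capability of the curve function: the second stage fine-tunes the output of the first stage, giving the method both global enhancement and local fine-tuning capability. Finally, quantitative and qualitative experiments on the MIT Adobe FiveK and LOL datasets show that the method improves significantly on PSNR, SSIM, and LPIPS, effectively raises image brightness and contrast, and recovers more detail in both foreground and background.

(2) To address the high cost of collecting paired datasets, the difficulty of enhancing images with uneven illumination, and the cross-shaped artifacts that a single image partitioning strategy tends to produce, a semi-supervised image enhancement method combining a generative adversarial network with a Transformer is proposed (Semi-supervised TransGAN Image Enhancer, STGIE). STGIE uses a Transformer as the backbone of the generative adversarial network and applies a curve adjustment function to improve image quality and detail. First, the generative adversarial network is trained in a semi-supervised manner on unpaired data, which removes the dependence on hard-to-obtain paired datasets. Second, the grayscale image is used as the illumination attention map of the generator to balance the exposure level across regions of the enhanced result. Finally, to avoid the fixed partition boundaries created by a single partitioning strategy, the generator and discriminator interleave equal cropping and sliding-window cropping, which strengthens feature extraction and resolves the cross-shaped artifacts. In addition, a reconstruction loss is introduced to improve the generator's sensitivity to image detail and help it produce more realistic and natural images. Quantitative and qualitative experiments on several datasets, including MIT Adobe FiveK, LOL, NPE, and MEF, show that the method improves significantly on NIQE and on user subjective scores, adjusts brightness and color more realistically and naturally, and performs particularly well on unevenly illuminated images.

In summary, this thesis studies image enhancement methods based on the Transformer architecture and offers new approaches to enhancing images captured under insufficient or unbalanced illumination. Curve adjustment functions also enhance image detail and color distribution, which improves the readability and recognizability of images and better meets the needs of practical applications. |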
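As a rough illustration of the two-stage curve adjustment described for TPE, the sketch below applies a global curve and then a fine-tuning curve to the first-stage result. This record does not give the thesis's actual curve function, so the quadratic curve x + αx(1 − x), the form popularized by Zero-DCE, is used here purely as a stand-in. The names `curve_adjust`, `two_stage_enhance`, `global_alpha`, and `local_alpha` are hypothetical, the per-pixel shape of the second-stage parameters is an assumption, and in TPE the parameters would be predicted by the Transformer encoder rather than set by hand.

```python
import numpy as np


def curve_adjust(image, alpha):
    """Apply the stand-in quadratic curve x + alpha * x * (1 - x) element-wise.

    image: float array with values in [0, 1].
    alpha: scalar or array broadcastable to image, with values in [-1, 1],
           which keeps the output inside [0, 1].
    """
    return image + alpha * image * (1.0 - image)


def two_stage_enhance(image, global_alpha, local_alpha):
    """Stage 1 adjusts globally; stage 2 fine-tunes the stage-1 result."""
    # Stage 1 (assumed): one parameter per color channel, shared by all pixels.
    stage1 = curve_adjust(image, global_alpha)
    # Stage 2 (assumed): one parameter per pixel, applied to the stage-1 output.
    stage2 = curve_adjust(stage1, local_alpha)
    return np.clip(stage2, 0.0, 1.0)


# Hypothetical inputs: a dark 4x4 RGB image and hand-picked parameters.
rng = np.random.default_rng(0)
low_light = rng.uniform(0.0, 0.3, size=(4, 4, 3))
global_alpha = np.array([0.8, 0.8, 0.8])   # shape (3,), broadcast over pixels
local_alpha = np.full((4, 4, 1), 0.2)      # shape (H, W, 1), broadcast over channels

enhanced = two_stage_enhance(low_light, global_alpha, local_alpha)
```

Because the curve is applied element-wise from a small set of parameters, the same parameters can in principle be applied to an image of any resolution, which matches the arbitrary-resolution claim in the abstract.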
Thesis abstract (foreign language): |
Images are an important means for humans to obtain information, but limitations in lighting conditions and equipment performance can leave captured images too dark, short on detail, or distorted in color, which makes image enhancement necessary. This thesis focuses on two sub-problems in the field of image enhancement: image retouching and low-light image enhancement. Existing image enhancement methods suffer from complex network design, difficulty in processing unevenly lit images, and limited training data. In response, two Transformer-based image enhancement methods are proposed, one fully supervised and one semi-supervised.

The main contributions and innovations of this thesis are as follows:

(1) To address the poor real-time performance caused by complex enhancement network design, a fully supervised, Transformer-based image enhancement method built on curve adjustment is proposed (Transformer Photo Enhancer, TPE). TPE uses the Transformer architecture as the backbone of the encoder and combines it with a two-stage curve adjustment function to enhance images. First, the encoder uses a lightweight self-attention mechanism to map the image to adjustment parameters, which allows enhancement at any resolution and improves real-time performance on ultra-high-resolution images. The impact of Transformer hyperparameter settings on the enhancement results is also analyzed, and a lightweight model is designed on that basis to improve efficiency. Second, the two-stage curve adjustment strategy strengthens the adjustment ability of the curve function: the second stage fine-tunes the first-stage result, so the method is capable of both global enhancement and local fine-tuning. Finally, quantitative and qualitative experiments on the MIT Adobe FiveK and LOL datasets show that the method significantly improves PSNR, SSIM, and LPIPS, effectively improves image brightness and contrast, and restores more detail in the foreground and background.

(2) To address the high cost of acquiring paired datasets, the difficulty of enhancing unevenly lit images, and the tendency of a single image partitioning strategy to produce cross-shaped artifacts, a semi-supervised image enhancement method combining a generative adversarial network with a Transformer is proposed (Semi-supervised TransGAN Image Enhancer, STGIE). STGIE uses the Transformer architecture as the backbone of the generative adversarial network and uses a curve adjustment function to enhance image quality and detail. First, the generative adversarial network is trained in a semi-supervised manner on unpaired data to overcome the difficulty of obtaining paired datasets. Second, a grayscale image is used as the illumination attention map of the generator network to balance the exposure levels of different regions in the enhanced result. Finally, to avoid the fixed partition boundaries formed by a single partitioning strategy, the generator and discriminator networks interleave equal cropping and sliding-window cropping, which strengthens the network's feature extraction and resolves the cross-shaped artifacts. In addition, a reconstruction loss is introduced to improve the generator's perception of image detail and help it generate more realistic and natural images. Quantitative and qualitative experiments on multiple datasets, including MIT Adobe FiveK, LOL, NPE, and MEF, show that the method significantly improves NIQE and user subjective scores, and that its brightness and color adjustments are more realistic and natural, especially on unevenly lit images.

In summary, this thesis studies image enhancement methods based on the Transformer architecture and provides new approaches to image enhancement under insufficient or imbalanced lighting. In addition, curve adjustment functions can enhance image detail and color distribution, thereby improving the readability and recognizability of images and better meeting the needs of practical applications. |
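The two partitioning strategies named for STGIE can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: "equal cropping" is taken to mean a non-overlapping grid (stride equal to the patch size), "sliding-window cropping" is taken to mean overlapping patches (stride smaller than the patch size), and the helper `crop_patches` and the sizes used are hypothetical.

```python
import numpy as np


def crop_patches(image, patch, stride):
    """Cut square patches of side `patch`, stepping by `stride` in both axes.

    Returns the (top, left) offset of every patch and the patches themselves.
    """
    height, width = image.shape[:2]
    offsets, patches = [], []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            offsets.append((top, left))
            patches.append(image[top:top + patch, left:left + patch])
    return offsets, patches


# Hypothetical 8x8 single-channel image.
image = np.arange(8 * 8, dtype=np.float32).reshape(8, 8)

# Equal cropping: stride == patch size, so every patch border falls on the
# same grid lines (here row/column 4), giving one fixed set of boundaries.
equal_offsets, equal_patches = crop_patches(image, patch=4, stride=4)

# Sliding-window cropping: stride < patch size, so patches overlap and some
# of them straddle the grid lines produced by equal cropping.
window_offsets, window_patches = crop_patches(image, patch=4, stride=2)
```

Here the windowed patch at offset (2, 2) covers rows and columns 2 to 5 and therefore spans the seam at row and column 4 of the equal grid. Mixing the two partitionings means the network is not tied to one fixed set of boundaries, which is the intuition the abstract gives for suppressing cross-shaped artifacts.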
Chinese Library Classification number: | TP391.41 |
Open-access date: | 2024-06-13 |