Thesis Information

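Thesis title (Chinese):	融合多尺度特征的图像篡改定位算法研究
Author:	Yu Haibo (余海波)
Student ID:	22208223054
Confidentiality level:	Public
Thesis language:	chi
Discipline code:	085400
Discipline:	Engineering - Electronic Information
Student type:	Master's
Degree:	Master of Engineering
Degree year:	2025
Degree-granting institution:	Xi'an University of Science and Technology (西安科技大学)
School:	School of Artificial Intelligence and Computer Science (人工智能与计算机学院)
Major:	Computer Technology
Research area:	Computer Vision
First supervisor:	Deng Fan (邓凡)
First supervisor's institution:	Xi'an University of Science and Technology (西安科技大学)
Submission date:	2025-06-16
Defense date:	2025-05-29
Thesis title (English):	Research on image tampering localization algorithm integrating multi-scale features
Keywords (Chinese):	图像篡改定位 (image tampering localization) ; 多尺度融合 (multi-scale fusion) ; 注意力机制 (attention mechanism) ; 边界伪影定位 (boundary artifact localization) ; 轻量级网络 (lightweight network)
Keywords (English):	Image tampering location ; Multi-scale fusion ; Attention mechanism ; Boundary artifact location ; Lightweight network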
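Abstract (translated from the Chinese):	︿Digital images, as a core medium of information dissemination, are deeply embedded in critical areas of society such as social media, judicial forensics, medical diagnosis, and news communication. With the rapid development of image tampering techniques, the visual nature of images has given rise to new digital security risks, and technical advances keep undermining the premise that "seeing is believing", so research on image tampering localization is urgently needed. Existing image tampering localization methods have limitations: most studies target only specific tampering types and generalize poorly to composite tampering scenarios, few are designed around frequency-domain characteristics, and most face an imbalance between computational complexity and storage efficiency, which is especially pronounced in deployment. This thesis studies image tampering localization methods that fuse multi-scale features. The specific contributions are as follows: (1) To address the insufficient modeling and perception of composite tampering scenarios in image tampering localization methods, this thesis proposes a multi-scale iterative tampering localization algorithm (Multi-scale Iterative Tamper Detection Network, MITD-Net), which aims to make full use of the multi-scale features of an image to localize tampered regions accurately in general scenarios. Image tampering localization is decomposed into two stages, feature synchronization and region refinement, and a parallel network architecture captures feature information at different scales. A pixel-level feature clustering module is designed to integrate cross-scale local features with global context and to capture feature correlations along the spatial and channel dimensions. To handle variation in tampering scale, an enhanced downsampling attention block is constructed, and an edge enhancement module is designed to process boundary information. MITD-Net is applicable to a variety of complex scenarios, and its effectiveness is verified by experiments on datasets such as Columbia and CASIA. (2) To address insufficient frequency-domain feature extraction and inadequate consideration of compression characteristics, this thesis proposes an artifact recognition and tracing algorithm for image splicing tampering localization (Artifact Recognition and Tracing Network for Image Splicing Detection, ART-Net), which aims to overcome the single-domain limitation of traditional methods. To make full use of frequency-domain and image features, a dual-domain collaborative analysis is constructed, and a JPEG artifact learning module with joint spatial-frequency perception is designed to establish a mapping between compression traces and visual artifacts, improving generalization in recognizing compression traces across formats. ART-Net performs well under various compression conditions, and its effectiveness is verified by experiments on splicing datasets such as IMD20 and Spliced COCO. (3) To address the long-standing reliance of image tampering localization on microscopic features, along with large parameter counts and low localization efficiency, this thesis proposes a mesoscopic-scale lightweight tampering localization algorithm (Mesoscopic Lightweight Tampering Localization Network, MLT-Net), which aims to overcome the reliance of traditional localization on a single type of feature. A multi-scale feature collaborative prediction module is constructed that extracts high-frequency and low-frequency features at the macroscopic and microscopic levels and dynamically adjusts scaling weights to emphasize the mesoscopic scale. To improve the localization efficiency of the network, a dynamically adaptive cross-granularity weighting module is proposed and pruning is applied to optimize parameters; the good performance of MLT-Net is verified on datasets such as Coverage, Columbia, NIST16, and CASIA. To address insufficient recognition of complex tampered images, inadequate feature extraction, and model complexity, this thesis studies image tampering localization methods that fuse multi-scale features. In a digital society flooded with tampered content, this work helps curb the viral spread of fake images, safeguard the integrity of multimodal data, rebuild the foundation of digital trust, and shape a new order of cyberspace governance.﹀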
Abstract (English):	︿Digital images, as a core medium for information dissemination, are deeply integrated into key social fields such as social media, judicial forensics, medical diagnosis, and news communication. With the rapid development of image manipulation technologies, the visual characteristics of images have introduced new digital security risks. Technological breakthroughs continuously challenge the notion that "seeing is believing," making research on image tampering localization urgent. Existing methods still face limitations: most studies focus on specific manipulation types and generalize weakly to composite tampering scenarios, lack designs that target frequency-domain features, and struggle to balance computational complexity and storage efficiency, especially in practical deployments. This research explores image tampering localization through multi-scale feature fusion, with specific contributions as follows: (1) To address insufficient modeling and perception of composite tampering scenarios, we propose a Multi-scale Iterative Tamper Detection Network (MITD-Net). This network leverages multi-scale features for accurate localization in general scenarios through a two-stage process: feature synchronization and region refinement. It employs a parallel network architecture to capture multi-scale features, integrates cross-scale local features and global context via a pixel-level feature clustering module, and addresses scale variations using an enhanced downsampling attention block. An edge enhancement module improves boundary processing. Experiments on the Columbia and CASIA datasets verify the effectiveness of MITD-Net across complex scenarios. (2) To address insufficient frequency-domain analysis and inadequate consideration of compression characteristics, we develop an Artifact Recognition and Tracing Network (ART-Net) for image splicing detection. This method overcomes the single-domain limitation of traditional methods through RGB-DCT dual-domain analysis. A JPEG artifact learning module with joint spatial-frequency perception establishes mappings between compression traces and visual artifacts, enhancing cross-format generalization. Experiments on the IMD20 and Spliced COCO datasets verify the effectiveness of ART-Net under various compression conditions. (3) To overcome traditional methods' reliance on micro-features, large parameter counts, and low efficiency, we propose a Mesoscopic Lightweight Tampering Localization Network (MLT-Net). It integrates multi-scale features through macro-micro feature extraction and dynamic scaling weights to emphasize mesoscopic characteristics. A dynamic cross-granularity weighting module and network pruning improve parameter efficiency. Experiments on the Coverage, Columbia, NIST16, and CASIA datasets verify the performance of MLT-Net. To address insufficient detection of complex tampering, inadequate feature extraction, and overly complex models, this study localizes digital image tampering by integrating multi-scale features. In today's digital society, which is flooded with manipulated content, this work helps curb the viral spread of fake images, safeguard the integrity of multi-modal data, rebuild the foundation of digital trust, and shape a new governance framework for cyberspace.﹀
Chinese Library Classification (CLC) number:	TP391.41; TP18
Open-access date:	2025-06-17
