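Thesis title (Chinese): | 融合多尺度特征的图像篡改定位算法研究 |
Name: | |
Student ID: | 22208223054 |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 085400 |
Discipline name: | Engineering - Electronic Information |
Student type: | Master's |
Degree level: | Master of Engineering |
Degree year: | 2025 |
Institution: | Xi'an University of Science and Technology |
School/department: | |
Major: | |
Research direction: | Computer vision |
First supervisor name: | |
First supervisor affiliation: | |
Thesis submission date: | 2025-06-16 |
Thesis defense date: | 2025-05-29 |
Thesis title (foreign language): | Research on image tampering localization algorithm integrating multi-scale features |
Keywords (Chinese): | |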
Keywords (foreign language): | Image tampering localization; Multi-scale fusion; Attention mechanism; Boundary artifact localization; Lightweight network |
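Abstract (Chinese): |
As a core medium of information dissemination, digital images are deeply embedded in critical areas of society such as social media, judicial forensics, medical diagnosis, and news reporting. With the rapid development of image tampering techniques, the visual nature of images has given rise to new digital security risks, and technical advances keep undermining the premise that "seeing is believing", so research on image tampering localization is urgently needed. Existing methods have limitations: most target a specific tampering type and generalize poorly to composite tampering scenarios, few are designed around frequency-domain characteristics, and most suffer from an imbalance between computational complexity and storage efficiency, which is especially pronounced in deployment. This thesis studies image tampering localization methods that fuse multi-scale features. The main contents are as follows:
(1) To address the insufficient modeling and perception of composite tampering scenarios in existing localization methods, this thesis proposes a multi-scale iterative tampering localization algorithm (Multi-scale Iterative Tamper Detection Network, MITD-Net), which aims to make full use of the multi-scale features of an image and localize tampered regions accurately in general scenarios. Tampering localization is decomposed into two stages, feature synchronization and region refinement, and a parallel network architecture captures feature information at different scales. A pixel-level feature clustering module integrates cross-scale local features with global context and captures feature correlations along the spatial and channel dimensions. To handle variation in the scale of tampered regions, an enhanced downsampling attention block is constructed, and an edge enhancement module is designed to process boundary information. MITD-Net is applicable to a wide range of complex scenarios, and experiments on datasets including Columbia and CASIA verify its effectiveness.
(2) To address insufficient frequency-domain feature extraction and inadequate consideration of compression characteristics, this thesis proposes an artifact recognition and tracing algorithm for image splicing localization (Artifact Recognition and Tracing Network for Image Splicing Detection, ART-Net), which aims to overcome the single-domain limitation of traditional methods. To exploit both frequency-domain and image features, a dual-domain collaborative analysis is constructed, and a JPEG artifact learning module with joint spatial-frequency perception is designed to build an association mapping between compression traces and visual artifacts, improving the generalization of compression-trace recognition across formats. ART-Net performs well under a variety of compression conditions, and experiments on splicing datasets such as IMD20 and Spliced COCO verify its effectiveness.
(3) To address the long-standing reliance of tampering localization on microscopic features, together with large parameter counts and low localization efficiency, this thesis proposes a mesoscopic-scale lightweight tampering localization algorithm (Mesoscopic Lightweight Tampering Localization Network, MLT-Net), which aims to overcome the dependence of traditional localization on a single type of feature. A multi-scale feature collaborative prediction module is constructed that extracts high-frequency and low-frequency features at the macroscopic and microscopic levels and dynamically adjusts the scaling weights to emphasize the mesoscopic scale. To improve the localization efficiency of the network structure, a dynamically adaptive cross-granularity weighting module is proposed and pruning is applied to reduce parameters. Experiments on the Coverage, Columbia, NIST16, and CASIA datasets verify the performance of MLT-Net.
To address weak recognition of complex tampered images, insufficient feature extraction, and excessive model complexity, this thesis studies image tampering localization methods that fuse multi-scale features. In a digital society flooded with tampered content, this work helps curb the viral spread of fake images, safeguard the integrity of multi-modal data, rebuild the foundation of digital trust, and shape a new order of cyberspace governance. |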
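The idea of fusing features from several scales can be illustrated with a small NumPy sketch. It is not the thesis's MITD-Net or MLT-Net implementation: the function names (`avg_pool2`, `upsample_nearest`, `multiscale_stack`), the use of 2x2 average pooling, and the fusion by simple stacking are all assumptions chosen for illustration.

```python
import numpy as np

def avg_pool2(x):
    """Halve the height and width of an (H, W) array by 2x2 average pooling."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample_nearest(x, factor):
    """Enlarge an (H, W) array by an integer factor using nearest-neighbour repetition."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def multiscale_stack(img, levels=3):
    """Return a (levels, H, W) stack: the image seen at progressively coarser scales.

    Level 0 is the input; level k is the input averaged over 2^k x 2^k
    cells and upsampled back to full resolution, so all levels align
    pixel by pixel and can be fused by a later stage.
    """
    h, w = img.shape
    step = 2 ** (levels - 1)
    if h % step or w % step:
        raise ValueError("image sides must be divisible by 2**(levels-1)")
    cur = img.astype(np.float64)
    maps = []
    for lvl in range(levels):
        maps.append(upsample_nearest(cur, 2 ** lvl))
        if lvl < levels - 1:
            cur = avg_pool2(cur)  # move to the next coarser scale
    return np.stack(maps, axis=0)

if __name__ == "__main__":
    demo = np.arange(64, dtype=np.float64).reshape(8, 8)
    print(multiscale_stack(demo).shape)
```

In a learned model the stacked maps would typically be replaced by convolutional feature maps and the fusion by attention or learned weights; the sketch only shows how coarse and fine views are brought to a common resolution.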
Abstract (foreign language): |
Digital images, as a core medium for information dissemination, are deeply integrated into key areas of society such as social media, judicial forensics, medical diagnosis, and news communication. With the rapid development of image manipulation technologies, the visual nature of images has introduced new digital security risks. Technological breakthroughs continually challenge the notion that "seeing is believing," making research on image tampering localization urgent. Existing methods still face limitations: most studies target specific manipulation types and generalize poorly to composite tampering scenarios, few are designed around frequency-domain characteristics, and most struggle to balance computational complexity against storage efficiency, especially in practical deployments. This thesis studies image tampering localization through multi-scale feature fusion, with the following contributions:
(1) To address insufficient modeling and perception of composite tampering scenarios, we propose a Multi-scale Iterative Tamper Detection Network (MITD-Net). The network leverages multi-scale features for accurate localization in general scenarios through a two-stage process: feature synchronization and region refinement. It employs a parallel network architecture to capture multi-scale features, integrates cross-scale local features and global context via a pixel-level feature clustering module, and addresses scale variation with an enhanced downsampling attention block. An edge enhancement module handles boundary information. Experiments on datasets including Columbia and CASIA verify the effectiveness of MITD-Net in complex scenarios.
(2) To address insufficient frequency-domain feature extraction and inadequate consideration of compression characteristics, we develop an Artifact Recognition and Tracing Network (ART-Net) for image splicing detection. The method overcomes the single-domain limitation of traditional approaches through RGB-DCT dual-domain analysis. A JPEG artifact learning module with joint spatial-frequency perception establishes a mapping between compression traces and visual artifacts, improving cross-format generalization. Experiments on the IMD20 and Spliced COCO datasets verify the effectiveness of ART-Net under various compression conditions.
(3) To overcome the reliance of traditional methods on microscopic features, together with their large parameter counts and low efficiency, we propose a Mesoscopic Lightweight Tampering Localization Network (MLT-Net). It extracts high-frequency and low-frequency features at the macroscopic and microscopic levels and dynamically adjusts scaling weights to emphasize mesoscopic characteristics. A dynamic cross-granularity weighting module and network pruning reduce the parameter count. Experiments on the Coverage, Columbia, NIST16, and CASIA datasets verify the performance of MLT-Net.
To address weak detection of complex tampering, inadequate feature extraction, and overly complex models, this thesis studies digital image tampering localization that integrates multi-scale features. In a digital society flooded with manipulated content, this work helps curb the viral spread of fake images, safeguard the integrity of multi-modal data, rebuild the foundation of digital trust, and shape a new governance framework for cyberspace. |
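The frequency-domain side of an RGB-DCT analysis can be sketched as a block-wise discrete cosine transform, the same 8x8 transform that JPEG compression applies. The following is a minimal illustration under stated assumptions, not ART-Net's JPEG artifact learning module: the names `dct_matrix` and `block_dct` are hypothetical, the input is assumed to be a single-channel image, and no quantization or learning is modelled.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row uses a different scale factor
    return c

def block_dct(gray, block=8):
    """2D DCT of every non-overlapping block x block tile of an (H, W) image.

    Returns an array of shape (H/block, W/block, block, block) holding the
    coefficients of each tile; entry [..., 0, 0] is the DC term.
    """
    h, w = gray.shape
    if h % block or w % block:
        raise ValueError("image sides must be multiples of the block size")
    c = dct_matrix(block)
    tiles = (gray.astype(np.float64)
                 .reshape(h // block, block, w // block, block)
                 .transpose(0, 2, 1, 3))
    # Separable 2D transform: rows then columns, broadcast over all tiles.
    return c @ tiles @ c.T

if __name__ == "__main__":
    flat = np.full((16, 24), 10.0)
    coef = block_dct(flat)
    print(coef.shape, coef[0, 0, 0, 0])
```

For a constant tile only the DC coefficient is nonzero, while spliced or recompressed regions tend to show different coefficient statistics from their surroundings; a dual-domain model would feed such coefficients to a network alongside the RGB pixels.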
References: |
[1] 中国互联网络信息中心发布第53次《中国互联网络发展状况统计报告》[J]. 国家图书馆学刊, 2024, 33(02): 104.
[2] 陈海鹏, 刘宏昕, 潘大力. 基于边界不确定性学习的图像篡改定位方法[J/OL]. 吉林大学学报(工学版), 2025: 1–10.
[4] 张汝波, 蔺庆龙, 张天一. 基于深度学习的图像篡改检测方法综述[J/OL]. 智能系统学报, 2025: 1–22.
[5] 刘晗, 李凯旋, 陈仪香. 人工智能系统可信性度量评估研究综述[J]. 软件学报, 2023, 34(8): 3774–3792.
[12] 李伟, 黄添强, 黄丽清. 面向人脸修复篡改检测的大规模数据集[J]. 中国图象图形学报, 2024, 29(07): 1834–1848.
[13] 胡永健, 卓思超, 刘琲贝. 基于多尺度时空特征和篡改概率改善换脸检测的跨库性能[J]. 华南理工大学学报(自然科学版), 2024, 52(06): 110–119.
[19] 刘亮, 何雯晶, 张磊. 基于注意力机制的渐进式图像复制粘贴篡改检测[J]. 四川大学学报(自然科学版), 2024, 61(04): 119–126.
[20] 吴晶辉, 严彩萍, 李红. 边缘引导的双注意力图像拼接检测网络[J]. 中国图象图形学报, 2024, 29(02): 430–443.
[21] 陈海鹏, 张世博, 吕颖达. 多尺度感知与边界引导的图像篡改检测方法[J/OL]. 吉林大学学报(工学版), 2025: 1–8.
[76] 张汝波, 蔺庆龙, 张天一. 基于深度学习的图像篡改检测方法综述[J/OL]. 智能系统学报, 2025: 1–22.
[78] 李树原, 严彩萍, 李红. 用于图像篡改检测的混合Transformer网络[J]. 计算机辅助设计与图形学学报, 2024, 36(12): 2010–2019. |
Chinese Library Classification (CLC) number: | TP391.41; TP18 |
Public release date: | 2025-06-17 |