Thesis title (Chinese): | Research on Adversarial Example Generation Methods for Image Recognition Models |
Name: | |
Student ID: | 20208223048 |
Confidentiality level: | Confidential (open after 1 year) |
Language: | Chinese |
Discipline code: | 085400 |
Discipline: | Engineering - Electronic Information |
Student type: | Master's student |
Degree: | Master of Engineering |
Degree year: | 2023 |
Degree-granting institution: | Xi'an University of Science and Technology |
Department: | |
Major: | |
Research area: | Artificial intelligence security |
First supervisor: | |
First supervisor's institution: | |
Submission date: | 2023-06-20 |
Defense date: | 2023-06-06 |
Thesis title (English): | Research on Adversarial Example Generation Methods for Image Recognition Models |
Keywords (Chinese): | |
Keywords (English): | Adversarial Example; Black-box Attacks; Particle Swarm Optimization; Neighborhood Redistribution; Subspace Random Sampling |
Abstract (Chinese): |
With the development of artificial intelligence technology, deep neural networks have been widely applied in fields such as autonomous driving, face recognition, and object detection, and their security problems have gradually attracted attention. Some image recognition models based on deep neural networks are susceptible to malicious perturbations added to the original samples and output incorrect recognition results, exposing latent security vulnerabilities. To uncover these vulnerabilities and then take defensive measures that improve model security and robustness, effective adversarial example generation methods need to be studied.

Among adversarial example generation methods, generating adversarial examples through black-box attacks is more common in real-world scenarios and therefore of practical research value. Existing black-box attack methods still have two problems: (1) they usually need to query the target model frequently, so the attack is inefficient and easily detected, causing it to fail; (2) to deceive the target model effectively, it is hard to avoid adding excessive perturbation to the original sample, which degrades the image quality of the adversarial examples.

To address these problems, this thesis carries out the following work:

(1) To address the low attack efficiency of most black-box attack methods, this thesis combines intelligent evolutionary algorithms and proposes a black-box attack method based on topology-adaptive particle swarm optimization to generate adversarial examples quickly. First, an initial population of adversarial examples is randomly generated from the original image. Then, the perturbation of each sample is computed from neighborhood information and iterated within the search space; a dynamic penalty-term coefficient controls each sample's fitness value, and a memory search strategy with neighborhood redistribution accelerates the search. Finally, the generated samples are pruned to obtain the final adversarial examples. When attacking an InceptionV3-based target model on the ImageNet dataset, this method reduces the average number of queries to the target model by 11% compared with other methods, achieving higher attack efficiency.

(2) To address the poor image quality of the adversarial examples generated by most black-box attack methods, this thesis proposes a black-box attack method based on regional subspace random sampling, which generates weak perturbations while preserving adversariality. First, a class activation map (heat map) divides the image into regions of different attention, and an initial perturbation set is randomly generated on the high-attention region. Then, probability weights are assigned to select the better perturbations, from which a sampling subspace is computed. Finally, the perturbation set generated in the sampling subspace is merged into the initial set and iteration continues; a re-evolution strategy further reduces the perturbation and improves the image quality of the adversarial examples. When attacking an InceptionV3-based target model on the ImageNet dataset, the average unit perturbation of the adversarial examples is 38% lower than that of other methods; the method generates weak perturbations and obtains adversarial examples of better image quality.

(3) Based on the proposed methods, an adversarial example generation system is designed and developed, providing adversarial attack parameter setting, adversarial example comparison, and metric visualization. This prototype system verifies the practical feasibility of the proposed attack methods and supports the discovery of security vulnerabilities in artificial intelligence models. |
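The particle swarm search described in item (1) can be illustrated with a minimal sketch. It assumes only a query interface model_query(x) that returns class probabilities for an image in [0, 1]; the topology adaptation, the neighborhood-redistribution memory search, and the final pruning step are simplified here to a plain ring-topology PSO with a growing penalty coefficient, so every name and hyperparameter below is illustrative rather than the thesis implementation.

    import numpy as np

    def fitness(model_query, x_adv, x_orig, true_label, penalty):
        """Adversarial strength minus a penalty on perturbation size (one model query)."""
        probs = model_query(x_adv)
        adv_score = 1.0 - probs[true_label]          # high when the true class loses confidence
        distortion = np.linalg.norm(x_adv - x_orig)  # L2 size of the added perturbation
        return adv_score - penalty * distortion

    def pso_attack(model_query, x_orig, true_label, pop_size=20, iters=100,
                   eps=0.05, w=0.7, c1=1.5, c2=1.5, k=2):
        rng = np.random.default_rng(0)
        # Initial population: random perturbations of the original image.
        pop = np.clip(x_orig + rng.uniform(-eps, eps, (pop_size,) + x_orig.shape), 0, 1)
        vel = np.zeros_like(pop)
        pbest, penalty = pop.copy(), 0.01
        pbest_fit = np.array([fitness(model_query, p, x_orig, true_label, penalty)
                              for p in pop])
        for t in range(iters):
            penalty = 0.01 * (1 + t / iters)  # dynamic penalty: tighten distortion control over time
            for i in range(pop_size):
                # Ring topology: each particle learns from the best of its 2k+1 neighbors.
                nbrs = [(i + d) % pop_size for d in range(-k, k + 1)]
                lbest = pbest[nbrs[int(np.argmax(pbest_fit[nbrs]))]]
                r1, r2 = rng.random(2)
                vel[i] = (w * vel[i] + c1 * r1 * (pbest[i] - pop[i])
                          + c2 * r2 * (lbest - pop[i]))
                # Stay within the eps-ball around the original image and in valid pixel range.
                pop[i] = np.clip(np.clip(pop[i] + vel[i], x_orig - eps, x_orig + eps), 0, 1)
                f = fitness(model_query, pop[i], x_orig, true_label, penalty)
                if f > pbest_fit[i]:
                    pbest_fit[i], pbest[i] = f, pop[i].copy()
            best = pbest[int(np.argmax(pbest_fit))]
            # Stop as soon as the model is fooled: early exit keeps the query count low.
            if int(np.argmax(model_query(best))) != true_label:
                return best
        return pbest[int(np.argmax(pbest_fit))]

The early exit matters because the 11% figure above concerns query count: every fitness evaluation is one access to the target model, so the search should stop at the first misclassified candidate.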
Abstract (English): |
With the development of artificial intelligence technology, deep neural networks are widely used in fields such as autonomous driving, face recognition, and object detection, and their security issues have gradually attracted more and more attention. Some image recognition models based on deep neural networks are vulnerable to malicious perturbations added to the original examples and output incorrect recognition results, which exposes security vulnerabilities. To explore the potential vulnerabilities of the target model and take further defensive measures to improve its security and robustness, effective adversarial example generation methods need to be studied.

Among adversarial example generation methods, generating adversarial examples through black-box attacks is more common in realistic scenarios and has practical research value. Existing black-box attack methods still have some issues: (1) they often require frequent queries to the target model, which makes them inefficient and easy to detect, causing the attack to fail; (2) when generating adversarial examples, it is difficult to avoid adding excessive perturbation to the original example in order to deceive the target model effectively, resulting in adversarial examples with poor image quality.

To address these problems, this thesis conducts the following studies:

(1) To address the low efficiency of most black-box attack methods, this thesis combines intelligent evolutionary algorithms and proposes a black-box attack method based on topology-adaptive particle swarm optimization to generate adversarial examples quickly. First, an initial population of adversarial examples is randomly generated from the original image. Then, the perturbation of each example is computed from neighborhood information and iterated in the search space; dynamic penalty-term coefficients control the fitness values of the examples, and a memory search strategy with neighborhood redistribution is proposed to accelerate the search for adversarial examples. Finally, the generated examples are pruned to obtain the final adversarial examples. When attacking an InceptionV3-based target model on the ImageNet dataset, the proposed method reduces the average number of queries to the target model by 11% compared with other methods, achieving higher attack efficiency.

(2) To address the poor image quality of the adversarial examples generated by black-box attack methods, this thesis proposes a black-box attack method based on regional subspace random sampling, which generates weak perturbations while maintaining adversariality. First, the image is divided into regions of different attention using a class activation map, and an initial perturbation set is randomly generated on the high-attention region. Then, probability weights are assigned to select the better perturbations and compute a sampling subspace. Finally, the perturbation set generated in the sampling subspace is merged into the initial perturbation set and iteration continues, while a re-evolution strategy further reduces the perturbation and improves the image quality of the adversarial examples. When attacking an InceptionV3-based target model on the ImageNet dataset, the average unit perturbation of the adversarial examples is reduced by 38% compared with other methods. The proposed method generates weak perturbations and obtains adversarial examples with better image quality.

(3) Based on the proposed methods, an adversarial example generation system is designed and developed, providing adversarial attack parameter setting, adversarial example comparison, and metric visualization. The prototype system verifies the practical feasibility of the proposed attack methods and supports the discovery of security vulnerabilities in artificial intelligence models. |
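A similarly hedged sketch outlines the skeleton of the second method. It assumes heatmap is a per-pixel attention map with the same shape as the image (for example, one produced by a Grad-CAM-style tool) and the same model_query interface as above; the probability weighting, the subspace construction via convex combinations, and the shrink loop standing in for the re-evolution strategy are simplified illustrations, not reproductions of the thesis's algorithms.

    import numpy as np

    def subspace_attack(model_query, x_orig, heatmap, true_label,
                        n_init=30, rounds=10, n_new=10, top_k=5, eps=0.03):
        rng = np.random.default_rng(0)
        # Confine perturbations to the high-attention region of the class activation map.
        mask = (heatmap >= np.quantile(heatmap, 0.8)).astype(x_orig.dtype)
        perts = [rng.uniform(-eps, eps, x_orig.shape) * mask for _ in range(n_init)]

        def score(p):
            # How far the true class probability drops; one model query per call.
            return 1.0 - model_query(np.clip(x_orig + p, 0, 1))[true_label]

        for _ in range(rounds):
            scores = np.array([score(p) for p in perts])
            weights = (scores + 1e-8) / (scores + 1e-8).sum()  # probability weights over candidates
            # Probability-weighted pick of better perturbations to span the sampling subspace.
            idx = rng.choice(len(perts), size=top_k, replace=False, p=weights)
            basis = np.stack([perts[i] for i in idx])
            for _ in range(n_new):
                coeffs = rng.dirichlet(np.ones(top_k))  # random convex combination
                perts.append(np.clip(np.tensordot(coeffs, basis, axes=1), -eps, eps) * mask)

        best = max(perts, key=score)
        # Stand-in for re-evolution: shrink the perturbation while the attack still succeeds.
        for scale in (0.9, 0.8, 0.7, 0.6, 0.5):
            if int(np.argmax(model_query(np.clip(x_orig + best * scale, 0, 1)))) != true_label:
                best = best * scale
            else:
                break
        return np.clip(x_orig + best, 0, 1)

Sampling new candidates as convex combinations of the higher-scoring perturbations keeps the search inside the span of what already works, which is the sense in which a "sampling subspace" is used here; the final shrink loop then trades none of the misclassification for a smaller average unit perturbation.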
References: | |
CLC number: | TP391 |
Date open to public: | 2024-06-20 |