Thesis title (Chinese): | Research on Adversarial Example Generation Methods for Image Recognition Models |
Name: | |
Student ID: | 20208223048 |
Confidentiality level: | Confidential (open after 1 year) |
Language: | Chinese |
Discipline code: | 085400 |
Discipline: | Engineering - Electronic Information |
Student type: | Master's student |
Degree: | Master of Engineering |
Degree year: | 2023 |
Degree-granting institution: | Xi'an University of Science and Technology |
Department: | |
Major: | |
Research area: | Artificial intelligence security |
First supervisor: | |
First supervisor's institution: | |
Submission date: | 2023-06-20 |
Defense date: | 2023-06-06 |
Thesis title (English): | Research on Adversarial Example Generation Methods for Image Recognition Models |
Keywords (Chinese): | |
Keywords (English): | Adversarial Example; Black-box Attacks; Particle Swarm Optimization; Neighborhood Redistribution; Subspace Random Sampling |
Abstract (Chinese): |
With the development of artificial intelligence technology, deep neural networks have been widely applied in fields such as autonomous driving, face recognition, and object detection, and their security problems have gradually attracted attention. Some image recognition models based on deep neural networks are susceptible to malicious perturbations added to the original samples and output incorrect recognition results, exposing latent security vulnerabilities. To uncover these vulnerabilities and then take defensive measures that improve model security and robustness, effective adversarial example generation methods need to be studied.

Among adversarial example generation methods, generating adversarial examples through black-box attacks is more common in real-world scenarios and therefore of practical research value. Existing black-box attack methods still have two problems: (1) they usually need to query the target model frequently, so the attack is inefficient and easily detected, causing it to fail; (2) to deceive the target model effectively, it is hard to avoid adding excessive perturbation to the original sample, which degrades the image quality of the adversarial examples.

To address these problems, this thesis carries out the following work:

(1) To address the low attack efficiency of most black-box attack methods, this thesis combines intelligent evolutionary algorithms and proposes a black-box attack method based on topology-adaptive particle swarm optimization to generate adversarial examples quickly. First, an initial population of adversarial examples is randomly generated from the original image. Then, the perturbation of each sample is computed from neighborhood information and iterated within the search space; a dynamic penalty-term coefficient controls each sample's fitness value, and a memory search strategy with neighborhood redistribution accelerates the search. Finally, the generated samples are pruned to obtain the final adversarial examples. When attacking an InceptionV3-based target model on the ImageNet dataset, this method reduces the average number of queries to the target model by 11% compared with other methods, achieving higher attack efficiency.

(2) To address the poor image quality of the adversarial examples generated by most black-box attack methods, this thesis proposes a black-box attack method based on regional subspace random sampling, which generates weak perturbations while preserving adversariality. First, a class activation map (heat map) divides the image into regions of different attention, and an initial perturbation set is randomly generated on the high-attention region. Then, probability weights are assigned to select the better perturbations, from which a sampling subspace is computed. Finally, the perturbation set generated in the sampling subspace is merged into the initial set and iteration continues; a re-evolution strategy further reduces the perturbation and improves the image quality of the adversarial examples. When attacking an InceptionV3-based target model on the ImageNet dataset, the average unit perturbation of the adversarial examples is 38% lower than that of other methods; the method generates weak perturbations and obtains adversarial examples of better image quality.

(3) Based on the proposed methods, an adversarial example generation system is designed and developed, providing adversarial attack parameter setting, adversarial example comparison, and metric visualization. This prototype system verifies the practical feasibility of the proposed attack methods and supports the discovery of security vulnerabilities in artificial intelligence models. |
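The particle swarm search described in item (1) can be illustrated with a minimal sketch. It assumes only a query interface model_query(x) that returns class probabilities for an image in [0, 1]; the topology adaptation, the neighborhood-redistribution memory search, and the final pruning step are simplified here to a plain ring-topology PSO with a growing penalty coefficient, so every name and hyperparameter below is illustrative rather than the thesis implementation.

    import numpy as np

    def fitness(model_query, x_adv, x_orig, true_label, penalty):
        """Adversarial strength minus a penalty on perturbation size (one model query)."""
        probs = model_query(x_adv)
        adv_score = 1.0 - probs[true_label]          # high when the true class loses confidence
        distortion = np.linalg.norm(x_adv - x_orig)  # L2 size of the added perturbation
        return adv_score - penalty * distortion

    def pso_attack(model_query, x_orig, true_label, pop_size=20, iters=100,
                   eps=0.05, w=0.7, c1=1.5, c2=1.5, k=2):
        rng = np.random.default_rng(0)
        # Initial population: random perturbations of the original image.
        pop = np.clip(x_orig + rng.uniform(-eps, eps, (pop_size,) + x_orig.shape), 0, 1)
        vel = np.zeros_like(pop)
        pbest, penalty = pop.copy(), 0.01
        pbest_fit = np.array([fitness(model_query, p, x_orig, true_label, penalty)
                              for p in pop])
        for t in range(iters):
            penalty = 0.01 * (1 + t / iters)  # dynamic penalty: tighten distortion control over time
            for i in range(pop_size):
                # Ring topology: each particle learns from the best of its 2k+1 neighbors.
                nbrs = [(i + d) % pop_size for d in range(-k, k + 1)]
                lbest = pbest[nbrs[int(np.argmax(pbest_fit[nbrs]))]]
                r1, r2 = rng.random(2)
                vel[i] = (w * vel[i] + c1 * r1 * (pbest[i] - pop[i])
                          + c2 * r2 * (lbest - pop[i]))
                # Stay within the eps-ball around the original image and in valid pixel range.
                pop[i] = np.clip(np.clip(pop[i] + vel[i], x_orig - eps, x_orig + eps), 0, 1)
                f = fitness(model_query, pop[i], x_orig, true_label, penalty)
                if f > pbest_fit[i]:
                    pbest_fit[i], pbest[i] = f, pop[i].copy()
            best = pbest[int(np.argmax(pbest_fit))]
            # Stop as soon as the model is fooled: early exit keeps the query count low.
            if int(np.argmax(model_query(best))) != true_label:
                return best
        return pbest[int(np.argmax(pbest_fit))]

The early exit matters because the 11% figure above concerns query count: every fitness evaluation is one access to the target model, so the search should stop at the first misclassified candidate.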
Abstract (English): |
With the development of artificial intelligence technology, deep neural networks are widely used in fields such as autonomous driving, face recognition, and object detection, and their security issues have gradually attracted more and more attention. Some image recognition models based on deep neural networks are vulnerable to malicious perturbations added to the original examples and output incorrect recognition results, which exposes security vulnerabilities. To explore the potential vulnerabilities of the target model and take further defensive measures to improve its security and robustness, effective adversarial example generation methods need to be studied.

Among adversarial example generation methods, generating adversarial examples through black-box attacks is more common in realistic scenarios and has practical research value. Existing black-box attack methods still have some issues: (1) they often require frequent queries to the target model, which makes them inefficient and easy to detect, causing the attack to fail; (2) when generating adversarial examples, it is difficult to avoid adding excessive perturbation to the original example in order to deceive the target model effectively, resulting in adversarial examples with poor image quality.

To address these problems, this thesis conducts the following studies:

(1) To address the low efficiency of most black-box attack methods, this thesis combines intelligent evolutionary algorithms and proposes a black-box attack method based on topology-adaptive particle swarm optimization to generate adversarial examples quickly. First, an initial population of adversarial examples is randomly generated from the original image. Then, the perturbation of each example is computed from neighborhood information and iterated in the search space; dynamic penalty-term coefficients control the fitness values of the examples, and a memory search strategy with neighborhood redistribution is proposed to accelerate the search for adversarial examples. Finally, the generated examples are pruned to obtain the final adversarial examples. When attacking an InceptionV3-based target model on the ImageNet dataset, the proposed method reduces the average number of queries to the target model by 11% compared with other methods, achieving higher attack efficiency.

(2) To address the poor image quality of the adversarial examples generated by black-box attack methods, this thesis proposes a black-box attack method based on regional subspace random sampling, which generates weak perturbations while maintaining adversariality. First, the image is divided into regions of different attention using a class activation map, and an initial perturbation set is randomly generated on the high-attention region. Then, probability weights are assigned to select the better perturbations and compute a sampling subspace. Finally, the perturbation set generated in the sampling subspace is merged into the initial perturbation set and iteration continues, while a re-evolution strategy further reduces the perturbation and improves the image quality of the adversarial examples. When attacking an InceptionV3-based target model on the ImageNet dataset, the average unit perturbation of the adversarial examples is reduced by 38% compared with other methods. The proposed method generates weak perturbations and obtains adversarial examples with better image quality.

(3) Based on the proposed methods, an adversarial example generation system is designed and developed, providing adversarial attack parameter setting, adversarial example comparison, and metric visualization. The prototype system verifies the practical feasibility of the proposed attack methods and supports the discovery of security vulnerabilities in artificial intelligence models. |
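A similarly hedged sketch outlines the skeleton of the second method. It assumes heatmap is a per-pixel attention map with the same shape as the image (for example, one produced by a Grad-CAM-style tool) and the same model_query interface as above; the probability weighting, the subspace construction via convex combinations, and the shrink loop standing in for the re-evolution strategy are simplified illustrations, not reproductions of the thesis's algorithms.

    import numpy as np

    def subspace_attack(model_query, x_orig, heatmap, true_label,
                        n_init=30, rounds=10, n_new=10, top_k=5, eps=0.03):
        rng = np.random.default_rng(0)
        # Confine perturbations to the high-attention region of the class activation map.
        mask = (heatmap >= np.quantile(heatmap, 0.8)).astype(x_orig.dtype)
        perts = [rng.uniform(-eps, eps, x_orig.shape) * mask for _ in range(n_init)]

        def score(p):
            # How far the true class probability drops; one model query per call.
            return 1.0 - model_query(np.clip(x_orig + p, 0, 1))[true_label]

        for _ in range(rounds):
            scores = np.array([score(p) for p in perts])
            weights = (scores + 1e-8) / (scores + 1e-8).sum()  # probability weights over candidates
            # Probability-weighted pick of better perturbations to span the sampling subspace.
            idx = rng.choice(len(perts), size=top_k, replace=False, p=weights)
            basis = np.stack([perts[i] for i in idx])
            for _ in range(n_new):
                coeffs = rng.dirichlet(np.ones(top_k))  # random convex combination
                perts.append(np.clip(np.tensordot(coeffs, basis, axes=1), -eps, eps) * mask)

        best = max(perts, key=score)
        # Stand-in for re-evolution: shrink the perturbation while the attack still succeeds.
        for scale in (0.9, 0.8, 0.7, 0.6, 0.5):
            if int(np.argmax(model_query(np.clip(x_orig + best * scale, 0, 1)))) != true_label:
                best = best * scale
            else:
                break
        return np.clip(x_orig + best, 0, 1)

Sampling new candidates as convex combinations of the higher-scoring perturbations keeps the search inside the span of what already works, which is the sense in which a "sampling subspace" is used here; the final shrink loop then trades none of the misclassification for a smaller average unit perturbation.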
References: | |
CLC number: | TP391 |
Date open to public: | 2024-06-20 |