Thesis information

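Thesis title (Chinese):

 基于图神经网络的嵌入式设备固件漏洞检测的研究

Name:

 Mu Taotao (慕涛涛)

Student ID:

 20207040036

Confidentiality level:

 Public

Thesis language:

 chi

Discipline code:

 0810

Discipline name:

 Engineering - Information and Communication Engineering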

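Student type:

 Master's

Degree level:

 Master of Engineering

Degree year:

 2023

Degree-granting institution:

 Xi'an University of Science and Technology (西安科技大学)

School:

 College of Communication and Information Engineering

Major:

 Information and Communication Engineering

Research direction:

 Network security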

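First supervisor's name:

 Yao Jun (姚军)

First supervisor's affiliation:

 Xi'an University of Science and Technology (西安科技大学)

Thesis submission date:

 2023-06-15

Thesis defense date:

 2023-05-30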

Thesis title (foreign language):

 Research on embedded device firmware vulnerability detection based on graph neural network

Keywords (Chinese):

 嵌入式设备 ; 漏洞检测 ; 代码属性图 ; 图神经网络 ; 注意力机制

Keywords (foreign language):

 Embedded devices ; Vulnerability detection ; Code property graphs ; Graph neural networks ; Attention mechanism

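Abstract (translated from the Chinese original):

As embedded devices grow in variety and number, their security faces serious challenges. Attackers exploit software vulnerabilities in embedded device firmware to gain control of a device and then steal user data or spread large amounts of malicious code, which poses a severe threat to embedded devices. How to find software vulnerabilities in embedded device firmware quickly and accurately has therefore long been a research focus in information security.

Machine-learning-based vulnerability detection methods must convert source code into a specific representation. Such representations retain only part of the syntactic and semantic information in the source code and cannot cover all vulnerability types. To address this problem, this thesis proposes a vulnerability detection method for firmware programs based on code property graphs and graph neural networks, which automatically detects software vulnerabilities in firmware programs at the source-code level. The research comprises the following. First, the code property graph representation is used to characterize vulnerabilities of various types; extracting its semantic information, including control dependencies, data dependencies, and syntactic structure, allows multiple vulnerability types to be detected. Second, the control and data dependencies in the program dependence graph are used to find the nodes that have a dependency relationship with the program's sensitive points; program slicing removes the redundant nodes, and the subgraph formed by the remaining nodes serves as the sliced code property graph. Finally, a graph convolutional neural network learns the graph-structure information of the source code. An attention mechanism is added to the graph convolutional network model so that the model handles relationships between nodes better. In addition, a bidirectional graph convolutional neural network is designed on top of the graph convolutional network model to capture the context of vulnerabilities, addressing the loss of some semantic information that occurs when a graph convolutional network processes code.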
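
The slicing step in the paragraph above can be illustrated with a minimal sketch. It assumes the dependence edges are already available as (source, target) pairs and that only backward dependencies of a sensitive point are kept; the abstract does not say whether the thesis also slices forward, and the function names and the toy program are hypothetical.

```python
from collections import deque


def backward_slice(dependence_edges, sensitive_nodes):
    """Return the sensitive nodes plus everything they depend on.

    dependence_edges: iterable of (source, target) pairs, read as
    "target is control- or data-dependent on source" (an assumed encoding).
    """
    # Index predecessors so the walk can go from a node to its dependencies.
    predecessors = {}
    for source, target in dependence_edges:
        predecessors.setdefault(target, set()).add(source)

    kept = set(sensitive_nodes)
    queue = deque(sensitive_nodes)
    while queue:
        node = queue.popleft()
        for dep in predecessors.get(node, ()):
            if dep not in kept:
                kept.add(dep)
                queue.append(dep)
    return kept


def slice_graph(all_edges, kept_nodes):
    """Keep only the edges whose two endpoints both survive slicing."""
    return [(s, t) for s, t in all_edges if s in kept_nodes and t in kept_nodes]


# Hypothetical toy program, one node per statement:
#   1: n = read_len()      2: buf = malloc(16)     3: log("start")
#   4: if n > 0:           5:     memcpy(buf, src, n)
edges = [(1, 4), (1, 5), (2, 5), (4, 5)]  # statement 3 has no dependence edges
kept = backward_slice(edges, [5])         # statement 5 is the sensitive point
print(sorted(kept))                       # [1, 2, 4, 5]
```

Statement 3 is dropped because nothing at the sensitive point depends on it, which is the sense in which slicing removes redundant nodes before the graph reaches the model.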

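To verify the effectiveness of the proposed method, experiments were carried out on a software vulnerability dataset collected from SARD and on a real-world vulnerability dataset. The results show that detection accuracy and F1 score reach up to 93.4% and 86.54%, respectively, which significantly improves the ability to detect software vulnerabilities and demonstrates that the proposed method is highly effective for automated vulnerability detection at the source-code level.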

Abstract (foreign language):

As embedded devices increase in variety and number, their security faces great challenges. Attackers exploit software vulnerabilities in the firmware of embedded devices to gain control of the devices and then steal user data or spread large amounts of malicious code, which poses a great threat to embedded devices. How to discover software vulnerabilities in embedded device firmware quickly and accurately has therefore been a research hotspot in the field of information security.

Machine learning-based vulnerability detection methods need to convert source code into specific representations, which retain only part of the syntactic and semantic information in the source code and cannot cover all vulnerability types. To solve this problem, this thesis proposes a firmware program vulnerability detection method based on code property graphs and graph neural networks that automatically detects software vulnerabilities in firmware programs at the source code level. The research content of this thesis is as follows. Firstly, the code property graph representation is used to characterize various types of vulnerabilities, and multiple vulnerability types are detected by extracting semantic information such as control dependencies, data dependencies, and syntax structures. Then, the nodes that have a dependency relationship with the program's sensitive points are found according to the control dependencies and data dependencies in the program dependence graph, program slicing is used to eliminate the redundant nodes, and the subgraph composed of the remaining nodes is used as the sliced code property graph. Finally, a graph convolutional neural network is used to learn the graph structure information of the source code. An attention mechanism is introduced into the graph convolutional neural network model to enable the model to better handle the relationships between nodes. In addition, based on the graph convolutional network model, this thesis designs a bidirectional graph convolutional neural network to capture the context information of vulnerabilities, which addresses the defect that graph convolutional networks lose part of the semantic information when processing code.
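
As a rough illustration of what attention-weighted, bidirectional neighbor aggregation on a directed code graph could look like, here is a minimal sketch. It is not the thesis's architecture, which the abstract does not specify: trainable weight matrices and nonlinearities are omitted, dot-product attention is an assumed choice, and all names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(node_feat, neighbor_feats):
    """Attention-weighted average of neighbor features (zeros if none)."""
    if not neighbor_feats:
        return [0.0] * len(node_feat)
    # Assumed scoring rule: dot product between the node and each neighbor.
    weights = softmax([dot(node_feat, f) for f in neighbor_feats])
    return [sum(w * f[i] for w, f in zip(weights, neighbor_feats))
            for i in range(len(node_feat))]


def bidirectional_layer(features, edges):
    """One untrained message-passing step over a directed graph.

    features: dict mapping node -> feature vector (list of floats)
    edges:    list of (source, target) pairs
    Returns a dict mapping node -> [own features | summary of
    predecessors | summary of successors].
    """
    incoming = {n: [] for n in features}
    outgoing = {n: [] for n in features}
    for source, target in edges:
        incoming[target].append(features[source])  # along edge direction
        outgoing[source].append(features[target])  # against edge direction
    return {
        n: features[n]
        + attend(features[n], incoming[n])
        + attend(features[n], outgoing[n])
        for n in features
    }


features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = [("a", "c"), ("b", "c")]
out = bidirectional_layer(features, edges)
print(out["c"])  # [1.0, 1.0, 0.5, 0.5, 0.0, 0.0]
```

Aggregating separately along and against the edge direction lets a node see both what it depends on and what depends on it, which is one plausible reading of capturing context in both directions.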

To verify the effectiveness of the proposed method, experiments are conducted on a software vulnerability dataset collected from SARD and on a real-world vulnerability dataset. The experimental results show that the vulnerability detection accuracy and F1 score reach up to 93.4% and 86.54%, respectively, which significantly improves the ability to detect software vulnerabilities and shows that the proposed method is very effective for automatic vulnerability detection at the source code level.
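
For readers comparing the two reported metrics, the sketch below shows how accuracy and F1 score are computed from confusion-matrix counts for a binary vulnerable/non-vulnerable classifier. The counts are invented for illustration and are not taken from the thesis.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 vulnerable samples, 910 non-vulnerable samples.
tp, fp, tn, fn = 80, 10, 900, 10
print(round(accuracy(tp, fp, tn, fn), 4))  # 0.98
print(round(f1_score(tp, fp, fn), 4))      # 0.8889
```

On imbalanced data, where most samples are non-vulnerable, accuracy is dominated by true negatives while F1 ignores them, so an F1 score lower than accuracy is expected.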

Chinese Library Classification (CLC) number:

 TP311.95

Open-access date:

 2023-06-15
