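Thesis title (Chinese): | 多维度网络流量异常检测的研究 |
Name: | |
Student ID: | 19207040021 |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 081001 |
Discipline name: | Engineering - Information and Communication Engineering - Communication and Information Systems |
Student type: | Master's |
Degree level: | Master of Engineering |
Degree year: | 2022 |
Degree-granting institution: | Xi'an University of Science and Technology |
Department: | |
Major: | |
Research direction: | Internet of Things security |
First supervisor name: | |
First supervisor affiliation: | |
Thesis submission date: | 2022-06-17 |
Thesis defense date: | 2022-06-06 |
Thesis title (foreign language): | Research on Multi-dimensional Network Traffic Anomaly Detection |
Keywords (translated from Chinese): | Traffic anomaly ; multi-dimensional features ; CNN_LSTM ; MOEA/D-ADE-levy ; hyperparameter optimization |
Keywords (foreign language): | Traffic anomalies ; multi-dimensional features ; CNN_LSTM ; MOEA/D-ADE-levy ; optimizing hyperparameters |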
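Abstract (translated from Chinese): |
Network traffic is the carrier of information transmission, and its volume grows with Internet penetration. While users enjoy the convenient communication that high-speed traffic brings, many attackers exploit network vulnerabilities to attack networks, causing traffic anomalies and serious economic losses. Because attack behavior is uncertain, the resulting traffic changes are abrupt and the cause of an anomaly is hard to identify. Network traffic is also high-dimensional, complex, and diverse, so processing all of its dimensions uniformly cannot uncover the topology-based and time-based relationships within complex high-dimensional traffic, nor the hidden correlation between network topology information and time-varying information.
To address these problems, this thesis divides complex multi-dimensional network traffic into temporal and spatial features and processes them separately. A CNN processes spatial feature data, such as the hop count from the source IP to the destination host and the transport protocol, extracting local information among the data and reducing high-dimensional complexity. An LSTM, with its temporal memory, extracts the temporal characteristics of the traffic data. An attention mechanism adaptively assigns weights to handle the correlation between temporal and spatial features. On this basis, the thesis proposes a dual-channel network traffic anomaly detection model based on CNN_LSTM.
However, hyperparameters are hard to choose when the CNN learns the spatial features of network traffic, and manual tuning consumes substantial computing resources. The thesis therefore proposes a multi-objective evolutionary algorithm that applies different mutation strategies to different groups of the population and adds random perturbation (MOEA/D-ADE-levy) to tune the CNN hyperparameters. The algorithm takes the CNN hyperparameters as its input and uses the loss function as the fitness function. First, a mixed-level orthogonal experiment yields uniformly distributed weight vectors, which an improved Tchebycheff mechanism uses to decompose the problem into subproblems and obtain a uniformly distributed initial population, that is, the initial CNN hyperparameters. Second, individuals are divided by fitness value into excellent, intermediate, and poor individuals; a different mutation strategy is applied to each group, and the mutation factor F and crossover probability CR are set adaptively, improving the convergence and diversity of the non-dominated solution set. Finally, a Levy random perturbation is added to solution sets trapped in a local optimum, strengthening global search so that they can escape it. After a number of iterations, the optimal network hyperparameters are obtained.
The dual-channel anomaly detection model, which combines the tuned CNN with the LSTM, learns spatiotemporal features and finally detects network traffic anomalies through a softmax classifier. Once the model identifies the attack type behind a traffic anomaly, measures can be taken promptly to stop the attack and protect the network environment. The thesis compares the accuracy, precision, and recall of network traffic anomaly detection for the MOEA/D-ADE-levy-optimized CNN combined with LSTM, the unoptimized CNN combined with LSTM, and single-network traffic anomaly detection. The proposed method achieves higher accuracy than the other methods, with a classification accuracy of 98.9%. |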
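The decomposition and perturbation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the details of the improved Tchebycheff mechanism, so the standard Tchebycheff scalarization is shown instead, and the Levy step is drawn with Mantegna's algorithm, which is an assumption. The function names, the `scale` and `beta` parameters, and the example hyperparameter bounds are all hypothetical.

```python
import math
import random


def tchebycheff(objectives, weights, ideal):
    """Standard Tchebycheff scalarization: max_i w_i * |f_i - z_i|.

    Assumption: the thesis uses an *improved* variant whose form is not
    given in the abstract; this is the textbook version.
    """
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step with Mantegna's algorithm (assumed method)."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_perturb(solution, bounds, scale=0.01, rng=random):
    """Add a Levy step to each hyperparameter, then clip it to its bounds."""
    perturbed = []
    for x, (lo, hi) in zip(solution, bounds):
        x_new = x + scale * (hi - lo) * levy_step(rng=rng)
        perturbed.append(min(max(x_new, lo), hi))
    return perturbed


if __name__ == "__main__":
    # Hypothetical two-objective values for one candidate solution.
    print(tchebycheff([0.5, 0.2], weights=[0.5, 0.5], ideal=[0.0, 0.0]))

    # Hypothetical hyperparameters: learning rate and number of filters.
    bounds = [(1e-4, 1e-1), (16.0, 128.0)]
    print(levy_perturb([0.01, 64.0], bounds, rng=random.Random(0)))
```

Most Levy steps are small, but occasional long jumps occur, which is what lets a stagnating solution leave a local optimum. Clipping keeps the perturbed hyperparameters inside their search range.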
Abstract (foreign language): |
As the carrier of information transmission, network traffic grows with the penetration rate of the Internet. While netizens enjoy the convenient communication brought by high-speed traffic, many criminals exploit network vulnerabilities to attack the network, resulting in abnormal traffic and serious economic losses. Because network attacks are uncertain, traffic changes suddenly and the cause of an anomaly is difficult to identify. Network traffic itself is also high-dimensional, complex, and diverse, so processing all of its dimensions in a unified way cannot reveal the relationships based on network topology information and time-varying information within complex high-dimensional traffic, or the hidden correlation between network topology information and time-varying information.
In view of these problems, this thesis divides complex, multi-dimensional network traffic into temporal and spatial features and processes them separately. It uses a CNN to process spatial feature data, such as the number of hops from the source IP to the destination host and the transmission protocol, extracting local information between data and reducing high-dimensional complexity; it uses the temporal memory of an LSTM to extract the temporal characteristics of network traffic data; and it uses an attention mechanism to adaptively assign weights and handle the correlation between temporal and spatial features. On this basis, a two-channel network traffic anomaly detection model based on CNN_LSTM is proposed.
However, the CNN model's hyperparameters are difficult to select when it learns the spatial characteristics of network traffic, and manual parameter tuning consumes a lot of computing resources. Therefore, this thesis proposes a multi-objective evolutionary algorithm with random perturbation and different mutation strategies for different parts of the population (MOEA/D-ADE-levy) to tune the CNN hyperparameters. The algorithm uses the hyperparameters of the CNN as the input of MOEA/D-ADE-levy and uses the loss function as the fitness function. First, uniformly distributed weight vectors are obtained with a mixed-level orthogonal experiment, and these weight vectors are used in an improved Tchebycheff mechanism that decomposes the sub-problems to obtain a uniformly distributed initial population, that is, the initial CNN hyperparameters. Second, the individuals are divided by fitness value into excellent, intermediate, and poor individuals, and different mutation strategies are used for different individuals. The mutation factor F and the crossover probability CR adopt an adaptive mechanism to improve the convergence and diversity of the non-dominated solution set. Finally, a Levy random perturbation is added to solution sets that fall into a local optimum, which increases their global search ability and lets them escape the local optimum. After several iterations, the optimal network hyperparameters are obtained.
The two-channel network traffic anomaly detection model, which combines the tuned CNN with the LSTM, can learn spatiotemporal features and finally detects network traffic anomalies through a softmax classifier. For abnormal traffic caused by an attack type identified by the model, measures can be taken in time to stop the attack and protect the security of the network environment. This thesis compares the accuracy, precision, and recall of network traffic anomaly detection for the MOEA/D-ADE-levy-optimized CNN model combined with LSTM, the unoptimized CNN model combined with LSTM, and single-network traffic anomaly detection. The accuracy of the proposed method is higher than that of the other methods, and the classification accuracy of the model is as high as 98.9%. |
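The three evaluation measures named in the abstract (accuracy, precision, and recall) can be computed from predicted and true labels as in the following minimal Python sketch. The abstract does not say how precision and recall are averaged across attack classes, so macro averaging is an assumption here, and the function name and the class labels in the usage example are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) for label sequences.

    Assumption: per-class precision and recall are macro-averaged; the
    thesis abstract does not state which averaging it uses.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls = [], []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # A class with no predictions (or no true samples) scores 0.0.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)

    macro_precision = sum(precisions) / len(precisions)
    macro_recall = sum(recalls) / len(recalls)
    return accuracy, macro_precision, macro_recall


if __name__ == "__main__":
    # Hypothetical traffic classes, for illustration only.
    y_true = ["normal", "normal", "dos", "dos", "probe"]
    y_pred = ["normal", "dos", "dos", "dos", "probe"]
    acc, prec, rec = classification_metrics(y_true, y_pred)
    print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

Reporting precision and recall alongside accuracy matters for traffic data because attack classes are often much rarer than normal traffic, and accuracy alone can hide poor detection of a rare class.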
Chinese Library Classification number: | TP393.08 |
Public release date: | 2022-06-20 |