Thesis information

Thesis title (Chinese):

 多维度网络流量异常检测的研究

Name:

 汪连连

Student ID:

 19207040021

Confidentiality level:

 Public

Thesis language:

 chi

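Discipline code:

 081001

Discipline name:

 Engineering - Information and Communication Engineering - Communication and Information Systems

Student type:

 Master's

Degree level:

 Master of Engineering

Degree year:

 2022

Degree-granting institution:

 Xi'an University of Science and Technology

School:

 College of Communication and Information Engineering

Major:

 Communication and Information Systems

Research direction:

 Internet of Things security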

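First supervisor name:

 郝秦霞

First supervisor affiliation:

 Xi'an University of Science and Technology

Thesis submission date:

 2022-06-17

Thesis defense date:

 2022-06-06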

Thesis title (foreign language):

 Research on Multi-dimensional Network Traffic Anomaly Detection    

Keywords (Chinese):

 Traffic anomalies ; multi-dimensional features ; CNN_LSTM ; MOEA/D-ADE-levy ; hyperparameter optimization

Keywords (foreign language):

 Traffic anomalies ; multi-dimensional features ; CNN_LSTM ; MOEA/D-ADE-levy ; optimizing hyperparameters    

Abstract (Chinese):

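Network traffic is the carrier of information transmission, and its volume has grown with Internet penetration. While users enjoy the convenient communication that high-speed traffic brings, many attackers exploit network vulnerabilities to attack networks, causing traffic anomalies and serious economic losses. Because attack behaviour is unpredictable, the resulting traffic changes are abrupt and the cause of an anomaly is hard to identify. Network traffic is also inherently high-dimensional, complex, and diverse. Processing all of its dimensions uniformly cannot uncover the network-topology information and the time-varying information in complex high-dimensional traffic, or the hidden correlations between the two.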

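To address these problems, this thesis divides complex multi-dimensional network traffic into temporal and spatial features and processes the two separately. A CNN processes spatial feature data such as the hop count from the source IP to the destination host and the transport protocol, extracting local information among the data and reducing high-dimensional complexity. An LSTM, with its temporal memory, extracts the temporal characteristics of the traffic data. An attention mechanism adaptively assigns weights to handle the correlation between the temporal and spatial features. On this basis, the thesis proposes a dual-channel network traffic anomaly detection model based on CNN_LSTM.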
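
The fusion step can be illustrated with a minimal sketch. The abstract does not specify the form of the attention mechanism, so the dot-product scoring below, the fixed `query` vector, and all feature values are assumptions made for illustration only; in a trained model the two feature vectors would come from the CNN and LSTM channels and the scoring vector would be learned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(spatial, temporal, query):
    """Fuse two equal-length feature vectors with dot-product attention.

    spatial  : features from the CNN channel (hypothetical values)
    temporal : features from the LSTM channel (hypothetical values)
    query    : scoring vector (assumed; learned in a real model)
    """
    channels = [spatial, temporal]
    # One scalar relevance score per channel.
    scores = [sum(q * x for q, x in zip(query, ch)) for ch in channels]
    weights = softmax(scores)
    # Weighted sum of the two channels, coordinate by coordinate.
    fused = [sum(w * ch[i] for w, ch in zip(weights, channels))
             for i in range(len(query))]
    return weights, fused

spatial_feat = [0.2, 0.9, 0.1]
temporal_feat = [0.7, 0.3, 0.5]
weights, fused = attention_fuse(spatial_feat, temporal_feat, [1.0, -0.5, 0.5])
print(weights, fused)
```

Because the weights sum to one, each fused coordinate lies between the corresponding spatial and temporal values; the channel with the higher score dominates the result.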

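However, when the CNN learns the spatial features of network traffic, its hyperparameters are hard to choose, and manual tuning consumes substantial computing resources. This thesis therefore proposes a multi-objective evolutionary algorithm, MOEA/D-ADE-levy, which applies different mutation strategies to different groups of the population together with random perturbation, to tune the CNN hyperparameters. The algorithm takes the CNN hyperparameters as its input and uses the loss function as the fitness function. First, a mixed-level orthogonal experiment yields uniformly distributed weight vectors, which are applied in an improved Tchebycheff mechanism to decompose the problem into subproblems and obtain a uniformly distributed initial population, that is, the initial CNN hyperparameters. Second, individuals are ranked by the fitness values their hyperparameters produce and divided into excellent, intermediate, and poor individuals. A different mutation strategy is applied to each group, and the mutation factor F and crossover probability CR are set adaptively, which improves the convergence and diversity of the non-dominated solution set. Finally, a Levy random perturbation is added to solutions trapped in a local optimum, strengthening global search so that they can escape it. After a number of iterations the optimal network hyperparameters are obtained.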
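
A rough sketch of two of these ingredients follows. The abstract names neither the "improved" Tchebycheff variant nor the mutation strategy used for each group, so the code shows the standard Tchebycheff scalarisation and an assumed assignment of common differential-evolution operators to the three tiers, with a rank-based F. Crossover with an adaptive CR is omitted for brevity, and the function names and toy fitness are hypothetical.

```python
import random

def tchebycheff(objectives, weights, ideal):
    """Standard Tchebycheff scalarisation: max_i w_i * |f_i - z*_i|."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))

def tiered_mutation(pop, fitness, rng, f_min=0.4, f_max=0.9):
    """Rank individuals by fitness (lower is better) and mutate per tier.

    Tier-to-strategy mapping is an illustrative assumption:
      top third    -> DE/rand/1
      middle third -> DE/current-to-best/1
      bottom third -> DE/best/1
    """
    n = len(pop)
    order = sorted(range(n), key=lambda i: fitness[i])
    best = pop[order[0]]
    rank = {idx: r for r, idx in enumerate(order)}
    mutants = []
    for i, x in enumerate(pop):
        # F grows linearly with rank, so better individuals take smaller steps.
        F = f_min + (f_max - f_min) * rank[i] / max(n - 1, 1)
        a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
        if rank[i] < n / 3:
            v = [ai + F * (bi - ci) for ai, bi, ci in zip(a, b, c)]
        elif rank[i] < 2 * n / 3:
            v = [xi + F * (be - xi) + F * (ai - bi)
                 for xi, be, ai, bi in zip(x, best, a, b)]
        else:
            v = [be + F * (ai - bi) for be, ai, bi in zip(best, a, b)]
        mutants.append(v)
    return mutants

rng = random.Random(0)
pop = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(6)]
fitness = [sum(v * v for v in ind) for ind in pop]  # toy stand-in for a loss
mutants = tiered_mutation(pop, fitness, rng)
g = tchebycheff([0.3, 0.8], [0.5, 0.5], [0.0, 0.0])
print(g, mutants[0])
```

The population needs at least four individuals so that three distinct donors can be sampled for each mutation.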

经过调参的CNN模型结合LSTM的双通道网络流量异常检测模型,能够对时空特征进行学习,最终经过softmax分类器对网络流量进行异常检测。对模型识别出来的攻击类型造成的流量异常可以及时采取措施去阻止本次攻击,保护网络环境安全。本文将MOEA/D-ADE-levy优化的CNN模型和不使用优化的CNN模型结合LSTM对网络流量异常检测以及单一网络流量异常检测的正确率、精确率和召回率进行对比,所提方法对网络流量异常检测的正确率高于其他几种方法,并且模型分类的正确率高达98.9%。

Abstract (foreign language):

As the carrier of information transmission, network traffic grows in volume with the penetration rate of the Internet. While netizens enjoy the convenient communication brought by high-speed traffic, many criminals exploit network vulnerabilities to attack the network, resulting in abnormal traffic and causing serious economic losses. Because network attack behavior is uncertain, the resulting changes in network traffic are sudden and the cause of an anomaly is difficult to identify. In addition, network traffic itself is high-dimensional, complex, and diverse, so processing all of its dimensions in a unified way cannot reveal the relationship between network topology information and time-varying information in complex high-dimensional traffic, or the hidden correlation between the two.

In view of the above problems, this paper divides complex, multi-dimensional network traffic into temporal and spatial features and processes them separately. It uses a CNN to process spatial feature data, such as the number of hops from the source IP to the destination host and the transmission protocol, extracting local information between data and reducing high-dimensional complexity; uses the temporal memory of an LSTM to extract the temporal characteristics of network traffic data; and uses an attention mechanism to adaptively assign weights to deal with the correlation between temporal and spatial features. On this basis, this paper proposes a two-channel network traffic anomaly detection model based on CNN_LSTM.

However, the hyperparameters of the CNN model are difficult to select when learning the spatial characteristics of network traffic, and manual parameter tuning consumes a lot of computing resources. Therefore, this paper proposes a multi-objective evolutionary algorithm with random perturbation and different mutation strategies for different groups of the population (MOEA/D-ADE-levy) to tune the CNN hyperparameters. The algorithm uses the hyperparameters of the CNN as the input of MOEA/D-ADE-levy and uses the loss function as the fitness function. First, uniformly distributed weight vectors are obtained by a mixed-level orthogonal experiment, and these weight vectors are used in an improved Tchebycheff mechanism to decompose the sub-problems and obtain a uniformly distributed initial population, that is, the initial CNN hyperparameters. Second, individuals are divided into excellent, intermediate, and poor individuals according to the fitness values obtained from their hyperparameters, and different mutation strategies are used for different individuals. The mutation factor F and the crossover probability CR adopt an adaptive mechanism to improve the convergence and diversity of the non-dominated solution set. Finally, a Levy random perturbation is added to solutions that fall into a local optimum, which increases their global search ability and lets them escape the local optimum. After several iterations, the optimal network hyperparameters are obtained.
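
The final perturbation step can be sketched with Mantegna's algorithm, a common way to draw Levy-distributed step lengths. The abstract does not say how the Levy steps are generated or scaled, so the choice of Mantegna's method, the stability index `beta=1.5`, the `scale` factor, and the example hyperparameter bounds are all assumptions.

```python
import math
import random

def mantegna_sigma(beta):
    """Scale of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)

def levy_step(beta, rng):
    """Draw one heavy-tailed step: u / |v|^(1/beta)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against a (vanishingly unlikely) zero divisor
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def levy_perturb(x, lower, upper, rng, beta=1.5, scale=0.01):
    """Add a Levy step to each coordinate, then clip to [lower, upper]."""
    out = []
    for xi, lo, hi in zip(x, lower, upper):
        xi_new = xi + scale * (hi - lo) * levy_step(beta, rng)
        out.append(min(max(xi_new, lo), hi))
    return out

# Hypothetical hyperparameter vector: learning rate, filter count, kernel size.
lower = [0.0001, 8.0, 1.0]
upper = [0.1, 128.0, 7.0]
stuck = [0.001, 32.0, 3.0]
rng = random.Random(42)
perturbed = levy_perturb(stuck, lower, upper, rng)
print(perturbed)
```

Most Levy steps are small, but the heavy tail occasionally produces a long jump, which is what lets a stagnated solution leave its neighbourhood. Integer-valued hyperparameters would still need rounding after the perturbation.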

The two-channel network traffic anomaly detection model, which combines the tuned CNN model with the LSTM, can learn spatiotemporal features and finally detects network traffic anomalies through a softmax classifier. For abnormal traffic caused by an attack type identified by the model, measures can be taken in time to stop the attack and protect the security of the network environment. This paper compares the accuracy, precision, and recall of the MOEA/D-ADE-levy-optimized CNN model combined with LSTM, the unoptimized CNN model combined with LSTM, and single-network models for network traffic anomaly detection. The accuracy of the proposed method is higher than that of the other methods, and the classification accuracy of the model is as high as 98.9%.
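
For reference, the three reported metrics can be computed from predicted and true labels as in the sketch below. It treats one class as "positive" (for example, a single attack type), which is an assumption; the abstract does not state how the per-class figures are averaged for the multi-class case, and the labels here are made up.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is empty.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Made-up labels: 1 = anomalous flow, 0 = normal flow.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
accuracy, precision, recall = binary_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

Accuracy counts all correct decisions, precision measures how many flagged flows were truly anomalous, and recall measures how many anomalous flows were caught.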

Chinese Library Classification number:

 TP393.08

Open access date:

 2022-06-20
