Thesis information

Thesis title (Chinese):

基于图卷积神经网络的社交机器人检测方法研究

Name:

柏亮雪

Student ID:

 21208223044    

Confidentiality level:

Confidential (open after 1 year)

Thesis language:

 chi    

Discipline code:

085400

Discipline name:

Engineering - Electronic Information

Student type:

Master's

Degree level:

Master of Engineering

Degree year:

 2024    

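Training institution:

Xi'an University of Science and Technology

School/Department:

College of Computer Science and Technology

Major:

Software Engineering

Research direction:

Anomaly detection

First supervisor name:

于振华

First supervisor affiliation:

Xi'an University of Science and Technology

Thesis submission date:

2024-06-18

Thesis defense date:

2024-05-31

Thesis title (foreign language):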

 Research on Detection Method of Social Robot Based on Graph Convolutional Neural Networks    

Keywords (Chinese):

社交网络 ; 社交机器人检测 ; 图卷积神经网络 ; 特征线性调制 ; 双向异构网络

Keywords (foreign language):

Social network ; Social robot detection ; Graph convolutional networks ; Feature-wise linear modulation ; Bidirectional heterogeneous networks

Abstract (Chinese, translated):

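In the information age, social networks have become the primary means by which people communicate and obtain information. The rise of social bots has raised concerns about network security and social stability. Because social bots spread information rapidly and are easy to operate, they have gradually become tools that individuals or organizations use to carry out malicious behavior. Potentially harmful malicious social bots can not only imitate the behavior of real users but also spread false information and launch network attacks. Such activity damages the health of cyberspace and threatens users' information security and privacy. Detecting malicious social bots is therefore essential for safeguarding public security and fostering a healthy online environment.

As social bots grow more intelligent, detecting them becomes a serious challenge. Existing social bot detection methods have the following problems: social networks contain a huge number of nodes with complex relationships, so processing them incurs high computational complexity; and the interaction process among users of a social relationship network is reciprocal, yet existing methods seldom take this reciprocity into account. In response to these problems, this thesis carries out the following research:

(1) To address the difficulty of accurately describing the differences among node topological relationships in social networks with many nodes and complex relationships, an improved graph attention convolutional network method for social bot detection is proposed. Social relationship subgraphs are built from users' social relationship networks in order to decouple complex social relationships. A graph attention residual network model with feature-wise linear modulation is constructed, introducing a feature-wise linear modulation module and a residual structure to strengthen the model's ability to learn differences between nodes. Social behavior features are constructed by combining users' text content with behavioral gene sequences, and are fused with the social relationship subgraph features to detect social bots effectively. Experimental results show that the proposed method improves accuracy on the public datasets TwiBot-20 and Cresci-15 by 2.2% and 1.35%, respectively.

(2) The interaction relationships of malicious social bots, and the direction of those interactions, are usually uniform, yet existing detection methods often overlook dynamic interaction between users. This thesis therefore proposes a method for detecting malicious social bots under multiple social relationships. The method introduces a feature-enhanced bidirectional heterogeneous network model that learns node features through both forward and reverse propagation, so that it learns interaction information between users more broadly and captures the interactivity between nodes more accurately. Next, to further improve feature representation and model robustness, a feature enhancement module is added to the heterogeneous graph convolutional network layer. Finally, interaction network features are fused with text features to detect malicious social bots, and a gradient penalty term is added to the cross-entropy loss function to prevent overfitting and improve model stability. Experimental results on the TwiBot-22 and MGTAB datasets confirm the effectiveness of the proposed method, with detection accuracy improving by 1.08% and 1.54%, respectively.

(3) Building on the methods above, this thesis designs and implements a social bot detection system. The system consists of five modules: data acquisition, data processing, feature analysis, social bot detection, and visualization. The system improves the efficiency of detecting social bots, including malicious ones, and mitigates the harm they cause to social networks and users.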
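The feature-wise linear modulation idea in item (1) can be illustrated with a minimal pure-Python sketch. It is not the thesis's model: the dot-product attention scores, the ReLU activation, the square weight matrices, and every name here (`film_attention_layer`, `W_gamma`, `W_beta`) are assumptions made for illustration. Each target node computes a per-feature scale and shift from its own features, applies them to the attention-weighted sum of its neighbors' messages, and adds its input back as a residual.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def film_attention_layer(h, neighbors, W, W_gamma, W_beta):
    """One FiLM-modulated attention layer with a residual connection.

    h:         dict mapping node -> feature vector of length d
    neighbors: dict mapping node -> list of source nodes
    W, W_gamma, W_beta: d x d matrices (hypothetical parameters)
    """
    out = {}
    for v, h_v in h.items():
        gamma = matvec(W_gamma, h_v)  # per-feature scale from the target node
        beta = matvec(W_beta, h_v)    # per-feature shift from the target node
        srcs = neighbors.get(v, [])
        if srcs:
            msgs = [matvec(W, h[u]) for u in srcs]
            q = matvec(W, h_v)
            # Assumed scoring rule: dot product of target and message.
            alpha = softmax([sum(a * b for a, b in zip(q, m)) for m in msgs])
            agg = [sum(a * m[i] for a, m in zip(alpha, msgs))
                   for i in range(len(q))]
        else:
            agg = [0.0] * len(h_v)
        modulated = [g * a + b for g, a, b in zip(gamma, agg, beta)]
        activated = [max(0.0, x) for x in modulated]      # ReLU
        out[v] = [x + r for x, r in zip(activated, h_v)]  # residual connection
    return out

# Toy graph: node "a" receives from "b" and "c"; "b" receives from "a".
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
h = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": []}
out = film_attention_layer(h, neighbors, identity, identity, zeros)
```

With the scale taken from the target node's own features, a feature that is zero on the target suppresses the corresponding incoming message, which is one way a model can treat the same neighbor differently depending on who is receiving.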

Abstract (foreign language):

In the information age, social networks have become the main channel through which people communicate and obtain information. The emergence of social bots has raised concerns about cybersecurity and social stability: because they spread information quickly and are easy to operate, they have increasingly become tools for individuals or organizations to carry out malicious acts. Malicious social bots can not only mimic the behavior of real users but also spread false information and launch cyberattacks. These activities undermine the health of cyberspace and threaten users' information security and privacy. Detecting malicious social bots is therefore crucial to maintaining public safety and promoting a healthy network environment.

As social bots become more intelligent, detecting them faces serious challenges. Existing social bot detection methods have two problems: social networks contain a very large number of nodes with complex relationships, so processing them incurs high computational complexity; and interactions between users in a social relationship network are reciprocal, yet existing methods rarely take this reciprocity into account. To address these problems, this thesis carries out the following research:

(1) Social networks have many nodes and complex relationships, which makes it difficult to accurately describe the differences among the topological relationships of nodes. To address this, an improved graph attention convolutional network method for social bot detection is proposed. Social relationship subgraphs are constructed from users' social relationship networks to decouple complex social relationships. A graph attention residual network model is built with a feature-wise linear modulation module and a residual structure to strengthen its ability to learn differences between nodes. Social behavior features are constructed by combining user text content with behavioral gene sequences, and are fused with the social relationship subgraph features to detect social bots effectively. Experimental results show that the proposed method improves accuracy on the public datasets TwiBot-20 and Cresci-15 by 2.2% and 1.35%, respectively.

(2) The interaction relationships of malicious social bots, and the direction of those interactions, are usually uniform, yet existing detection methods often overlook dynamic interaction between users. This thesis therefore proposes a method for detecting malicious social bots under multiple social relationships. First, a feature-enhanced bidirectional heterogeneous network model is proposed that learns node features through both forward and reverse propagation, so that it learns interaction information between users more broadly and captures the interactivity between nodes more accurately. Second, to further improve feature representation and model robustness, a feature enhancement module is added to the heterogeneous graph convolutional network layer. Finally, interaction network features are fused with text features to detect malicious social bots, and a gradient penalty term is added to the cross-entropy loss function to prevent overfitting and enhance model stability. Experimental results on the TwiBot-22 and MGTAB datasets validate the effectiveness of the proposed method, with detection accuracy improving by 1.08% and 1.54%, respectively.

(3) Based on the methods above, a social bot detection system is designed and implemented. The system consists of five modules: data acquisition, data processing, feature analysis, social bot detection, and visualization. Using the system improves the efficiency of detecting social bots, including malicious ones, and mitigates their harm to social networks and users.
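Item (2) mentions adding a gradient penalty term to the cross-entropy loss. The abstract does not give the exact form of that term, so the following is only a sketch under stated assumptions: a single logistic unit stands in for the network, and the penalty is taken to be the squared norm of the loss gradient with respect to the input, weighted by a hypothetical coefficient `lam`. The function name `penalized_loss` is likewise illustrative.

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def penalized_loss(w, b, x, y, lam=0.1):
    """Binary cross-entropy plus lam * ||d(CE)/dx||^2 for a logistic model.

    w, b: weights and bias of the stand-in logistic classifier
    x, y: one input vector and its 0/1 label
    lam:  penalty weight (hypothetical hyperparameter)
    Returns (total loss, cross-entropy, penalty).
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    ce = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    # For a logistic unit, the gradient of CE w.r.t. the input is (p - y) * w.
    grad_x = [(p - y) * wi for wi in w]
    penalty = sum(g * g for g in grad_x)
    return ce + lam * penalty, ce, penalty

total, ce, penalty = penalized_loss([2.0, 0.0], 0.0, [0.0, 0.0], 1, lam=0.1)
```

Penalizing the input gradient discourages predictions that swing sharply under small changes to the input features, which is one plausible reading of how such a term could limit overfitting and stabilize training.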

Chinese Library Classification number:

TP391.9

Open-access date:

 2025-06-18    
