Thesis title (Chinese): | 融合语义信息的方面级文本情感分析研究 (Research on Aspect-based Text Sentiment Analysis Integrating Semantic Information) |
Name: | |
Student ID: | 21208223060 |
Confidentiality level: | Public |
Thesis language: | chi (Chinese) |
Discipline code: | 085400 |
Discipline name: | Engineering - Electronic Information |
Student type: | Master's |
Degree: | Master of Engineering (工学硕士) |
Degree year: | 2024 |
Institution: | Xi'an University of Science and Technology (西安科技大学) |
School/Department: | |
Major: | |
Research direction: | Natural Language Processing |
First supervisor: | |
First supervisor's affiliation: | |
Submission date: | 2024-06-17 |
Defense date: | 2024-05-30 |
Thesis title (English): | Research on Aspect-based Text Sentiment Analysis Integrating Semantic Information |
Keywords (Chinese): | |
Keywords (English): | online comments; attention mechanism; implicit sentiment analysis; prompt learning; large language models |
Abstract (Chinese): |
Text sentiment analysis is one of the key research directions in natural language processing, playing an important role in application scenarios such as product feedback analysis, public opinion monitoring, and personalized recommendation. With the rapid spread of social media, the volume of user comment data online has grown explosively. The rich sentiment information contained in these comments is of great value to consumers seeking to understand product characteristics, to enterprises optimizing their products and services, and to government agencies improving the quality of public services. Deeply mining the sentiment information in online media comments is therefore of great significance for both academic research and practical application. However, existing text sentiment analysis models still suffer from problems such as insufficient semantic acquisition, and amid complex and changeable online information, the large number of implicit sentiment expressions challenges model performance. In view of these problems, this study takes text comments as its research object and analyzes different types of comment text in detail, aiming to improve the accuracy of sentiment analysis models.

(1) To address the problems that existing sentiment classification models label neighborhoods inappropriately during local modeling and fail to fully represent contextual information during context modeling, this thesis proposes an aspect-based sentiment analysis model that fuses local and contextual information on top of BERT (A Local and Context Fusion Sentiment Analysis Model Based on BERT, LCA-BERT). Specifically, the static mask matrix in BERT's self-attention network is replaced with a dynamic mask matrix, allowing the model to capture local information more effectively. In addition, a "quasi"-attention computation and a deep global context method are introduced, which reduce the influence of textual noise on the model while fully capturing contextual semantic information. Comparative experiments against baseline models show that the proposed model achieves an accuracy of 94.5% and an AUC of 97.8% on aspect-based sentiment analysis tasks, validating its effectiveness.

(2) To address the problem that sentences lacking explicit sentiment features prevent models from accurately determining sentiment polarity, this thesis proposes a three-level joint prompt method for implicit sentiment analysis that incorporates large language models (A Three-Level Joint Prompt-tuning Sentiment Analysis Method Incorporating LLMs, TPISA). It combines a large language model with a locally pre-trained model and uses multi-level reasoning to derive the target's aspect and latent opinion step by step, so that the model can more easily infer the final sentiment polarity. The first two prompt levels draw on the rich world knowledge of large language models to enrich the sentiment information of the sentence; the aspect and latent opinion obtained from these two levels are then concatenated with the context as input to the third-level prompt, enabling the pre-trained model to obtain rich semantic knowledge from the label words and strengthening its learning ability. Experiments show that the proposed model outperforms baseline models on the SemEval-2014 Laptop and Restaurant datasets, reaching accuracies of 77.93% and 82.31% respectively, which demonstrates its effectiveness on implicit sentiment analysis and provides a useful reference for subsequent research in the field. |
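To make the dynamic-mask idea in contribution (1) concrete, the sketch below shows one plausible way to build an input-dependent local attention mask around an aspect term and apply it in BERT-style self-attention. The window size, the aspect-position handling, and the additive-mask formulation are illustrative assumptions, not the thesis's exact construction.

```python
import math

import torch
import torch.nn.functional as F

def dynamic_local_mask(seq_len: int, aspect_pos: int, window: int) -> torch.Tensor:
    # 0 where attention is allowed, -inf where masked, so masked positions
    # receive zero weight after softmax. (Illustrative only; the thesis's
    # dynamic mask may be constructed differently.)
    idx = torch.arange(seq_len)
    keep = (idx - aspect_pos).abs() <= window  # input-dependent neighborhood
    row = torch.where(keep, torch.zeros(seq_len),
                      torch.full((seq_len,), float("-inf")))
    return row.expand(seq_len, seq_len)

def masked_self_attention(q, k, v, mask):
    # Scaled dot-product attention with an additive mask, as in BERT.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    return F.softmax(scores + mask, dim=-1) @ v

# Toy usage: 8 tokens, aspect term at position 3, neighborhood of +/-2.
q = k = v = torch.randn(8, 16)
out = masked_self_attention(q, k, v, dynamic_local_mask(8, aspect_pos=3, window=2))
```

Unlike a static mask fixed before training, the mask here is recomputed per input from the aspect's position, which is the sense in which such a matrix is "dynamic".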
Abstract (English): |
Text sentiment analysis, a pivotal research direction within natural language processing, is applied extensively across scenarios including product analysis, public sentiment monitoring, and personalized services. With the rapid proliferation of social media, user-generated comment data on online platforms has increased markedly. The affective content embedded in these reviews helps users understand products, helps enterprises and platforms refine product quality, and enables governments to enhance the caliber of public services. Consequently, mining affective information from online media reviews holds significant research value. However, prevailing text sentiment analysis models often fall short in acquiring semantics comprehensively. Moreover, the intricate and mutable nature of online information, with its abundance of implicit emotional expressions, degrades the performance of sentiment analysis models. In light of these issues, the present study takes online textual reviews as its subject of investigation, delving into different types of review texts to address these challenges and refine the analytical models accordingly.

(1) Current sentiment classification models grapple with inadequate neighborhood labeling in local modeling and with capturing only part of the available contextual information in context modeling. In response, this thesis introduces a sentiment analysis model that harnesses BERT to fuse local and contextual semantic information, termed A Local and Context Fusion Sentiment Analysis Model Based on BERT (LCA-BERT). Specifically, the static mask matrix in BERT's self-attention network is replaced with a dynamic mask matrix, enabling the model to capture local details more effectively. Furthermore, the introduction of "quasi" attention weights and a deep global context method reduces the impact of textual noise on the model while fully acquiring contextual semantic information. Comparative experiments with baseline models demonstrate that the proposed model achieves an accuracy of 94.5% and an AUC of 97.8% on aspect-based sentiment analysis tasks, validating its effectiveness.

(2) In response to the absence of explicit sentiment cues in some sentiment expressions, which leaves models struggling to discern sentiment polarity accurately, this thesis proposes a Three-Level Joint Prompt-tuning Sentiment Analysis Method Incorporating Large Language Models (TPISA). The method integrates large language models (LLMs) with a locally pre-trained model, employing multi-level inference to progressively deduce the target aspect and latent opinion, thereby enabling the model to more readily infer the final sentiment polarity. The first two prompt levels leverage the extensive world knowledge of LLMs to enrich the emotional information within sentiment sentences. The aspects and latent opinions derived from these two levels are then concatenated with the context and serve as input to the third-level prompt, enabling the pre-trained model to acquire rich semantic knowledge from the label words and enhance its learning capability. Experiments confirm that the proposed model outperforms baseline models on the SemEval-2014 Laptop and Restaurant datasets, with accuracy rates of 77.93% and 82.31%, respectively. This demonstrates the model's effectiveness in implicit sentiment analysis and offers valuable insights for subsequent research in the field. |
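As a minimal sketch of the three-level prompting pipeline described above, the following chains a generic LLM callable for the first two levels (aspect, then latent opinion) and a masked-language-model cloze scorer for the third. The prompt wording, the `llm`/`mlm_fill` callables, and the label-word verbalizer are hypothetical placeholders, not the thesis's actual templates.

```python
from typing import Callable

def tpisa_pipeline(context: str,
                   llm: Callable[[str], str],
                   mlm_fill: Callable[[str], str]) -> str:
    # Level 1: ask the LLM which aspect the sentence discusses.
    aspect = llm(f"Sentence: {context}\nWhich aspect of the target is discussed?")
    # Level 2: ask the LLM for the latent opinion about that aspect.
    opinion = llm(f"Sentence: {context}\nWhat latent opinion is expressed "
                  f"about {aspect}?")
    # Level 3: concatenate context, aspect, and opinion into a cloze prompt
    # for the local pre-trained model; `mlm_fill` returns the word the MLM
    # predicts for the [MASK] slot.
    filled = mlm_fill(f"{context} The {aspect} is {opinion}. "
                      f"Overall it was [MASK].")
    # A verbalizer maps the predicted label word to a sentiment polarity.
    verbalizer = {"great": "positive", "fine": "neutral", "terrible": "negative"}
    return verbalizer.get(filled, "neutral")
```

The design point is that the first two levels externalize world knowledge from the LLM as explicit text, so the final cloze step faces an easier, more explicit classification problem than the original implicit sentence.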
CLC number: | TP391 |
Open access date: | 2024-06-18 |