Thesis Information

Thesis title (Chinese):

 Research on Automatic Summary Generation Algorithms for Short Text Based on Deep Learning

Name:

 贾星宇    

Student ID:

 G2015119    

Discipline code:

 085211    

Discipline name:

 Computer Technology

Student type:

 Master of Engineering

Degree year:

 2019    

Department:

 School of Computer Science and Technology

Major:

 Computer Technology

First supervisor:

 厍向阳    

Second supervisor:

 刘韩勇    

Thesis title (English):

 Research on Automatic Summary Generation of Short Text Based on Deep Learning    

Thesis keywords (Chinese):

 abstractive summarization ; deep learning ; ROUGE automatic evaluation

Thesis keywords (English):

 abstractive summarization ; deep learning ; ROUGE automatic evaluation

Thesis abstract (Chinese):
The main work of this thesis is divided into the following three parts. (1) Summary generation with seq2seq+attention (sequence to sequence with attention). Seq2seq+attention adopts an encoding-decoding scheme: it first learns the text content, then adds the attention vector as an intermediate semantic vector to the parameters of the decoding stage, so that they jointly determine the word generated at a given moment in the decoding module. The model consists of two parts: an encoder language model that encodes the input sequence and a decoder language model that performs decoding. At every decoding step the intermediate semantic vector C is generated dynamically, and the word generated at time t is determined jointly by the output word at time t-1 and the intermediate semantic vector C produced at the current time t. (2) Optimization of the seq2seq+attention model. The model is improved by combining the attention vector with a corrected generation probability and a coverage mechanism, which resolves most of the repetition problems and out-of-vocabulary (OOV) phenomena that appear in summary generation. (3) In the experiments, the generated summaries are evaluated with both ROUGE automatic evaluation and manual evaluation. The results show that the abstractive summarization algorithm proposed in this thesis scores higher than traditional extractive summarization on ROUGE-1, ROUGE-2, and manual evaluation, and that the improved abstractive summarization based on seq2seq+attention greatly improves the completeness and coherence of document summaries.
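As a concrete illustration of the decoding step described in part (1) above, the sketch below computes the intermediate semantic vector C_t as an attention-weighted sum of encoder states and combines it with the decoder state and the previous output word to predict the word at time t. This is a minimal sketch rather than the thesis implementation; the dot-product attention scoring and the weight matrices `W_c` and `W_vocab` are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def decode_step(encoder_states, decoder_state, prev_word_embedding, W_c, W_vocab):
    """One decoding step of a seq2seq+attention model (illustrative sketch only).

    encoder_states      : (T_src, H) hidden states produced by the encoder
    decoder_state       : (H,)       decoder hidden state at time t
    prev_word_embedding : (E,)       embedding of the word generated at time t-1
    W_c                 : (H, 2H+E)  mixes C_t, the decoder state, and y_{t-1}
    W_vocab             : (V, H)     output projection onto the vocabulary
    """
    # Attention weights over the source positions (dot-product scoring).
    scores = encoder_states @ decoder_state            # (T_src,)
    alpha = softmax(scores)                            # attention distribution
    # Dynamic intermediate semantic vector C_t: weighted sum of encoder states.
    C_t = alpha @ encoder_states                       # (H,)
    # The word at time t is decided jointly by C_t, the decoder state,
    # and the word emitted at time t-1.
    mixed = np.tanh(W_c @ np.concatenate([C_t, decoder_state, prev_word_embedding]))
    vocab_dist = softmax(W_vocab @ mixed)              # probability over the vocabulary
    return vocab_dist, alpha, C_t
```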
Thesis abstract (English):
The main work of this paper is divided into three parts. (1) A seq2seq+attention (sequence-to-sequence with attention) model is used to generate summaries. Seq2seq+attention follows an encoder-decoder scheme: it first learns the text content, then adds an attention vector as an intermediate semantic vector to the decoding stage, which jointly determines the word generated at each step of the decoding module. The model consists of two parts: an encoder language model encodes the input sequence and a decoder language model decodes it. At every decoding step the intermediate semantic vector C is generated dynamically; the word generated at time t is determined jointly by the output word at time t-1 and the intermediate semantic vector C produced at the current time t. (2) Optimization of the seq2seq+attention model. The model is improved by combining the attention vector with a corrected generation probability and a coverage mechanism, which resolves most of the repetition problems and out-of-vocabulary (OOV) phenomena that occur in summary generation. (3) In the experiments, the generated summaries are evaluated with both ROUGE automatic evaluation and manual evaluation. The results show that the abstractive summarization algorithm proposed in this paper scores higher than traditional extractive summarization on ROUGE-1, ROUGE-2, and manual evaluation, indicating that the improved abstractive summarization based on seq2seq+attention greatly improves the completeness and coherence of document summaries.
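The two extensions mentioned in part (2) are sketched below in the spirit of pointer-generator networks with coverage; this is an assumption about how the corrected probability and coverage mechanism are commonly realized, not the thesis's exact formulation. The copy distribution lets OOV source words be emitted directly, and the coverage vector tracks accumulated attention to discourage repeated attention to the same source position.

```python
import numpy as np

def coverage_pointer_step(vocab_dist, alpha, coverage, p_gen, src_token_ids, vocab_size):
    """Combine generation and copy distributions and update coverage
    (illustrative sketch, not the thesis code).

    vocab_dist    : (V,)      probability over the fixed vocabulary at time t
    alpha         : (T_src,)  attention weights at time t
    coverage      : (T_src,)  sum of attention weights over previous steps
    p_gen         : float     probability of generating vs. copying
    src_token_ids : (T_src,)  ids of source tokens in an extended vocabulary,
                              so OOV source words get temporary ids >= vocab_size
    """
    n_oov = max(0, int(src_token_ids.max()) - vocab_size + 1)
    final_dist = np.zeros(vocab_size + n_oov)
    final_dist[:vocab_size] = p_gen * vocab_dist
    # Copy probability mass flows to the source tokens actually present,
    # which lets the model emit OOV words by copying them from the input.
    for pos, tok in enumerate(src_token_ids):
        final_dist[tok] += (1.0 - p_gen) * alpha[pos]
    # Coverage records how much attention each source position has already
    # received; a coverage loss such as sum(min(alpha, coverage)) penalizes
    # attending to the same position again, which reduces repetition.
    coverage_loss = np.minimum(alpha, coverage).sum()
    new_coverage = coverage + alpha
    return final_dist, new_coverage, coverage_loss
```

In pointer-generator-style models, p_gen is itself predicted from the context vector, the decoder state, and the decoder input, and the coverage loss is added to the training objective; those details are omitted here.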
CLC number:

 TP301.6    

Open date:

 2019-06-10    
