Journal: 《计算机应用研究》 (Application Research of Computers)

A TextRank-Based Keyword Extraction Algorithm for Single Documents

Abstract

TextRank is a classic algorithm for keyword extraction and automatic summarization. It treats a text as a set of words and iteratively computes node weights over a word graph to uncover latent semantic relationships between words. Building on the TextRank graph model, this paper combines the graph with a Markov state-transition model and proposes a new model and algorithm, TextRank Revised (TR-R), in which the weight of an edge between two nodes is a conditional probability. Validation on labeled and unlabeled test sets shows that, without increasing time complexity, the word ranking that TR-R computes from a single text agrees more closely with manually extracted keywords than the ranking produced by the original TextRank.
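The abstract does not give the TR-R formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the conditional probability on an edge u → v can be estimated as the share of u's in-window successors that are v, which makes the graph a first-order Markov chain over words. The function name `textrank_conditional` and the parameters `window`, `damping`, and `iterations` are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def textrank_conditional(tokens, window=2, damping=0.85, iterations=50):
    """Rank words with a TextRank-style iteration on a directed word graph.

    Assumption (not taken from the paper): the weight of edge u -> v is an
    estimate of P(v | u), computed as the fraction of u's in-window
    successors that are v.
    """
    if not tokens:
        return []

    # Count how often v appears within `window` positions after u.
    follows = defaultdict(Counter)
    for i, u in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            if u != v:
                follows[u][v] += 1

    # Normalise each row so outgoing weights form a probability distribution.
    transition = {
        u: {v: count / sum(successors.values()) for v, count in successors.items()}
        for u, successors in follows.items()
    }

    vocab = sorted(set(tokens))
    n = len(vocab)
    scores = {w: 1.0 / n for w in vocab}

    for _ in range(iterations):
        # Words without successors spread their score evenly over all words.
        dangling = sum(scores[w] for w in vocab if w not in transition)
        updated = {w: (1 - damping) / n + damping * dangling / n for w in vocab}
        for u, successors in transition.items():
            for v, prob in successors.items():
                updated[v] += damping * scores[u] * prob
        scores = updated

    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    text = "graph model ranks words and the graph model links words in a graph"
    for word, score in textrank_conditional(text.split())[:5]:
        print(f"{word}\t{score:.4f}")
```

Because each row of transition weights sums to one, the scores remain a probability distribution across iterations, and the cost per iteration stays proportional to the number of edges, as in ordinary TextRank. A real pipeline would typically filter stop words and restrict candidates by part of speech before building the graph.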
