Journal: Information Processing & Management

Text summarization using Wikipedia


Abstract

Automatic text summarization has been an active field of research for many years. Several approaches have been proposed, ranging from simple position and word-frequency methods to learning and graph-based algorithms. The advent of human-generated knowledge bases like Wikipedia offers a further possibility in text summarization: they can be used to understand the input text in terms of salient concepts from the knowledge base. In this paper, we study a novel approach that leverages Wikipedia in conjunction with graph-based ranking. Our approach is to first construct a bipartite sentence-concept graph, and then rank the input sentences using iterative updates on this graph. We consider several models for the bipartite graph, and derive convergence properties under each model. Then, we take up personalized and query-focused summarization, where the sentence ranks additionally depend on user interests and queries, respectively. Finally, we present a Wikipedia-based multi-document summarization algorithm. An important feature of the proposed algorithms is that they enable real-time incremental summarization: users can first view an initial summary, and then request additional content if interested. We evaluate the performance of our proposed summarizer using the ROUGE metric, and the results show that leveraging Wikipedia can significantly improve summary quality. We also present results from a user study, which suggests that incremental summarization can help readers better understand news articles.
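The abstract does not give the paper's graph models or update rules, so the following is only a minimal sketch of the general idea: build a bipartite sentence-concept graph, rank sentences by iterative updates, and serve the summary incrementally. Everything specific here is an assumption made for illustration: the function names (`build_bipartite`, `rank_sentences`, `incremental_summary`), the substring matching of concept titles (a stand-in for real Wikipedia concept mapping), and the HITS-style mutual-reinforcement update with L2 normalization.

```python
import math


def build_bipartite(sentences, concepts):
    """Edge weight 1.0 if the concept title occurs in the sentence, else 0.0.

    Assumption: plain case-insensitive substring matching stands in for
    mapping sentences to Wikipedia concepts.
    """
    return [[1.0 if c.lower() in s.lower() else 0.0 for c in concepts]
            for s in sentences]


def _normalize(values):
    """Scale a vector to unit L2 norm; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in values))
    return [x / norm for x in values] if norm > 0 else list(values)


def rank_sentences(weights, iterations=50, tol=1e-9):
    """HITS-style updates (an assumed model, not the paper's own):
    a concept scores high if it appears in high-scoring sentences,
    and a sentence scores high if it contains high-scoring concepts.
    """
    n_sent = len(weights)
    n_con = len(weights[0]) if weights else 0
    sent = _normalize([1.0] * n_sent)
    con = [0.0] * n_con
    for _ in range(iterations):
        con = _normalize([sum(weights[i][j] * sent[i] for i in range(n_sent))
                          for j in range(n_con)])
        new_sent = _normalize([sum(weights[i][j] * con[j] for j in range(n_con))
                               for i in range(n_sent)])
        delta = max((abs(a - b) for a, b in zip(new_sent, sent)), default=0.0)
        sent = new_sent
        if delta < tol:  # stop once the scores have stabilized
            break
    return sent, con


def incremental_summary(sentences, scores, start=0, count=2):
    """Return the next `count` best sentences after skipping the top `start`,
    presented in their original document order."""
    order = sorted(range(len(sentences)), key=lambda i: -scores[i])
    chosen = sorted(order[start:start + count])
    return [sentences[i] for i in chosen]


sentences = [
    "Wikipedia is a knowledge base used for text summarization.",
    "The weather was pleasant.",
    "Graph ranking helps text summarization.",
]
concepts = ["Wikipedia", "text summarization", "graph ranking", "knowledge base"]

weights = build_bipartite(sentences, concepts)
scores, concept_scores = rank_sentences(weights)
initial = incremental_summary(sentences, scores, start=0, count=1)
more = incremental_summary(sentences, scores, start=1, count=1)
```

In this sketch the reader first sees `initial` and then asks for `more`, which mirrors the incremental use described in the abstract. Ranking once and slicing the sorted order keeps each follow-up request cheap. Personalized or query-focused ranking would need an extra bias term in the update, which is not shown here because the abstract does not specify its form.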
