
Cross-Document Summarization by Concept Classification

Abstract

In this paper we describe XDoX, a cross-document summarizer designed specifically to summarize large document sets (50 to 500 documents or more). Such sets are typically obtained from routing or filtering systems run against a continuous stream of data, such as a newswire. XDoX works by identifying the most salient themes within the set, at a level of granularity regulated by the user, and composing an extraction summary that reflects these main themes. In the current version, XDoX is not optimized to produce a summary from a few unrelated documents; indeed, such summaries are best obtained simply by concatenating summaries of the individual documents. We show examples of summaries obtained in our own tests as well as from our participation in the first Document Understanding Conference (DUC).
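The abstract does not specify how themes are identified, so the following is only a minimal sketch of the general idea (group passages into themes, rank themes by size, extract one representative passage per theme). Everything in it is an assumption made for illustration and not XDoX's actual method: the Jaccard term-overlap similarity, the greedy single-pass grouping, the stopword list, and the hypothetical names `group_into_themes`, `summarize`, `threshold`, and `max_themes`. The `threshold` parameter stands in for the user-regulated granularity: a higher value yields more, narrower themes.

```python
import re

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "as",
             "is", "are", "was", "were", "for", "by", "with"}

def terms(text):
    """Return the set of lowercased content words in a passage."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Jaccard similarity of two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_into_themes(passages, threshold):
    """Greedy single-pass grouping: a passage joins the first theme whose
    accumulated term set is similar enough, otherwise it starts a new theme."""
    themes = []  # each theme: {"terms": set of str, "members": list of str}
    for passage in passages:
        t = terms(passage)
        for theme in themes:
            if jaccard(t, theme["terms"]) >= threshold:
                theme["members"].append(passage)
                theme["terms"] |= t
                break
        else:
            themes.append({"terms": set(t), "members": [passage]})
    return themes

def summarize(passages, threshold=0.2, max_themes=3):
    """Extract one representative passage from each of the largest themes."""
    themes = group_into_themes(passages, threshold)
    # Larger themes are treated as more salient (sort is stable for ties).
    themes.sort(key=lambda th: len(th["members"]), reverse=True)
    summary = []
    for theme in themes[:max_themes]:
        members = theme["members"]
        # Representative = the member most similar, in total, to its theme-mates.
        best = max(members,
                   key=lambda p: sum(jaccard(terms(p), terms(q)) for q in members))
        summary.append(best)
    return summary

if __name__ == "__main__":
    passages = [
        "Oil prices rose sharply on Monday.",
        "Oil prices rose again as supply fell.",
        "The election results were announced in the capital.",
        "Crude oil prices rose sharply this week.",
    ]
    for line in summarize(passages, threshold=0.2, max_themes=2):
        print(line)
```

In this toy input the three oil-related passages form the largest theme, so one of them leads the summary, followed by the lone election passage. Raising `threshold` toward 1.0 splits the input into one theme per passage, which is the kind of granularity control the abstract attributes to the user.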
