Journal: Information Processing & Management

Karci summarization: A simple and effective approach for automatic text summarization using Karci entropy



Abstract

The growing volume of text available via the Internet has amplified the need for automated document summarization tools, yet the quality of existing tools still needs improvement. This study proposes Karci Summarization, a novel methodology for extractive, generic summarization of text documents. It is the first document summarization method to use Karci Entropy. An important feature of the proposed system is that it does not require any external information source or training data. At the input stage, a text processing tool called RUSH (named after its authors: Karci, Uckan, Seyyarer, and Hark) is introduced to preserve semantic consistency between sentences. The Karci Entropy-based solution selects the most effective, generic, and informative sentences within a paragraph or unit of text. The approach was evaluated on the open-access Document Understanding Conference datasets DUC-2002 and DUC-2004, and its performance was measured with the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. The experimental results showed that the proposed summarizer outperformed all current state-of-the-art methods on 200-word summaries in ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-W-1.2. In addition, it outperformed the nearest competing summarizers by 6.4% in ROUGE-1 Recall on the DUC-2002 dataset. These results suggest that Karci Summarization is a promising technique that is expected to attract interest from researchers in the field. The approach was also shown to have high potential for adoption. Moreover, the method was assessed as quite insensitive to disordered and missing text, owing to its RUSH text processing module.
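
For readers unfamiliar with the ROUGE-N Recall figures quoted in the abstract, the following is a minimal sketch of how such a score is commonly computed: the number of reference n-grams that also appear in the candidate summary (with counts clipped), divided by the total number of reference n-grams. It is not the authors' evaluation setup. The function names `ngrams` and `rouge_n_recall` are hypothetical, and the lowercase whitespace tokenization is an assumption; the official ROUGE toolkit offers further options such as stemming and stop-word removal that are omitted here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap / number of reference n-grams.

    Assumption: plain lowercase whitespace tokenization, no stemming.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs
    # in the candidate (Counter returns 0 for missing n-grams).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    candidate = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n_recall(candidate, reference, n=1))
    print(rouge_n_recall(candidate, reference, n=2))
```

In this toy pair, five of the six reference unigrams and three of the five reference bigrams are recovered by the candidate, so the two scores are 5/6 and 3/5 respectively.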
