
ArA*summarizer: An Arabic text summarization system based on subtopic segmentation and using an A* algorithm for reduction

Abstract

Automatic text summarization lies at the intersection of natural language processing and information retrieval. Its main objective is to automatically produce a condensed, representative form of a document. This paper presents ArA*summarizer, an automatic system for Arabic single-document summarization. The system is based on an unsupervised hybrid approach that combines statistical, cluster-based, and graph-based techniques. The main idea is to divide the text into subtopics and then select the most relevant sentences from the most relevant subtopics. The selection is performed by an A* algorithm executed on a graph representing the different lexical-semantic relationships between sentences. Experiments were conducted on the Essex Arabic Summaries Corpus using the recall-oriented understudy for gisting evaluation (ROUGE), automatic summarization engineering (AutoSummENG), merged model graphs (MeMoG), and n-gram graph powered evaluation via regression (NPowER) evaluation metrics. The evaluation results showed that the system performs well compared with existing work.
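
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the cost function, the graph weights, or the heuristic, so everything below is an assumption made for illustration. The function name `select_sentences`, the Jaccard word overlap (standing in for the lexical-semantic relationships), the frequency-based relevance score, and the step cost are all hypothetical, and subtopic segmentation is omitted entirely.

```python
import heapq
from collections import Counter
from itertools import count


def jaccard(a, b):
    """Word-overlap similarity between two token sets (assumed stand-in
    for the lexical-semantic links mentioned in the abstract)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_sentences(sentences, k):
    """Pick k sentence indices, in document order, with A* search.

    Assumed (hypothetical) cost of picking sentence j after sentence i:
        (1 - relevance[j]) + jaccard(i, j)
    so relevant sentences that do not repeat the previous pick are cheap.
    """
    if k <= 0 or k > len(sentences):
        return []
    tokens = [set(s.lower().split()) for s in sentences]
    # Number of sentences each word appears in.
    df = Counter(w for t in tokens for w in t)
    top = max(df.values())
    # Relevance in [0, 1]: mean normalised frequency of the sentence's words.
    rel = [sum(df[w] for w in t) / (top * len(t)) if t else 0.0 for t in tokens]
    min_step = min(1.0 - r for r in rel)  # lower bound on any single step

    def h(picked):
        # Admissible heuristic: every remaining pick costs at least min_step.
        return (k - picked) * min_step

    tie = count()  # tie-breaker so the heap never has to compare paths
    frontier = [(h(0), next(tie), 0.0, ())]
    best = {}  # cheapest known cost per (last sentence, number picked)
    while frontier:
        _, _, g, path = heapq.heappop(frontier)
        if len(path) == k:
            return list(path)
        last = path[-1] if path else -1
        for j in range(last + 1, len(sentences)):
            redundancy = jaccard(tokens[last], tokens[j]) if path else 0.0
            g2 = g + (1.0 - rel[j]) + redundancy
            state = (j, len(path) + 1)
            if g2 < best.get(state, float("inf")):
                best[state] = g2
                heapq.heappush(
                    frontier, (g2 + h(len(path) + 1), next(tie), g2, path + (j,))
                )
    return []
```

Calling `select_sentences(sentences, k)` on a list of sentence strings returns the indices of the chosen sentences. Because the heuristic never overestimates the remaining cost, the first complete path popped from the queue is a cheapest one under the assumed cost; a real system would replace both the relevance score and the similarity measure with the ones defined in the paper.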

Bibliographic details

  • Source
    Expert Systems | 2020, No. 2 | e12476.1-e12476.19 | 19 pages
  • Authors

  • Author affiliations

    Univ Djilali Bounaama Khemis Miliana Dept Math & Comp Sci Rue Thniet El Had Khemis Miliana Wilaya Ain Defla Ain Defla Algeria|Univ Abdelhamid Mehri Constantine 2 Dept Software Technol & Informat Syst Constantine Algeria;

    Res Ctr Sci & Tech Informat Informat Sci R&D Lab Algiers Algeria;

    Univ Abdelhamid Mehri Constantine 2 Dept Software Technol & Informat Syst Constantine Algeria;

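  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords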

    data-driven; graph theory; information extraction; mathematics; method; natural language processing; text analysis; text mining; topic identification;
