Conference: ICCEE 2010, International Conference on Computer and Electrical Engineering

A Novel Thematic Term Extraction Method from Chinese Document

Abstract

Thematic terms capture the main idea of a document, and thematic term extraction is an important research area in natural language processing. This paper proposes a novel thematic term extraction method with two stages: generating a candidate thematic term set based on the position weights of terms, and extracting thematic terms based on the incremental weight of the thematic term set. The generation algorithm assigns each term a weight according to its positions in the document and then builds the candidate thematic term set from those weights. The extraction algorithm calculates the incremental weight of each candidate term and selects the terms whose incremental weight exceeds a given threshold. Experimental results on two corpora show that the overall satisfaction with the extracted thematic terms exceeds 90%.
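
The two stages described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' algorithm: the abstract gives no formulas, so the position labels and weight values in `POSITION_WEIGHTS`, the fixed candidate set size, and the definition of incremental weight (here, a term's share of the candidate set's total weight) are all assumptions, as are the function names. The input is assumed to be a Chinese document that has already been segmented into terms tagged with a position label.

```python
from collections import defaultdict

# Hypothetical position weights; the abstract does not state the real values.
POSITION_WEIGHTS = {"title": 3.0, "first_paragraph": 2.0, "body": 1.0}


def position_weights(occurrences):
    """Sum the position weight of every occurrence of each term.

    occurrences: iterable of (term, position_label) pairs from a
    document that is assumed to be segmented already.
    """
    weights = defaultdict(float)
    for term, position in occurrences:
        weights[term] += POSITION_WEIGHTS[position]
    return dict(weights)


def candidate_set(weights, size):
    """Return the `size` heaviest terms, heaviest first (ties broken by term)."""
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:size]]


def extract_thematic_terms(candidates, weights, threshold):
    """Keep candidates whose incremental weight exceeds `threshold`.

    Assumption: a term's incremental weight is the share of the candidate
    set's total weight that the term contributes. The paper may define it
    differently.
    """
    total = sum(weights[term] for term in candidates)
    if total == 0:
        return []
    return [term for term in candidates if weights[term] / total > threshold]
```

In this sketch the threshold is a fraction between 0 and 1, so raising it keeps only the terms that dominate the candidate set. Both the threshold and the candidate set size would need tuning on a real corpus.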