
Term conflation methods in information retrieval: non-linguistic and linguistic approaches

Abstract

Purpose – To propose a categorization of the different conflation procedures under two basic approaches, non-linguistic and linguistic techniques, and to justify the application of normalization methods within the framework of linguistic techniques. Design/methodology/approach – Presents a range of term conflation methods that can be used in information retrieval. Uniterm and multiterm variants can be considered equivalent units for the purposes of automatic indexing. Stemming algorithms, segmentation rules, association measures and clustering techniques are well-evaluated non-linguistic methods, and experiments with these techniques show a wide variety of results. Alternatively, lemmatisation and syntactic pattern-matching, through equivalence relations represented in finite-state transducers (FSTs), are emerging methods for the recognition and standardization of terms. Findings – The survey attempts to point out the positive and negative effects of the linguistic approach and its potential as a term conflation method. Originality/value – Outlines the importance of FSTs for the normalization of term variants.
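As a rough illustration of what non-linguistic conflation of uniterm variants can look like, the sketch below groups words under a shared stem by stripping suffixes. It is a toy example and not the method of the paper: the suffix list, the `min_stem` threshold, and the names `toy_stem` and `conflate` are all assumptions made for illustration, and a real system would more likely use an established stemmer.

```python
from collections import defaultdict

# Hypothetical suffix list, longest first; chosen for illustration only.
SUFFIXES = ("ations", "ation", "ings", "ing", "ers", "er", "es", "s")


def toy_stem(word: str, min_stem: int = 4) -> str:
    """Strip the first matching suffix, keeping at least `min_stem` characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def conflate(terms):
    """Group terms that reduce to the same toy stem."""
    groups = defaultdict(list)
    for term in terms:
        groups[toy_stem(term)].append(term)
    return dict(groups)


groups = conflate(["index", "indexes", "indexing", "indexer", "retrieval"])
```

Here the four variants of "index" end up in one group, which is the sense in which they can be treated as equivalent units for indexing. Purely string-based rules like these over- and under-conflate easily, which is one reason evaluations of such techniques vary so widely.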
