Source: OASIcs: OpenAccess Series in Informatics

Acquiring Domain-Specific Knowledge for WordNet from a Terminological Database

Abstract

In this research we explore a terminological database (Termoteca) in order to expand the Portuguese and Galician wordnets (PULO and Galnet) with new synset variants (word forms for a concept), usage examples for the variants, and synset glosses or definitions. The methodology aligns the concepts of WordNet (synsets) with the concepts described in Termoteca (terminological records), taking into account the lexical forms in both resources, their morphological category, and their knowledge domains. The domain information provided by the WordNet Domains Hierarchy and the Termoteca field domains is used to reduce the incidence of polysemy and homography in the results of the experiment. The results confirm our hypothesis that the combined use of the semantic domain information included in both resources makes it possible to minimise the problem of lexical ambiguity and to obtain a very acceptable precision in terminological information extraction tasks: precision is above 89% when the synset in Galnet and the Termoteca record share at least one lexical form in two or more different languages.
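
The alignment criteria named in the abstract (shared lexical forms, matching morphological category, compatible knowledge domains) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Synset` and `TermRecord` classes, the `align` function, the identifiers, and the sample data are all hypothetical, and the sketch assumes that the Termoteca domains have already been mapped onto the same label set as the WordNet Domains labels. The default threshold of two languages mirrors the condition under which the abstract reports precision above 89%.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Synset:
    """Hypothetical, simplified view of a wordnet synset."""
    synset_id: str
    pos: str                        # morphological category, e.g. "n" or "v"
    domains: set[str]               # domain labels (assumed shared label set)
    variants: dict[str, set[str]]   # language code -> lexical forms


@dataclass
class TermRecord:
    """Hypothetical, simplified view of a terminological record."""
    record_id: str
    pos: str
    domains: set[str]               # assumed already mapped to synset labels
    terms: dict[str, set[str]]      # language code -> lexical forms


def shared_languages(synset: Synset, record: TermRecord) -> set[str]:
    """Languages in which synset and record share at least one lexical form."""
    shared = set()
    for lang, forms in synset.variants.items():
        record_forms = {t.lower() for t in record.terms.get(lang, set())}
        if {f.lower() for f in forms} & record_forms:
            shared.add(lang)
    return shared


def align(synsets, records, min_languages=2):
    """Pair synsets and records that agree on category, domain, and forms."""
    matches = []
    for s in synsets:
        for r in records:
            if s.pos != r.pos:
                continue                    # different morphological category
            if not (s.domains & r.domains):
                continue                    # no common knowledge domain
            langs = shared_languages(s, r)
            if len(langs) >= min_languages:
                matches.append((s.synset_id, r.record_id, sorted(langs)))
    return matches


# Invented sample data: one synset and two records sharing the form "virus".
SYNSETS = [
    Synset("syn-1", "n", {"medicine"},
           {"en": {"virus"}, "pt": {"vírus"}, "gl": {"virus"}}),
]
RECORDS = [
    TermRecord("rec-1", "n", {"medicine"},
               {"en": {"virus"}, "pt": {"vírus"}, "gl": {"virus"}}),
    TermRecord("rec-2", "n", {"computer_science"},
               {"en": {"virus"}, "gl": {"virus"}}),
]

if __name__ == "__main__":
    print(align(SYNSETS, RECORDS))
```

In this invented sample both records share the form "virus" with the synset, but only the record in the medicine domain is aligned; the computing record is rejected by the domain check, which is the kind of homography filtering the abstract attributes to the domain information.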