ACM Transactions on Asian Language Information Processing

Constructing a WordNet for Turkish Using Manual and Automatic Annotation


Abstract

In this article, we summarize the methodology and results of our two-year effort to construct a comprehensive WordNet for Turkish. In our approach, we mined a dictionary for synonym candidate pairs and manually marked the senses in which the candidates are synonymous. Every pair was marked twice, by two different human annotators. We derived the synsets by finding the connected components of the graph whose edges connect synonymous senses. We also mined Turkish Wikipedia for hypernym relations among the senses. We analyzed the resulting WordNet to highlight the difficulties brought about by the dictionary construction methods of lexicographers. After splitting the unusually large synsets, we used random walk-based clustering, which resulted in a Zipfian distribution of synset sizes. We compared our results to BalkaNet and to automatic thesaurus construction methods using the variation of information metric. Our Turkish WordNet is available online.
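Two steps named in the abstract can be illustrated with a short sketch: deriving synsets as connected components of a synonymy graph, and comparing two groupings with the variation of information. This is not the authors' implementation. The sense identifiers (such as `ev#1`), the function names, and the use of a union-find structure and natural logarithms are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log


def synsets_from_pairs(synonym_pairs):
    """Group senses into synsets: connected components of the synonymy graph.

    `synonym_pairs` is an iterable of (sense_a, sense_b) edges; the sense
    identifiers are hypothetical strings, not the paper's data format.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in synonym_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    groups = defaultdict(set)
    for sense in list(parent):
        groups[find(sense)].add(sense)
    return sorted(sorted(group) for group in groups.values())


def variation_of_information(clustering_a, clustering_b):
    """VI(A, B) = H(A|B) + H(B|A), in nats, for two clusterings of the same items."""
    label_a = {item: i for i, c in enumerate(clustering_a) for item in c}
    label_b = {item: j for j, c in enumerate(clustering_b) for item in c}
    if label_a.keys() != label_b.keys():
        raise ValueError("both clusterings must cover the same items")

    n = len(label_a)
    size_a = Counter(label_a.values())
    size_b = Counter(label_b.values())
    joint = Counter((label_a[x], label_b[x]) for x in label_a)

    vi = 0.0
    for (i, j), n_ij in joint.items():
        # -p_ij * [log(p_ij / p_i) + log(p_ij / q_j)]
        vi -= (n_ij / n) * (log(n_ij / size_a[i]) + log(n_ij / size_b[j]))
    return vi


# Hypothetical annotated synonym pairs (illustrative only).
pairs = [("ev#1", "konut#1"), ("konut#1", "mesken#1"), ("araba#1", "otomobil#1")]
synsets = synsets_from_pairs(pairs)
```

Here `synsets` groups the three housing senses into one synset and the two car senses into another. A variation of information of zero means the two groupings are identical, and larger values mean they disagree more.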
