Conference: 9th International Conference on Language Resources and Evaluation

Comparing Similarity Measures for Distributional Thesauri

Abstract

Distributional thesauri have been applied to a variety of tasks involving semantic relatedness. In this paper, we investigate the impact of three parameters: similarity measures, frequency thresholds and association scores. We focus on the robustness and stability of the resulting thesauri, measuring inter-thesaurus agreement across different parameter values. The results show that low-frequency thresholds affect thesaurus quality more than similarity measures do, with agreement rising as the threshold increases. These results indicate the sensitivity of distributional thesauri to frequency. Nonetheless, the observed differences do not carry over to extrinsic evaluation using TOEFL-like questions. While this may be specific to the task, we argue that the stability of distributional resources should be examined carefully before they are applied.
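
The idea of inter-thesaurus agreement can be illustrated with a minimal sketch. The abstract does not say which similarity measures or which agreement metric the authors use, so everything below is an assumption made for illustration: cosine similarity over sparse context counts, a frequency threshold applied to target words, and agreement computed as the mean Jaccard overlap of top-k neighbour lists. The function names `cosine`, `build_thesaurus` and `agreement` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse context vectors (dicts)."""
    shared = set(u) & set(v)
    dot = sum(u[f] * v[f] for f in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_thesaurus(vectors, freq, min_freq, k=2):
    """Map each sufficiently frequent word to its k nearest neighbours.

    Assumption: the frequency threshold filters target words only.
    """
    targets = [w for w in vectors if freq[w] >= min_freq]
    thesaurus = {}
    for w in targets:
        scored = [(cosine(vectors[w], vectors[o]), o) for o in targets if o != w]
        # Highest similarity first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        thesaurus[w] = [o for _, o in scored[:k]]
    return thesaurus


def agreement(t1, t2):
    """Mean Jaccard overlap of neighbour lists over words in both thesauri.

    Assumption: Jaccard overlap stands in for the (unspecified) agreement
    metric of the paper.
    """
    shared = set(t1) & set(t2)
    if not shared:
        return 0.0
    total = 0.0
    for w in shared:
        a, b = set(t1[w]), set(t2[w])
        total += len(a & b) / len(a | b) if (a | b) else 1.0
    return total / len(shared)


# Toy data, invented purely for illustration.
vectors = {
    "cat": {"purr": 2, "pet": 1},
    "dog": {"bark": 2, "pet": 1},
    "car": {"drive": 3},
}
freq = {"cat": 10, "dog": 8, "car": 1}

low = build_thesaurus(vectors, freq, min_freq=1, k=1)
high = build_thesaurus(vectors, freq, min_freq=5, k=1)
print(agreement(low, high))
```

Building one thesaurus per parameter setting and comparing them pairwise in this way gives a rough picture of how stable the neighbour lists are when the threshold or the similarity function changes.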