Journal: Language Resources and Evaluation

SICK through the SemEval glasses. Lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment


Abstract

This paper is an extended description of SemEval-2014 Task 1, the task on the evaluation of Compositional Distributional Semantics Models on full sentences. Systems participating in the task were presented with pairs of sentences and were evaluated on their ability to predict human judgments on (1) semantic relatedness and (2) entailment. Training and testing data were subsets of the SICK (Sentences Involving Compositional Knowledge) data set. SICK was developed with the aim of providing a proper benchmark to evaluate compositional semantic systems, though task participation was open to systems based on any approach. Taking advantage of the SemEval experience, in this paper we analyze the SICK data set, in order to evaluate the extent to which it meets its design goal and to shed light on the linguistic phenomena that are still challenging for state-of-the-art computational semantic systems. Qualitative and quantitative error analyses show that many systems are quite sensitive to changes in the proportion of sentence pair types, and degrade in the presence of additional lexico-syntactic complexities which do not affect human judgments. More compositional systems seem to perform better when the task proportions are changed, but the effect needs further confirmation.
