Journal: Information Retrieval

Methods for automatically evaluating answers to complex questions



Abstract

Evaluation is a major driving force in advancing the state of the art in language technologies. In particular, automatically assessing the quality of machine output is the preferred way to measure progress, provided that the automatic metrics have been validated against human judgments. Following recent developments in the automatic evaluation of machine translation and document summarization, we present a similar approach, implemented in a measure called POURPRE, an automatic technique for evaluating answers to complex questions based on n-gram co-occurrences between machine output and a human-generated answer key. Until now, the only way to assess the correctness of answers to such questions has been to manually determine whether each information "nugget" appears in a system's response. The lack of automatic methods for scoring system output is an impediment to progress in the field, which we address with this work. Experiments with the TREC 2003, TREC 2004, and TREC 2005 QA tracks indicate that rankings produced by our metric correlate highly with official rankings, and that POURPRE outperforms direct application of existing metrics.
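To make the idea in the abstract concrete, the following is a minimal, hypothetical Python sketch of a POURPRE-style scorer: each answer-key nugget is matched against the system response by term overlap, and the resulting soft match scores are plugged into the standard TREC nugget F(beta) formula (recall over vital nuggets, length-allowance precision). The tokenization, the unigram-only matching, the 100-character allowance, and all function names are illustrative assumptions for exposition, not the authors' actual implementation.

```python
import re

# Hypothetical sketch of a POURPRE-style scorer: soft nugget matching via
# term overlap, plugged into a TREC-style nugget F(beta) measure.
# Names and parameters are illustrative, not from the original system.

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def nugget_match(nugget, answer_text):
    """Fraction of the nugget's terms that appear in the system answer (0..1)."""
    nugget_terms = set(tokenize(nugget))
    answer_terms = set(tokenize(answer_text))
    if not nugget_terms:
        return 0.0
    return len(nugget_terms & answer_terms) / len(nugget_terms)

def pourpre_style_score(vital_nuggets, okay_nuggets, answer_text, beta=5.0):
    """Nugget F(beta) using soft matches instead of human yes/no judgments."""
    vital_scores = [nugget_match(n, answer_text) for n in vital_nuggets]
    okay_scores = [nugget_match(n, answer_text) for n in okay_nuggets]

    # Recall counts only vital nuggets, as in the official nugget metric.
    recall = sum(vital_scores) / len(vital_nuggets) if vital_nuggets else 0.0

    # Length-based precision: each matched nugget earns a character allowance.
    allowance = 100 * (sum(vital_scores) + sum(okay_scores))
    length = len(answer_text)
    precision = 1.0 if length <= allowance else 1.0 - (length - allowance) / length

    if precision + recall == 0:
        return 0.0
    return (beta ** 2 + 1) * precision * recall / (beta ** 2 * precision + recall)

# Usage example: score one system response against a tiny answer key.
vital = ["founded in 1998 by Larry Page and Sergey Brin"]
okay = ["headquartered in Mountain View"]
response = "Google was founded by Larry Page and Sergey Brin in 1998."
print(round(pourpre_style_score(vital, okay, response), 3))
```

Because the match scores are continuous rather than binary, a ranking of systems by this kind of score can be compared against the official, manually judged rankings, which is the validation strategy the abstract describes.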


