Invited Talk: Evaluating Natural Language Generation Systems

Abstract

Natural Language Generation (NLG) systems have different characteristics from other NLP systems, which affects how they are evaluated. In particular, it can be difficult to meaningfully evaluate NLG texts by comparing them against gold-standard reference texts, because (A) there are usually many possible texts that are acceptable to users and (B) some NLG systems produce texts that are better (as judged by human users) than human-written corpus texts. Partly for these reasons, the NLG community places much more emphasis on human-based evaluations than most areas of NLP do. I will discuss the various ways in which NLG systems are evaluated, focusing on human-based evaluations. These typically either measure the success of generated texts at achieving a goal (e.g., measuring how many people change their behaviour after reading behaviour-change texts produced by an NLG system) or ask human subjects to rate various aspects of generated texts (such as readability, accuracy, and appropriateness), often on Likert scales. I will use examples from evaluations I have carried out and highlight some of the lessons I have learnt, including the importance of reporting negative results, the difference between laboratory and real-world evaluations, and the need to look at worst-case as well as average-case performance. I hope my talk will be interesting and relevant to anyone who is interested in the evaluation of NLP systems.
