Language Model Transformers as Evaluators for Open-domain Dialogues

Abstract

Computer-based systems for communicating with humans have been a cornerstone of AI research since the 1950s. So far, the most effective way to assess the quality of the dialogues produced by these systems is resource-intensive manual labor rather than automated means. In this work, we investigate whether language models (LMs) based on transformer neural networks can indicate the quality of a conversation. In a general sense, language models are methods that learn to predict one or more words from an already given context. Due to their unsupervised nature, they are candidates for efficient, automatic indication of dialogue quality. We demonstrate a positive correlation between the outputs of the language models and the scores assigned by human evaluators. We also provide some insights into their behavior and inner workings in a conversational context.
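To make the idea concrete, the sketch below shows one way a pretrained transformer language model could be used to score a dialogue response: by the average log-likelihood it assigns to the response tokens given the preceding context. The choice of GPT-2, the Hugging Face transformers library, and the helper function name are illustrative assumptions for this example, not the paper's exact setup.

```python
# Minimal sketch (illustrative, not the paper's exact setup): score a dialogue
# response by the average log-likelihood a pretrained transformer LM assigns to it,
# conditioned on the dialogue context. GPT-2 and the Hugging Face `transformers`
# library are assumptions made for this example.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def response_log_likelihood(context: str, response: str) -> float:
    """Average per-token log-likelihood of `response` given `context`."""
    context_ids = tokenizer.encode(context)
    response_ids = tokenizer.encode(response)
    input_ids = torch.tensor([context_ids + response_ids])
    with torch.no_grad():
        logits = model(input_ids).logits            # (1, seq_len, vocab_size)
    # Log-probability of each token given all tokens before it.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = input_ids[0, 1:]
    token_lp = log_probs[torch.arange(targets.size(0)), targets]
    # Keep only the positions where a response token is being predicted.
    response_lp = token_lp[len(context_ids) - 1:]
    return response_lp.mean().item()

# Example: a higher (less negative) score marks a response the LM finds
# more plausible as a continuation of the context.
print(response_log_likelihood("Hi, how are you today?",
                              " I'm doing great, thanks for asking!"))
```

Such an LM-derived score requires no labeled data, which is what makes it a candidate for the efficient, automatic quality indication described in the abstract; the paper reports that signals of this kind correlate positively with human evaluators' scores.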
