International Conference on Speech and Computer

Gaze, Prosody and Semantics: Relevance of Various Multimodal Signals to Addressee Detection in Human-Human-Computer Conversations



Abstract

The present research focuses on multimodal addressee detection in human-human-computer conversations. A modern spoken dialogue system operating under realistic conditions, which may include multiparty interaction (several people solving a cooperative task by addressing the system while talking to each other), is expected to distinguish machine-addressed from human-addressed utterances. Machine-addressed queries should be responded to directly, while human-addressed utterances should be either ignored or processed implicitly. We propose a multimodal system that performs visual, acoustic-prosodic, and textual analysis of users' utterances. Applying this system, we outperform the existing baseline for the Smart Video Corpus. We also investigated how the individual models perform on speech categories of varying spontaneity and found that the acoustic model has difficulty classifying constrained speech, the textual model performs worse on spontaneous speech, and the performance of the visual model drops significantly for both read and spontaneous human-addressed speech owing to the ambiguous behaviour of users.
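
The abstract describes combining gaze-based (visual), acoustic-prosodic, and textual evidence into a single machine- versus human-addressed decision. The sketch below illustrates one simple way such evidence could be fused at the score level; the ModalityScores fields, weights, and threshold are illustrative assumptions only and do not reproduce the authors' models or the Smart Video Corpus pipeline.

    # Hypothetical late-fusion sketch (not the authors' implementation): combine
    # per-modality probabilities that an utterance is machine-addressed.
    from dataclasses import dataclass

    @dataclass
    class ModalityScores:
        visual: float             # e.g. from gaze-on-screen estimation
        acoustic_prosodic: float  # e.g. from pitch/energy/tempo features
        textual: float            # e.g. from semantics of the ASR transcript

    def classify_addressee(s: ModalityScores,
                           weights=(0.4, 0.3, 0.3),
                           threshold=0.5) -> str:
        """Weighted score-level fusion; weights and threshold are illustrative."""
        fused = (weights[0] * s.visual
                 + weights[1] * s.acoustic_prosodic
                 + weights[2] * s.textual)
        return "machine-addressed" if fused >= threshold else "human-addressed"

    # User looks at the screen and issues a command-like query.
    print(classify_addressee(ModalityScores(0.9, 0.7, 0.8)))  # machine-addressed
    # Users chat with each other while glancing away from the screen.
    print(classify_addressee(ModalityScores(0.2, 0.4, 0.3)))  # human-addressed

Score-level (late) fusion is shown here only because it keeps the three modality classifiers independent, which fits the per-modality performance analysis reported in the abstract; the fusion strategy actually used by the authors is not specified in this excerpt.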
