Conference: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing

Online detection of vocal Listener Responses with maximum latency constraints

Abstract

When human listeners utter Listener Responses (e.g., back-channels or acknowledgments) such as ‘yeah’ and ‘mmhmm’, interlocutors commonly continue to speak, or resume speaking, before the listener has finished the response. This interactivity produces the frequent speech overlap that is common in human-human conversation. To allow the same interactivity between humans and spoken dialog systems, and thereby make human-machine interaction smoother, more continuous, and more human-like, we propose an online classifier that can classify incoming speech as Listener Responses. We show that vocal Listener Responses can be detected under maximum latency thresholds of 100–500 ms, obtaining equal error rates ranging from 34% to 28% with an energy-based voice activity detector.
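
The abstract reports its results as equal error rates (EER) without defining the term. The EER is the operating point at which the false-alarm rate (other speech accepted as a Listener Response) equals the miss rate (Listener Responses rejected). The following is a minimal sketch of how such a figure could be computed from classifier scores. It is not the authors' evaluation code, and the function names, score convention (higher means "more likely a Listener Response"), and toy data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def error_rates(scores: Sequence[float], labels: Sequence[int],
                threshold: float) -> Tuple[float, float]:
    """Return (false_alarm_rate, miss_rate) at a given threshold.

    Assumption: label 1 marks a Listener Response, label 0 marks other
    speech, and a score >= threshold is classified as a Listener Response.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    false_alarms = sum(1 for s in negatives if s >= threshold)
    misses = sum(1 for s in positives if s < threshold)
    return false_alarms / len(negatives), misses / len(positives)


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Approximate the EER by sweeping thresholds over the observed scores.

    The threshold where the two error rates are closest is selected, and
    their average at that point is returned.
    """
    # One extra threshold above every score covers the "reject all" case.
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    best_gap = None
    best_eer = None
    for threshold in candidates:
        fa, miss = error_rates(scores, labels, threshold)
        gap = abs(fa - miss)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (fa + miss) / 2
    return best_eer


if __name__ == "__main__":
    # Hypothetical toy scores, not data from the paper.
    toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
    toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
    print(equal_error_rate(toy_scores, toy_labels))  # prints 0.25
```

On the toy data, a threshold of 0.6 yields one false alarm out of four and one miss out of four, so both error rates are 25%. Read this way, the reported range of 34% to 28% means that at the balanced operating point roughly a third of decisions of each kind are wrong.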
