Journal: Computer Speech and Language

Assessing the effect of visual servoing on the performance of linear microphone arrays in moving human-robot interaction scenarios

Abstract

Social robotics is becoming a reality, and voice-based human-robot interaction (HRI) is essential for a successful human-robot collaborative symbiosis. The main objective of this paper is to assess the effect of visual servoing on the performance of a linear microphone array for distant automatic speech recognition (ASR) in a mobile, dynamic and non-stationary robotic testbed that is representative of real HRI scenarios. Visual servoing and image target tracking are different tasks, and this paper focuses on an effect that is rarely addressed in the literature: the dependence of the beamforming directivity on look direction. The datasets required to carry out the study did not exist and had to be generated. A state-of-the-art mobile robotic testbed was set up with target speech and noise sources. A linear microphone array was chosen as a case study and its response was measured. Standard beamforming methods were evaluated with respect to visual servoing: delay-and-sum combined with image tracking; weighted delay-and-sum; and minimum variance distortionless response (MVDR), also combined with image tracking. The results show that the performance of beamforming methods is dramatically degraded in moving and non-stationary conditions. In this context, visual servoing in HRI can significantly improve the performance of a linear microphone array in terms of ASR accuracy. The average reduction in word error rate (WER) achieved when the robot head was steered toward the target speech source was as high as 28.2%. Finally, it is worth highlighting that the methodology adopted here is applicable to any microphone array, linear or not.
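The dependence of beamforming directivity on look direction can be illustrated with a textbook far-field model of a uniform linear array. The sketch below is not the paper's method or its measured array response. The number of microphones, the spacing, the frequency and the function names are hypothetical values chosen only for illustration. It computes the normalized delay-and-sum power response and estimates the half-power width of the main lobe for several steering angles.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def beam_power(theta_deg, steer_deg, n_mics=4, spacing_m=0.05, freq_hz=2000.0):
    """Normalized delay-and-sum power response of a uniform linear array.

    Angles are measured from broadside (0 deg is perpendicular to the array).
    The array geometry and frequency defaults are illustrative assumptions.
    """
    delta_u = math.sin(math.radians(theta_deg)) - math.sin(math.radians(steer_deg))
    # Phase difference between adjacent microphones after steering delays.
    phase_step = 2.0 * math.pi * freq_hz * spacing_m * delta_u / SPEED_OF_SOUND
    total = sum(cmath.exp(1j * m * phase_step) for m in range(n_mics))
    return abs(total / n_mics) ** 2


def half_power_beamwidth_deg(steer_deg, step_deg=0.1, **array_kwargs):
    """Main-lobe width (power >= 0.5) around the steering angle, clipped to [-90, 90]."""
    def edge(direction):
        # Walk away from the steering angle until the power drops below one half.
        angle = steer_deg
        while -90.0 < angle < 90.0 and beam_power(angle, steer_deg, **array_kwargs) >= 0.5:
            angle += direction * step_deg
        return max(-90.0, min(90.0, angle))
    return edge(+1) - edge(-1)


if __name__ == "__main__":
    for steer in (0, 30, 60):
        print(steer, round(half_power_beamwidth_deg(steer), 1))
```

Under these assumptions the main lobe is narrowest at broadside and widens as the beam is steered toward the array axis. This is one reason why physically turning the array toward the talker may help more than electronic steering alone.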
