
Multiple cameras for audio-visual speech recognition in an automotive environment



Abstract

Audio-visual speech recognition, the combination of visual lip-reading with traditional acoustic speech recognition, has previously been shown to provide a considerable improvement over acoustic-only approaches in noisy environments such as an automotive cabin. The research presented in this paper extends the established audio-visual speech recognition literature to show that further improvements in speech recognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visual speech recognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) is conducted on the four-camera AVICAR automotive audio-visual speech database. We study the relative contributions of the side and centrally oriented cameras to improving visual speech recognition accuracy. Finally, combining the four visual streams with a single audio stream in a five-stream SHMM demonstrates a relative improvement of over 56% in word recognition accuracy compared to the acoustic-only approach in the noisiest conditions of the AVICAR database.
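The abstract does not state how the SHMM combines its streams. A common formulation for state-synchronous multi-stream HMMs, assumed here rather than taken from the paper, scores each state at each frame as a weighted sum of the per-stream log-likelihoods. The sketch below illustrates that idea for one audio stream and four visual streams; the function name, the weights, and the log-likelihood values are all hypothetical.

```python
from typing import Sequence


def fused_log_emission(stream_log_likes: Sequence[float],
                       weights: Sequence[float]) -> float:
    """Weighted sum of per-stream log-likelihoods for one state and one frame.

    Assumption: streams are state-synchronous, so every stream is scored
    against the same HMM state at the same frame.
    """
    if len(stream_log_likes) != len(weights):
        raise ValueError("one weight per stream is required")
    if any(w < 0 for w in weights):
        raise ValueError("stream weights must be non-negative")
    return sum(w * ll for w, ll in zip(weights, stream_log_likes))


# Hypothetical values for a single frame: one audio stream, four camera streams.
audio_log_like = -4.0
visual_log_likes = [-2.0, -2.5, -3.0, -3.5]

# Hypothetical weights; a noisy cabin would typically call for a lower audio weight.
audio_weight = 0.6
visual_weights = [0.1, 0.1, 0.1, 0.1]

score = fused_log_emission(
    [audio_log_like, *visual_log_likes],
    [audio_weight, *visual_weights],
)
```

In this kind of scheme the weights control how much each camera or the microphone influences decoding, which is one way the relative contribution of side and central cameras could be examined.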

Bibliographic details

  • Source
    Computer Speech and Language | 2013, Issue 4 | pp. 911-927 | 17 pages
  • Author affiliations

    Speech, Audio, Image and Video Technology Lab, Queensland University of Technology, Australia;

    Speech, Audio, Image and Video Technology Lab, Queensland University of Technology, Australia;

    Speech, Audio, Image and Video Technology Lab, Queensland University of Technology, Australia;

    Speech, Audio, Image and Video Technology Lab, Queensland University of Technology, Australia; Disney Research Pittsburgh, USA;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    AVASR; AVICAR database; speech recognition; multi-stream HMM; automotive environment;

