Conference: International Symposium on Chinese Spoken Language Processing

Audio visual speech recognition based on multi-stream DBN models with Articulatory Features

Abstract

We present a multi-stream Dynamic Bayesian Network model with articulatory features (AF_AV_DBN) for audio-visual speech recognition. The conditional probability distributions of the nodes are defined to account for the asynchronies between the articulatory features (AFs). Speech recognition experiments are carried out on an audio-visual connected-digit database. Compared with the state-synchronous DBN model (SS_DBN) and the state-asynchronous DBN model (SA_DBN), the AF_AV_DBN model achieves the highest recognition rates when the asynchrony constraint between the AFs is set appropriately: the average recognition rate rises to 89.38%, from 87.02% for SS_DBN and 88.32% for SA_DBN. Moreover, the audio-visual multi-stream AF_AV_DBN model is considerably more robust than the audio-only AF_A_DBN model. For example, under noise at −10 dB, the recognition rate improves from 20.75% to 76.24%.
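To put the reported figures on a common scale, the short sketch below converts the recognition rates quoted in the abstract into absolute gains and relative error reductions. It assumes that the error rate is simply 100 minus the recognition rate, which the abstract does not state. The function name `relative_error_reduction` and the dictionary names are hypothetical and are not taken from the paper.

```python
# Recognition rates (%) exactly as quoted in the abstract.
avg_rates = {"SS_DBN": 87.02, "SA_DBN": 88.32, "AF_AV_DBN": 89.38}
noisy_rates = {"AF_A_DBN": 20.75, "AF_AV_DBN": 76.24}  # noise at -10 dB


def relative_error_reduction(baseline_rate, improved_rate):
    """Return the relative reduction in error rate, in percent.

    Assumption: error rate = 100 - recognition rate.
    """
    baseline_error = 100.0 - baseline_rate
    improved_error = 100.0 - improved_rate
    return 100.0 * (baseline_error - improved_error) / baseline_error


def summarize():
    """Build (baseline, absolute gain, relative error reduction) rows."""
    rows = []
    # Average rates: AF_AV_DBN against the two baseline DBN models.
    for name in ("SS_DBN", "SA_DBN"):
        gain = avg_rates["AF_AV_DBN"] - avg_rates[name]
        rel = relative_error_reduction(avg_rates[name], avg_rates["AF_AV_DBN"])
        rows.append((name, gain, rel))
    # Noisy condition: audio-visual model against the audio-only model.
    gain = noisy_rates["AF_AV_DBN"] - noisy_rates["AF_A_DBN"]
    rel = relative_error_reduction(noisy_rates["AF_A_DBN"], noisy_rates["AF_AV_DBN"])
    rows.append(("AF_A_DBN at -10 dB", gain, rel))
    return rows


if __name__ == "__main__":
    for baseline, gain, rel in summarize():
        print(f"vs {baseline}: +{gain:.2f} points, "
              f"{rel:.2f}% relative error reduction")
```

Under that assumption, the average-rate gains of 2.36 and 1.06 points correspond to relative error reductions of roughly 18% and 9%, while the audio-visual model removes roughly 70% of the audio-only model's errors at −10 dB.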
