Audio visual speech recognition based on multi-stream DBN models with Articulatory Features

Abstract

We present a multi-stream Dynamic Bayesian Network model with Articulatory Features (AF_AV_DBN) for audio-visual speech recognition. The conditional probability distributions of the nodes are defined to account for the asynchronies between the articulatory features (AFs). Speech recognition experiments are carried out on an audio-visual connected-digit database. The results show that, when the asynchrony constraint between the AFs is appropriately set, the AF_AV_DBN model achieves the highest recognition rates: its average recognition rate is 89.38%, compared with 87.02% for the state-synchronous DBN model (SS_DBN) and 88.32% for the state-asynchronous DBN model (SA_DBN). Moreover, the audio-visual multi-stream AF_AV_DBN model is considerably more robust than the audio-only AF_A_DBN model; for example, under −10 dB noise, the recognition rate improves from 20.75% to 76.24%.
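The abstract does not say how the asynchrony constraint between the AFs is defined. A common formulation in multi-stream models bounds how far apart the state indices of the streams may drift, and the sketch below illustrates only that idea. The function name `allowed_joint_states` and the parameters `n_states`, `n_streams`, and `max_async` are hypothetical and are not taken from the paper. The second part recomputes the absolute gains from the recognition rates quoted in the abstract.

```python
from itertools import product


def allowed_joint_states(n_states, n_streams, max_async):
    """Enumerate joint state-index tuples whose spread is at most max_async.

    Assumption (not from the paper): asynchrony is measured as the
    difference between the largest and smallest stream state index.
    """
    return [
        combo
        for combo in product(range(n_states), repeat=n_streams)
        if max(combo) - min(combo) <= max_async
    ]


# Average recognition rates (%) as quoted in the abstract.
reported = {"SS_DBN": 87.02, "SA_DBN": 88.32, "AF_AV_DBN": 89.38}

# Absolute gain of AF_AV_DBN over each baseline, in percentage points.
gains = {
    name: round(reported["AF_AV_DBN"] - rate, 2)
    for name, rate in reported.items()
    if name != "AF_AV_DBN"
}

if __name__ == "__main__":
    # max_async = 0 forces all streams to share the same state index.
    print(allowed_joint_states(3, 2, 0))
    # Larger limits admit more joint states.
    print(len(allowed_joint_states(3, 2, 1)))
    print(gains)
```

With `max_async = 0` the streams are forced to stay in step, and raising the limit enlarges the joint state space. This is the trade-off behind setting the constraint "appropriately".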

Bibliographic details

Source: International Symposium on Chinese Spoken Language Processing, 2010, 4 pages
Authors: {missing}
Original format: PDF
CLC classification: TP18-53
Keywords: DBN; articulatory feature; audio-visual; speech recognition

Similar literature

1. Multistream Articulatory Feature-Based Models for Visual Speech Recognition [J]. Saenko Kate, Livescu Karen, Glass James. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2009, no. 9
2. Speech driven photo realistic facial animation based on an articulatory DBN model and AAM features [J]. Dongmei Jiang, Yong Zhao, Hichem Sahli. Multimedia Tools and Applications. 2014, no. 1
3. Articulatory feature based continuous speech recognition using probabilistic lexical modeling [J]. Ramya Rasipuram, Mathew Magimai.-Doss. Computer Speech and Language. 2016, no. Mara
4. Audio visual speech recognition based on multi-stream DBN models with Articulatory Features [C]. 2010 7th International Symposium on Chinese Spoken Language Processing. 2010
5. Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition [D]. Zhang, Xianxian. 2005
6. Automatic speech recognition using articulatory features from subject-independent acoustic-to-articulatory inversion [O]. Prasanta Kumar Ghosh, Shrikanth Narayanan
7. DBN based multi-stream models for audio-visual speech recognition [O]. John N. Gowdy, Amarnag Subramanya, Chris Bartels. 2004
