Mask estimation and imputation methods for missing data speech recognition in a multisource reverberant environment

Sami Keronen; Heikki Kallasjoki; Ulpu Remes; Guy J. Brown; Jort F. Gemmeke; Kalle J. Palomaeki

Abstract

We present an automatic speech recognition system that uses a missing data approach to compensate for challenging environmental noise containing both additive and convolutive components. The unreliable and noise-corrupted ("missing") components are identified using a Gaussian mixture model (GMM) classifier based on a diverse range of acoustic features. To perform speech recognition using the partially observed data, the missing components are substituted with clean speech estimates computed using both sparse imputation and cluster-based GMM imputation. Compared to two reference mask estimation techniques based on inter-aural level and time difference-pairs, the proposed missing data approach significantly improved the keyword accuracy rates in all signal-to-noise ratio conditions when evaluated on the CHiME reverberant multisource environment corpus. Of the imputation methods, cluster-based imputation was found to outperform sparse imputation. The highest keyword accuracy was achieved when the system was trained on imputed data, which made it more robust to possible imputation errors.
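The imputation step described above can be illustrated with a minimal sketch. It assumes a mask has already been estimated (True marks a reliable component) and that a full-covariance GMM of clean speech features is available; it then replaces each unreliable component with its conditional mean given the reliable ones. The function name `impute_with_gmm`, its parameters, and the clipping of estimates to the observed value (a common choice for log-spectral features, where the noisy observation bounds the clean speech energy) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def impute_with_gmm(x, mask, weights, means, covs):
    """Hypothetical helper: fill unreliable entries of x using a clean-speech GMM.

    x       : (D,) observed feature vector (e.g. log-spectral energies)
    mask    : (D,) boolean array, True = reliable, False = missing
    weights : (K,) mixture weights
    means   : (K, D) component means
    covs    : (K, D, D) full component covariances
    """
    r, m = mask, ~mask
    if not m.any():          # nothing to impute
        return x.copy()
    if not r.any():          # no reliable evidence: fall back to the prior mean
        out = np.minimum(weights @ means, x)
        return out

    K = len(weights)
    log_post = np.empty(K)
    cond_means = np.empty((K, int(m.sum())))
    for k in range(K):
        S_rr = covs[k][np.ix_(r, r)]   # reliable-reliable covariance block
        S_mr = covs[k][np.ix_(m, r)]   # missing-reliable covariance block
        diff = x[r] - means[k][r]
        sol = np.linalg.solve(S_rr, diff)
        _, logdet = np.linalg.slogdet(S_rr)
        # Log posterior of component k given the reliable part
        # (terms shared by all components are dropped).
        log_post[k] = np.log(weights[k]) - 0.5 * (logdet + diff @ sol)
        # Conditional mean of the missing part under component k.
        cond_means[k] = means[k][m] + S_mr @ sol

    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    out = x.copy()
    # Assumption: the observed value is an upper bound on the clean value.
    out[m] = np.minimum(post @ cond_means, x[m])
    return out
```

With a single Gaussian of zero mean, unit variances and covariance 0.5, an observation `[2.0, 5.0]` whose second component is marked missing is imputed from the first component alone, since the conditional mean is 0.5 × 2.0.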


Bibliographic details

Source
Computer Speech and Language | 2013, Issue 3 | pp. 798-819 | 22 pages
Authors
Sami Keronen; Heikki Kallasjoki; Ulpu Remes; Guy J. Brown; Jort F. Gemmeke; Kalle J. Palomaeki
Author affiliations

Aalto University School of Science, Department of Information and Computer Science, PO Box 15400, FI-00076 Aalto, Finland;

Aalto University School of Science, Department of Information and Computer Science, PO Box 15400, FI-00076 Aalto, Finland;

Aalto University School of Science, Department of Information and Computer Science, PO Box 15400, FI-00076 Aalto, Finland;

University of Sheffield, Department of Computer Science, Regent Court, 211 Portobello St., Sheffield S1 4DP, UK;

KU Leuven, Department ESAT-PSI, Kasteelpark Arenberg 10, 3001 Heverlee, Belgium;

Aalto University School of Science, Department of Information and Computer Science, PO Box 15400, FI-00076 Aalto, Finland;

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
noise robust; speech recognition; missing data; binaural; multicondition; imputation
