BIMODAL SPEECH RECOGNITION

Abstract

This paper describes a bimodal speech recognition system based on features obtained from the speech signal and from the image of the speaker. The main advantage of the proposed system is the robustness of its recognition rates: they do not change when the speech signal is degraded with artificial noise. To implement the bimodal system, we combined the features obtained from the two sources (speech and image) and then constructed bimodal models that became the actual patterns to be recognized. For the classification stage we used a statistical approach, Support Vector Machines (SVMs), extended to multiclass decisions. For speech analysis we applied perceptual linear prediction (PLP), and to extract geometric features from the speaker image we used a face tracking algorithm based on GMM. The results we obtained confirm the improvement that can be achieved, especially in noisy environments.
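
The pipeline summarized in the abstract (combine audio and visual features, then classify with a multiclass SVM) might be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the abstract does not say how the features are combined, which kernel is used, or how the SVM is extended to several classes, so simple concatenation, a linear kernel without bias, and a one-vs-rest scheme are all assumptions here. The 3-D vectors are synthetic stand-ins for PLP and face-geometry features, and every function name is hypothetical.

```python
def fuse(audio, visual):
    """Early fusion (assumed): concatenate an audio and a visual vector."""
    return list(audio) + list(visual)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_binary_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM without bias, fitted by sub-gradient descent on the
    L2-regularised hinge loss. Labels in y are +1 or -1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * dot(w, x)
            # Gradient step for the regulariser: shrink the weights.
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Gradient step for the hinge term, only when it is active.
            if margin < 1.0:
                w = [wj + lr * label * xj for wj, xj in zip(w, x)]
    return w


def train_one_vs_rest(X, labels):
    """One binary SVM per class: that class (+1) against the rest (-1)."""
    return {c: train_binary_svm(X, [1 if l == c else -1 for l in labels])
            for c in sorted(set(labels))}


def predict(models, x):
    """Choose the class whose binary SVM gives the largest score."""
    return max(models, key=lambda c: dot(models[c], x))


# Synthetic training data: one prototype per word and per modality.
audio_train = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
visual_train = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
labels = ["one", "two", "three"]

fused_train = [fuse(a, v) for a, v in zip(audio_train, visual_train)]
bimodal = train_one_vs_rest(fused_train, labels)
audio_only = train_one_vs_rest(audio_train, labels)

# A noisy utterance of "one": the audio leans towards "two",
# while the visual features are unaffected by the acoustic noise.
noisy_audio, clean_visual = [0.4, 0.6, 0.0], [1, 0, 0]
print(predict(audio_only, noisy_audio))                   # -> two
print(predict(bimodal, fuse(noisy_audio, clean_visual)))  # -> one
```

In this made-up example the audio-only model is misled by the degraded audio, whereas the fused model still recovers the intended word because the visual half of the vector is untouched. That mirrors the kind of robustness the abstract claims, but it says nothing about the recognition rates reported in the paper.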

Bibliographic record

Source
International Conference on Communications, vol. 1; 3-5 June 2004; Bucharest (RO) | 2004 | pp. 225-228 | 4 pages
Conference location: Bucharest (RO)
Authors
GABRIEL COSTACHE; CLAUDIA IANCU; INGE GAVAT
Author affiliation

Politehnica University of Bucharest, Faculty of Electronics and Telecommunications, Bd Iuliu Maniu 1-3, 77202 Bucharest, Romania

Conference organizer
Original format: PDF
Language of text: English
Chinese Library Classification: Communications
Keywords

Similar literature

1. Effect of (Mis)Matched Compression Speed on Speech Recognition in Bimodal Listeners [J]. Dimitar Spirrov, Eugen Kludt, Eline Verschueren. Trends in Hearing. 2020, No. 1
2. The Effect of Hearing Aid Bandwidth and Configuration of Hearing Loss on Bimodal Speech Recognition in Cochlear Implant Users [J]. Neuman Arlene C., Zeman Annette, Neukam Jonathan. Ear and Hearing. 2019, No. 3
3. Block Energy Based Visual Features Using Histogram Of Oriented Gradient For Bimodal Hindi Speech Recognition [J]. Prashant Upadhyaya, Omar Farooq, M.R. Abidi. Procedia Computer Science. 2018, No. 1
4. Audio-visual automatic speech recognition and related bimodal speech technologies: A review of the state-of-the-art and open problems [C]. Potamianos Gerasimos. Automatic Speech Recognition & Understanding, 2009. ASRU 2009. 2009
5. Recognition and localization of speech by adult cochlear implant recipients wearing a digital hearing aid in the non-implanted ear (bimodal hearing) [D]. Potts, Lisa G. 2006
6. Effect of (Mis)Matched Compression Speed on Speech Recognition in Bimodal Listeners [O]. Dimitar Spirrov, Eugen Kludt, Eline Verschueren. 2020
7. Bimodal Emotion Recognition from Speech and Text [O]. Weilin Ye, Xinghua Fan. 2014
8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015
