Robust several-speaker speech recognition with highly dependable online speaker adaptation and identification

Po-Yi Shih; Po-Chuan Lin; Jhing-Fa Wang; Yuan-Ning Lin


Abstract

Current adaptation mechanisms adapt a single acoustic model to one speaker in a speaker-independent speech recognition system. However, as more users share the same speech recognizer, single-model adaptation leads to negative adaptation whenever the system switches between users. This paper calls that problem undependable adaptation. Considering a smart home or an office with several staff members, the paper presents speaker-specific acoustic model adaptation based on a multi-model mechanism to solve it. First, the identity of the current speaker is confirmed using an SVM classifier. The corresponding acoustic parameters are then extracted and integrated with the speaker-independent acoustic model to yield a speaker-dependent acoustic model, which improves speech recognition accuracy for the current speaker. To provide dependable adaptation data and achieve positive online speaker adaptation, a mechanism that measures a confidence score is designed to verify each recognition result and determine whether it can serve as an adaptation datum. The experimental results indicate that the proposed system increases the average speech recognition accuracy from 62% to 85%. Thus, the proposed system achieves robust several-speaker speech recognition with highly dependable online speaker adaptation and identification.
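
The control flow described in the abstract (identify the speaker, select that speaker's parameters, and accept a recognition result as adaptation data only if its confidence score is high enough) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `MultiModelAdapter`, the `threshold` and `rate` values, the additive-offset update, and the toy `identify` rule standing in for the SVM classifier are all assumptions made for illustration, and the acoustic model is reduced to a single mean vector.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class MultiModelAdapter:
    """Keeps one adaptation offset per speaker on top of a shared SI model."""

    si_mean: List[float]                         # toy stand-in for the speaker-independent model
    identify: Callable[[Sequence[float]], str]   # stand-in for the SVM speaker classifier
    threshold: float = 0.8                       # hypothetical confidence threshold
    rate: float = 0.5                            # hypothetical adaptation step size
    offsets: Dict[str, List[float]] = field(default_factory=dict)

    def model_for(self, speaker: str) -> List[float]:
        """Speaker-dependent model = SI model + that speaker's stored offset."""
        offset = self.offsets.get(speaker, [0.0] * len(self.si_mean))
        return [m + o for m, o in zip(self.si_mean, offset)]

    def observe(self, features: Sequence[float], confidence: float) -> Tuple[str, bool]:
        """Identify the speaker, then adapt only that speaker's offset, and only
        when the recognition result is confident enough to be trusted."""
        speaker = self.identify(features)
        if confidence < self.threshold:
            return speaker, False              # rejected as adaptation data
        current = self.model_for(speaker)
        offset = self.offsets.setdefault(speaker, [0.0] * len(self.si_mean))
        for i, (x, c) in enumerate(zip(features, current)):
            offset[i] += self.rate * (x - c)   # move this speaker's model toward the data
        return speaker, True


if __name__ == "__main__":
    adapter = MultiModelAdapter(
        si_mean=[0.0, 0.0],
        identify=lambda f: "alice" if f[0] > 0 else "bob",  # toy rule, not an SVM
    )
    print(adapter.observe([2.0, 2.0], confidence=0.9))   # confident: alice's offset is updated
    print(adapter.observe([-4.0, 0.0], confidence=0.3))  # low confidence: bob's offset is untouched
    print(adapter.model_for("alice"), adapter.model_for("bob"))
```

Because each speaker has a separate offset, data from one user never changes another user's model, which is the negative-adaptation failure the single-model approach suffers from. The confidence gate addresses the second concern, keeping misrecognized utterances out of the adaptation data.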


Bibliographic details

Source
Journal of Network and Computer Applications, 2011, No. 5, pp. 1459-1467 (9 pages)
Authors
Po-Yi Shih; Po-Chuan Lin; Jhing-Fa Wang; Yuan-Ning Lin
Author affiliations

Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan;

Department of Electronics Engineering and Computer Science, Tung Fang Institute of Technology, Kaohsiung, Taiwan;

Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan;

Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Chinese Library Classification
Keywords
speech recognition; speaker adaptation; speaker identification; dependable adaptation; confidence score;


Similar literature

1. Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition [J]. Zhen Huang, Sabato Marco Siniscalchi, Chin-Hui Lee. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2017, No. 1

2. Unsupervised Speaker Adaptation for Robust Speech Recognition in Real Environments [J]. Shingo Yamade, Akira Baba, Shinichi Yoshikawa, Electronics and Communications in Japan. Part 2, Electronics. 2005, No. 8

3. Noise robust speech recognition applied to unsupervised speaker adaptation [J]. Shingo Yamade, Akinobu Lee, Hiroshi Saruwatari, 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2002, No. 527

4. Utterance-Wise Recurrent Dropout and Iterative Speaker Adaptation for Robust Monaural Speech Recognition [C]. Peidong Wang, DeLiang Wang. IEEE International Conference on Acoustics, Speech and Signal Processing. 2018

5. Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition [D]. Zhang, Xianxian. 2005

6. Recognizing the message and the messenger: biomimetic spectral analysis for robust speech and speaker recognition [O]. Sridhar Krishna Nemala, Kailash Patil, Mounya Elhilali

7. Speech accent identification and speech recognition enhancement by speaker accent adaptation [O]. Mohammad Tanabian

8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015
