Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition

Myungjong Kim; Younggwan Kim; Joohong Yoo; Jun Wang; Hoirin Kim


Abstract

This paper addresses the problem of recognizing speech uttered by patients with dysarthria, a motor speech disorder that impedes the physical production of speech. Patients with dysarthria have articulatory limitations and therefore often have trouble pronouncing certain sounds, which results in undesirable phonetic variation. Because of this phonetic variation, modern automatic speech recognition systems designed for regular speakers are ineffective for speakers with dysarthria. To capture the phonetic variation, a Kullback-Leibler divergence-based hidden Markov model (KL-HMM) is adopted, in which the emission probability of each state is parameterized by a categorical distribution using phoneme posterior probabilities obtained from a deep neural network-based acoustic model. To further reflect speaker-specific phonetic variation patterns, a speaker adaptation method is proposed that combines L2 regularization with confusion-reducing regularization; it enhances discriminability between the categorical distributions of the KL-HMM states while preserving speaker-specific information. Evaluated on a database of several hundred words from 30 speakers (12 mildly dysarthric, 8 moderately dysarthric, and 10 non-dysarthric control speakers), the proposed speaker adaptation method significantly outperformed the conventional deep neural network-based speaker-adapted system on dysarthric as well as non-dysarthric speech.
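As a rough illustration of the KL-HMM idea summarized in the abstract, the sketch below scores one frame by the KL divergence between a state's categorical distribution and a phoneme posterior vector. The abstract does not give the exact local score or the form of the regularizers, so this is an assumption: it uses one common KL direction, and the function name `kl_local_score`, the three-phoneme toy vectors, and the `eps` floor are all hypothetical.

```python
import math


def kl_local_score(state_dist, posterior, eps=1e-10):
    """KL(state_dist || posterior) for one frame (assumed form, not taken from the paper).

    state_dist: categorical distribution attached to an HMM state.
    posterior:  phoneme posterior vector for one frame (e.g. a DNN output).
    """
    # Floor the probabilities so the logarithm is always defined.
    y = [max(p, eps) for p in state_dist]
    z = [max(p, eps) for p in posterior]
    y_sum, z_sum = sum(y), sum(z)
    # Renormalize after flooring, then accumulate y_k * log(y_k / z_k).
    return sum(
        (yi / y_sum) * math.log((yi / y_sum) / (zi / z_sum))
        for yi, zi in zip(y, z)
    )


# Hypothetical toy vectors over three phoneme classes.
state = [0.7, 0.2, 0.1]
frame_match = [0.7, 0.2, 0.1]      # posterior agrees with the state
frame_confused = [0.1, 0.2, 0.7]   # posterior mass sits on another phoneme

score_match = kl_local_score(state, frame_match)
score_confused = kl_local_score(state, frame_confused)
```

A lower score means the frame fits the state better, so a speaker who consistently substitutes one phoneme for another can still match a state whose categorical distribution has learned that substitution pattern.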


Bibliographic information

Source
IEEE Transactions on Neural Systems and Rehabilitation Engineering | 2017, No. 9 | pp. 1581-1591 | 11 pages
Authors
Myungjong Kim; Younggwan Kim; Joohong Yoo; Jun Wang; Hoirin Kim
Affiliations

Department of Bioengineering, University of Texas at Dallas, Richardson, TX, USA;

School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea;

School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea;

Department of Bioengineering, University of Texas at Dallas, Richardson, TX, USA;

School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea;

Original format: PDF
Language: English
Keywords
Hidden Markov models; Speech; Adaptation models; Speech recognition; Acoustics; Computational modeling; Silicon;

Similar literature

1. Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System [J]. Dhanalakshmi M., Celin T. A. Mariya, Nagarajan T. Circuits, Systems, and Signal Processing. 2018, No. 2

2. Difficulties in Automatic Speech Recognition of Dysarthric Speakers and Implications for Speech-Based Applications Used by the Elderly: A Literature Review [J]. Victoria Young, Alex Mihailidis. Assistive Technology: The Official Journal of RESNA. 2010, No. 2

3. Speech Clarity Index (Ψ): A Distance-based Speech Quality Indicator And Recognition Rate Prediction For Dysarthric Speakers With Cerebral Palsy [J]. Prakasith Kayasith, Thanaruk Theeramunkong. IEICE Transactions on Information and Systems. 2009, No. 3

4. An Investigation of End-to-End Speech Recognition Using Model Adaptation for Dysarthric Speakers [C]. Yuya Sawa, Ryoichi Takashima, Tetsuya Takiguchi. IEEE Global Conference on Consumer Electronics. 2020

5. Alternative regularized neural network architectures for speech and speaker recognition [D]. Garimella, Sri Venkata Surya. 2012

6. Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition [O]. Myungjong Kim, Younggwan Kim, Joohong Yoo. 2017

7. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015
