Efficient online target speech extraction using DOA-constrained independent component analysis of stereo data for robust speech recognition

Minook Kim; Hyung-Min Park


Abstract

This paper describes an efficient online target-speech-extraction method used as a preprocessing step for robust automatic speech recognition (ASR). Because a target speaker is located relatively close to microphones in many ASR applications, acoustic paths to microphones are moderately reverberant, and the target speaker direction can easily be estimated. In this situation, noise estimation is effectively performed by forming a directional null to the target speaker. Required weights for extracting target speech, independent of the estimated noise, are then determined using an adaptation rule derived from a modified version of the cost function for independent component analysis (ICA), while retaining the minimal distortion principle. In particular, an online natural-gradient learning rule with a nonholonomic constraint and normalization by a smoothed power estimate of the input signal is derived for stable convergence, even for dynamically changing speech levels, with much less computational complexity than conventional ICA. Furthermore, stereo mixtures are considered as input data for further reduction of computational loads and fast convergence. Although the method may suffer from the underdetermined problem, the weights are adapted to obtain signal-to-noise-ratio-maximization beamformers for successful target speech estimation. The experimental results obtained for various conditions demonstrate the effectiveness of the proposed method.
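The directional-null idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a far-field plane-wave model, a single frequency bin, and two microphones a known distance apart, and all function and parameter names (`steering_phase`, `null_output`, `smooth_power`, `beta`) are hypothetical. The last function shows one common form of recursive power smoothing of the kind that could be used to normalize an online update; the paper's exact learning rule is not reproduced here.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_phase(freq_hz, mic_spacing_m, doa_rad):
    """Unit-magnitude phase factor of microphone 2 relative to microphone 1
    for a far-field source at angle doa_rad (0 = broadside). Assumed model."""
    delay = mic_spacing_m * math.sin(doa_rad) / SPEED_OF_SOUND
    return cmath.exp(-2j * math.pi * freq_hz * delay)


def null_output(x1, x2, freq_hz, mic_spacing_m, doa_rad):
    """Noise estimate for one frequency bin: align microphone 2 to
    microphone 1 for the target direction and subtract, which cancels
    any component arriving from that direction."""
    a = steering_phase(freq_hz, mic_spacing_m, doa_rad)
    return x1 - a.conjugate() * x2


def smooth_power(prev_power, sample, beta=0.9):
    """First-order recursive estimate of signal power; beta close to 1
    gives heavier smoothing. The value 0.9 is an arbitrary default."""
    return beta * prev_power + (1.0 - beta) * abs(sample) ** 2
```

Under these assumptions, a component that arrives exactly from the assumed target direction is cancelled in `null_output`, while components from other directions pass through with a direction-dependent gain, so the output can serve as a noise reference. Dividing an adaptation step by the value returned from `smooth_power` is one way to keep the step size insensitive to changes in input level.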


Bibliographic details

Source
Signal Processing | 2015, No. 12 | pp. 126-137 | 12 pages
Authors
Minook Kim; Hyung-Min Park
Affiliation
Department of Electronic Engineering, Sogang University, 35 Baekbeom-ro, Mapo-gu, Seoul 121-742, Republic of Korea (listed for both authors)
Original format: PDF
Language: English
Keywords
Target speech extraction; Robust speech recognition; Independent component analysis; Direction of arrival; Online adaptation

Similar literature

1. Independent component analysis applied to feature extraction for robust automatic speech recognition [J]. Potamitis L., Fakotakis N. Electronics Letters. 2000, No. 23
2. An Energy-Efficient Speech-Extraction Processor for Robust User Speech Recognition in Mobile Head-Mounted Display Systems [J]. Jinmook Lee, Seongwook Park, Injoon Hong. Circuits and Systems II: Express Briefs, IEEE Transactions on. 2017, No. 4
3. Complex-Valued Independent Component Analysis for Online Blind Speech Extraction [J]. Sallberg B., Grbic N., Claesson I. IEEE Transactions on Audio, Speech and Language Processing. 2008, No. 8
4. Data-driven temporal processing using independent component analysis for robust speech recognition [C]. Junhui Zhao, Jingming Kuang, Xiang Xie. 2004
5. Independent component analysis of event-related electroencephalography during speech and non-speech discrimination: implications for the sensorimotor mu rhythm in speech processing [D]. Bowers, Andrew Lee. 2012
6. Suppression of the µ Rhythm during Speech and Non-Speech Discrimination Revealed by Independent Component Analysis: Implications for Sensorimotor Integration in Speech Processing [O]. Andrew Bowers, Tim Saltuklaroglu, Ashley Harkrider
7. Complex-Valued Independent Component Analysis for Online Blind Speech Extraction [O]. Sällberg, Benny; Grbic, Nedelko; Claesson, Ingvar. 2008
