Journal: Signal Processing

Efficient online target speech extraction using DOA-constrained independent component analysis of stereo data for robust speech recognition



Abstract

This paper describes an efficient online target-speech-extraction method used as a preprocessing step for robust automatic speech recognition (ASR). Because a target speaker is located relatively close to microphones in many ASR applications, acoustic paths to microphones are moderately reverberant, and the target speaker direction can easily be estimated. In this situation, noise estimation is effectively performed by forming a directional null to the target speaker. Required weights for extracting target speech, independent of the estimated noise, are then determined using an adaptation rule derived from a modified version of the cost function for independent component analysis (ICA), while retaining the minimal distortion principle. In particular, an online natural-gradient learning rule with a nonholonomic constraint and normalization by a smoothed power estimate of the input signal is derived for stable convergence, even for dynamically changing speech levels, with much less computational complexity than conventional ICA. Furthermore, stereo mixtures are considered as input data for further reduction of computational loads and fast convergence. Although the method may suffer from the underdetermined problem, the weights are adapted to obtain signal-to-noise-ratio-maximization beamformers for successful target speech estimation. The experimental results obtained for various conditions demonstrate the effectiveness of the proposed method.
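The noise-estimation step mentioned in the abstract (forming a directional null toward the target speaker) can be illustrated for a single frequency bin of a two-microphone recording. The sketch below is not the paper's algorithm: it assumes a far-field source, a known microphone spacing, and a speed of sound of 343 m/s, and the function names (`steering_phase`, `null_output`, `smooth_power`) and the forgetting factor `alpha` are hypothetical. The recursive average in `smooth_power` is only one common way to obtain a smoothed power estimate; the abstract does not specify the form used. The ICA-derived weight update itself is not reproduced here because the abstract does not state it.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def steering_phase(freq_hz, mic_spacing_m, doa_rad):
    """Phase of microphone 2 relative to microphone 1 for a far-field
    source arriving from angle doa_rad (measured from broadside)."""
    delay = mic_spacing_m * math.sin(doa_rad) / SPEED_OF_SOUND
    return cmath.exp(-2j * math.pi * freq_hz * delay)


def null_output(x1, x2, freq_hz, mic_spacing_m, target_doa_rad):
    """Noise reference for one frequency bin: align microphone 1 with
    microphone 2 for the target direction and subtract, so that any
    component arriving from target_doa_rad cancels."""
    return x2 - steering_phase(freq_hz, mic_spacing_m, target_doa_rad) * x1


def smooth_power(prev_power, x1, x2, alpha=0.9):
    """Recursive (exponentially weighted) estimate of the input power.
    alpha is a hypothetical forgetting factor."""
    return alpha * prev_power + (1.0 - alpha) * (abs(x1) ** 2 + abs(x2) ** 2)


if __name__ == "__main__":
    freq, spacing, target_doa = 1000.0, 0.05, 0.3
    s = 1.0 + 0.5j  # arbitrary source spectrum value in this bin

    # A source at the target direction is cancelled by the null.
    target_leak = null_output(
        s, s * steering_phase(freq, spacing, target_doa),
        freq, spacing, target_doa)

    # A source from another direction passes through to the noise reference.
    interferer = null_output(
        s, s * steering_phase(freq, spacing, -0.8),
        freq, spacing, target_doa)

    print(abs(target_leak), abs(interferer))
```

In this sketch the null output serves only as the noise reference; extracting the target would additionally require adapting the weights so that the output is independent of that reference, as the abstract describes.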
