Improving Noise Robust Automatic Speech Recognition with Single-Channel Time-Domain Enhancement Network

Abstract

With the advent of deep learning, research on noise-robust automatic speech recognition (ASR) has progressed rapidly. However, the ASR performance of single-channel systems in noisy conditions remains unsatisfactory. Indeed, most single-channel speech enhancement (SE) methods (denoising) have brought only limited performance gains over a state-of-the-art ASR back-end trained on multi-condition training data. Recently, there has been much research on neural network-based SE methods that operate in the time domain and reach levels of performance never attained before. However, it has not been established whether the high enhancement performance achieved by such time-domain approaches translates into ASR gains. In this paper, we show that a single-channel time-domain denoising approach can significantly improve ASR performance, providing more than 30% relative word error reduction over a strong ASR back-end on the real evaluation data of the single-channel track of the CHiME-4 dataset. These positive results demonstrate that single-channel noise reduction can still improve ASR performance, which should open the door to more research in that direction.
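
To make the "relative word error reduction" figure in the abstract concrete, here is a minimal Python sketch of how word error rate (WER) and a relative reduction between two systems are commonly computed. The function names and the numbers in the usage example are illustrative assumptions; they are not taken from the paper, and the authors' actual scoring setup is not described in this record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the processed prefix of ref and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, enhanced_wer: float) -> float:
    """Fraction of the baseline WER removed by the enhanced system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - enhanced_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical WER values in percent, chosen only for illustration.
    baseline, enhanced = 10.0, 7.0
    print(f"relative reduction: {relative_wer_reduction(baseline, enhanced):.1%}")
    print(f"WER: {word_error_rate('the cat sat on the mat', 'the cat sat on mat'):.3f}")
```

Under this definition, a relative reduction of more than 30% means the enhanced system's WER is below 70% of the baseline WER, regardless of the absolute values involved.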

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2020 | pp. 6824-7443 | 5 pages
Authors
Keisuke Kinoshita; Tsubasa Ochiai; Marc Delcroix; Tomohiro Nakatani
Original format: PDF
Chinese Library Classification: TN912-53
Keywords
Single-channel speech enhancement; time-domain network; robust ASR

Similar literature

1. Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition [J]. Shimada Kazuki, Bando Yoshiaki, Mimura Masato. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2019, No. 5
2. Multi-Channel Speech Enhancement and Amplitude Modulation Analysis for Noise Robust Automatic Speech Recognition [J]. Moritz Niko, Adiloǧlu Kamil, Anemüller Jörn. Computer Speech and Language. 2017, No. nova
3. Improving Noise Robust Automatic Speech Recognition with Single-Channel Time-Domain Enhancement Network [C]. Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix. 2020
4. Compressive nonlinearity for representing speech spectral magnitude to improve noise robustness of automatic speech recognition [D]. Wong, Brian. 2011
5. Speech enhancement based on neural networks improves speech intelligibility in noise for cochlear implant users [O]. Tobias Goehring, Federico Bolner, Jessica J.M. Monaghan
6. Improving Noise Robust Automatic Speech Recognition with Single-Channel Time-Domain Enhancement Network [O]. Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix. 2020
